[vorbis-dev] vorbis-tools reorganization and UTF-8 stuff

Edmund GRIMLEY EVANS edmundo at rano.org
Tue Sep 25 03:48:06 PDT 2001



Stan Seibert:

> I have already fixed vorbiscomment to encode comments in UTF-8 (like oggenc
> already did), but I have not fixed ogg123 or oggenc to display UTF-8 comments
> in the native charset yet. This is because utf8.c does not have any decode
> functions, since they weren't needed. Since I would rather debug the
> kcarnold_branch ogg123 than figure out UTF-8, I'm throwing out this list of
> requests to the more motivated and/or knowledgeable people out there:
 
Bruno Haible wrote a portable libiconv library that probably runs on
anything vorbis-tools runs on. You could just make people install
libiconv if their system doesn't provide an adequate version of iconv.

I know it's tempting to think "let's just include some simple charset
conversion functions so we have fewer dependencies", but you also have
to think about the duplication of effort, bloat and maintenance burden
that occurs when each program on a system includes its own set of
simple charset conversion functions.

So, I think you should seriously consider throwing out all the charset
conversion stuff from vorbis-tools and simply requiring iconv.

By the way, it's a good idea to feed data through iconv even when the
charset is the same: a decent version of iconv will validate UTF-8
data if told to convert from UTF-8 to UTF-8.
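
As a rough sketch of that validation trick, assuming a POSIX iconv that
really does decode and re-encode when both charsets are UTF-8 (not
every implementation is guaranteed to), something like the following
would do. The name utf8_check is made up for illustration and is not
part of vorbis-tools.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in vorbis-tools): returns 1 if s is valid
 * UTF-8, 0 if it is not, and -1 if iconv cannot open a UTF-8 to UTF-8
 * conversion. Assumes the local iconv validates its input. */
int utf8_check(const char *s)
{
    iconv_t cd = iconv_open("UTF-8", "UTF-8");
    char buf[256];                 /* scratch output, contents discarded */
    char *in = (char *)s;          /* iconv() takes a non-const pointer */
    size_t inleft = strlen(s);
    int ok = 1;

    if (cd == (iconv_t)-1)
        return -1;

    while (inleft > 0) {
        char *out = buf;
        size_t outleft = sizeof buf;

        if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1
            && errno != E2BIG) {
            ok = 0;                /* EILSEQ: bad byte, EINVAL: truncated */
            break;
        }
        /* On E2BIG the scratch buffer was full: loop and reuse it. */
    }
    iconv_close(cd);
    return ok;
}
```

A caller would run this on a comment before storing or displaying it
and reject the comment if the result is not 1.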

Also, do you want transliteration? You can get iconv to do approximate
conversion, e.g. convert 'ö' to '"o'. This is good for displaying
data, but there is a danger of the user not realising that the data
has been transliterated and converting it back to UTF-8, for example
when editing a comment in a non-UTF-8 locale.
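
To make the danger concrete, here is a small sketch that converts a
string to the display charset with transliteration, converts it back,
and reports whether the text survived. It assumes the "//TRANSLIT"
suffix is accepted by iconv_open, which is an extension found in GNU
libiconv and glibc rather than something POSIX promises. The function
names and the fixed 256-byte buffers are illustrative only.

```c
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: convert NUL-terminated src into dst, which has
 * room for dstsize bytes. Returns 0 on success, -1 on any failure. */
static int conv(const char *tocode, const char *fromcode,
                const char *src, char *dst, size_t dstsize)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    char *in = (char *)src, *out = dst;
    size_t inleft = strlen(src), outleft = dstsize - 1;
    size_t n;

    if (cd == (iconv_t)-1)
        return -1;
    n = iconv(cd, &in, &inleft, &out, &outleft);
    iconv_close(cd);
    if (n == (size_t)-1)
        return -1;
    *out = '\0';
    return 0;
}

/* Hypothetical check: returns 1 if the text survives the trip
 * UTF-8 -> charset (transliterated) -> UTF-8 unchanged, 0 if it was
 * altered on the way, and -1 if a conversion failed.
 * Assumes "//TRANSLIT" is supported, which is a GNU extension. */
int survives_round_trip(const char *utf8, const char *charset)
{
    char target[64], shown[256], back[256];

    if (snprintf(target, sizeof target, "%s//TRANSLIT", charset)
            >= (int)sizeof target)
        return -1;
    if (conv(target, "UTF-8", utf8, shown, sizeof shown) != 0)
        return -1;
    if (conv("UTF-8", charset, shown, back, sizeof back) != 0)
        return -1;
    return strcmp(utf8, back) == 0;
}
```

A comment editor could refuse to save, or at least warn, when this
returns anything other than 1 for the comment being edited. What the
transliterated text actually looks like depends on the iconv
implementation and the locale.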

You might want versions of

int utf8_decode(const char *from, char **to, const char *encoding)
int utf8_encode(const char *from, char **to, const char *encoding)

that return:

0 - data was exactly converted
1 - data was approximately converted
2 - input was invalid
3 - unknown encoding

Both functions could make an attempt at converting the data even when
there is an error, though you probably only want to use that facility
when converting the data for display, if at all.
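
A minimal sketch of utf8_decode on top of iconv, using the return codes
listed above, might look like this. It rests on several assumptions:
the "//TRANSLIT" suffix (a GNU extension) is used for the approximate
pass, any failure of iconv_open is reported as an unknown encoding, and
any failure left after the approximate pass is reported as invalid
input. The convert helper is hypothetical. Unlike the suggestion above,
this version leaves *to untouched when it returns 2 or 3 instead of
attempting a partial conversion.

```c
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: one iconv pass over a NUL-terminated string.
 * Returns 0 on success (*to is malloc'd), -1 if the conversion cannot
 * be opened, -2 if the conversion itself fails. Only one NUL byte is
 * appended, so wide encodings such as UTF-16 are out of scope. */
static int convert(const char *tocode, const char *fromcode,
                   const char *from, char **to, size_t *inexact)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    char *in = (char *)from, *buf, *tmp;
    size_t inleft = strlen(from);
    size_t size = inleft * 4 + 16, used = 0;
    int flushing = 0;

    if (cd == (iconv_t)-1)
        return -1;
    if ((buf = malloc(size)) == NULL) {
        iconv_close(cd);
        return -2;
    }
    *inexact = 0;
    for (;;) {
        char *out = buf + used;
        size_t outleft = size - used - 1;   /* keep one byte for the NUL */
        size_t n = flushing ? iconv(cd, NULL, NULL, &out, &outleft)
                            : iconv(cd, &in, &inleft, &out, &outleft);

        used = (size_t)(out - buf);
        if (n != (size_t)-1) {
            if (flushing)
                break;
            *inexact += n;      /* number of non-reversible conversions */
            flushing = 1;       /* one more call to emit any shift state */
            continue;
        }
        if (errno != E2BIG) {   /* EILSEQ or EINVAL: give up */
            free(buf);
            iconv_close(cd);
            return -2;
        }
        size *= 2;              /* output buffer too small: grow it */
        if ((tmp = realloc(buf, size)) == NULL) {
            free(buf);
            iconv_close(cd);
            return -2;
        }
        buf = tmp;
    }
    buf[used] = '\0';
    iconv_close(cd);
    *to = buf;
    return 0;
}

/* Sketch only. Returns 0 exact, 1 approximate, 2 invalid input,
 * 3 unknown encoding. */
int utf8_decode(const char *from, char **to, const char *encoding)
{
    char translit[128];
    size_t inexact;
    int r = convert(encoding, "UTF-8", from, to, &inexact);

    if (r == -1)
        return 3;
    if (r == 0)
        return inexact ? 1 : 0;
    /* The exact pass failed: retry with transliteration (assumed). */
    if (snprintf(translit, sizeof translit, "%s//TRANSLIT", encoding)
            >= (int)sizeof translit)
        return 2;
    return convert(translit, "UTF-8", from, to, &inexact) == 0 ? 1 : 2;
}
```

utf8_encode would be the same thing with the two charset arguments to
convert swapped. The caller frees *to when the result is 0 or 1, and a
program saving data (as opposed to displaying it) would probably accept
only 0.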

Edmund
