[vorbis] UTF8_LANG: a much better idea

Glenn Maynard g_ogg at zewt.org
Thu Jan 10 03:06:49 PST 2002



I've found a much better solution; it's standard (in Unicode itself), 
simple and more flexible: Unicode language tagging.  It was made for just
this purpose, in fact.

A technical description is at http://www.unicode.org/unicode/reports/tr27/#tag
which, like all specs, makes it sound a bit more complicated than it
really is.

It comes down to this: mark the language of text with U+E0001 LANGUAGE
TAG, followed by the RFC 3066 language ID (e.g. "ja"), with each
lowercase ASCII character encoded as its ASCII value plus 0xE0000.
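
To make the encoding concrete, here is a minimal sketch in Python.  The
function names (language_tag, tag_text) are made up for illustration and
are not part of any Vorbis tool.

```python
LANGUAGE_TAG = "\U000E0001"   # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000          # added to each ASCII character of the language ID


def language_tag(lang_id: str) -> str:
    """Return the tag sequence for a language ID such as "ja"."""
    lang_id = lang_id.lower()
    # Only printable ASCII has a counterpart in the tag character block.
    if not all(0x20 <= ord(c) <= 0x7E for c in lang_id):
        raise ValueError("language ID must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_OFFSET + ord(c)) for c in lang_id)


def tag_text(lang_id: str, text: str) -> str:
    """Prefix text with the language tag for lang_id."""
    return language_tag(lang_id) + text


# "ja" becomes U+E0001 U+E006A U+E0061, followed by the text itself.
tagged = tag_text("ja", "こんにちは")
```

In UTF-8 each of these characters takes four bytes, so tagging a comment
as "ja" costs twelve bytes.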

There's really nothing needed in the spec, except to 1: recommend its
use, and 2: define where language tags go out of scope (at the end of
each comment tag; that is, a language tag shouldn't carry over from one
comment tag to the next.)

Programs which don't want to interpret this can simply ignore them;
they're zero-width, no-print characters intended to be ignorable.
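
A program that ignores them, or a display layer that can't render them
as zero-width, could simply drop the whole tag character block
(U+E0000 through U+E007F).  A rough sketch, with a made-up function
name:

```python
TAG_BLOCK_FIRST = 0xE0000
TAG_BLOCK_LAST = 0xE007F


def strip_language_tags(text: str) -> str:
    """Remove every character in the Unicode tag block from text."""
    return "".join(
        c for c in text
        if not TAG_BLOCK_FIRST <= ord(c) <= TAG_BLOCK_LAST
    )
```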

This also eliminates the major restrictions of UTF8_LANG; you can change
language if you want, wherever you want.
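
A program that does care about language could split a comment value into
runs, one per language.  The sketch below is an illustration only:
split_language_runs is a made-up name, and the handling of U+E007F
CANCEL TAG comes from Unicode rather than from anything proposed here.
Calling it once per comment value gives the "out of scope at the end of
each tag" behaviour for free, since the language starts out unset.

```python
LANGUAGE_TAG = 0xE0001
CANCEL_TAG = 0xE007F
TAG_OFFSET = 0xE0000


def split_language_runs(text):
    """Split text into (language, run) pairs; language is None if untagged."""
    runs = []
    lang = None
    buf = []

    def flush():
        # Emit the pending run, if any, under the current language.
        if buf:
            runs.append((lang, "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        cp = ord(text[i])
        if cp == LANGUAGE_TAG:
            flush()
            i += 1
            tag = []
            # Collect the tag characters that spell out the language ID.
            while i < len(text) and 0xE0020 <= ord(text[i]) <= 0xE007E:
                tag.append(chr(ord(text[i]) - TAG_OFFSET))
                i += 1
            lang = "".join(tag) or None
        elif cp == CANCEL_TAG:
            flush()
            lang = None
            i += 1
        else:
            buf.append(text[i])
            i += 1
    flush()
    return runs
```

For a value tagged "ja" and then "en" partway through, this returns two
runs, each paired with its language ID.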

(We were already able to use this, as it's part of Unicode; nobody would
use it if it weren't explicitly recommended, though.)

We could even make vcomment automatically add some of these tags.  If
the local encoding is Shift-JIS, we're pretty safe adding the Japanese
tag.  (The text might be English, or some other language written in the
Roman alphabet, but it's OK to display those in a Japanese font.)  Of
course, this should be optional, though I don't see any problem with it
defaulting to on.
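
Roughly what that guess could look like, as a sketch only: the table
and the function names below are hypothetical, not anything vcomment
actually has, and which encodings are safe to map is a judgment call.

```python
import codecs
from typing import Optional

# Hypothetical table: local encodings that strongly imply one language.
_ENCODING_TO_LANG = {
    "shift_jis": "ja",
    "euc_jp": "ja",
    "euc_kr": "ko",
}


def guess_language(encoding: str) -> Optional[str]:
    """Guess a language ID from a local encoding name, or return None."""
    try:
        # codecs.lookup() normalizes aliases such as "Shift-JIS" or "SJIS".
        name = codecs.lookup(encoding).name
    except LookupError:
        return None
    return _ENCODING_TO_LANG.get(name)


def auto_tag(text: str, encoding: str, enabled: bool = True) -> str:
    """Prefix text with a guessed language tag; on by default, but optional."""
    lang = guess_language(encoding)
    # Leave the text alone if disabled, unguessable, or already tagged.
    if not enabled or lang is None or text.startswith("\U000E0001"):
        return text
    prefix = "\U000E0001" + "".join(chr(0xE0000 + ord(c)) for c in lang)
    return prefix + text
```

An encoding like UTF-8 says nothing about language, so it maps to no
guess and the text is left untagged.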

"The use of Unicode language tags is encouraged in tag data.  Language
tags go out of scope at the end of each tag.  See
http://www.unicode.org/unicode/reports/tr27/#tag for a full description."

(This also means that we're back to all tags actually holding data that
the user is likely to care about directly.  UTF8_LANG is data that only
a program is likely to care about, and a program that simply displays
all tags is likely to make users go "what the heck is UTF8_LANG?")


-- 
Glenn Maynard
