[vorbis] UTF8_LANG: a much better idea
mdillon at standmed.com
Thu Jan 10 13:36:01 PST 2002
begin Glenn Maynard quotation:
> The original revision is in
> http://www.unicode.org/unicode/reports/tr7/tr7-1.html, from May 1998.
> It's trivial to strip for display, as well.
While this may be the case, that still requires recognizing those
characters as special, regardless of whether they are ignored or not.
Older Unicode implementations will treat those characters just as they
treat other unrecognized characters, which may result in '???' or some
such showing up on the screen (since we can't turn back time to make
such hypothetical implementations act differently). This may be a moot
point, if most implementations treat unrecognized characters as
zero-width and not affecting any other aspects of processing, but this
doesn't seem likely to me. If anyone knows otherwise, I'd be glad to
To quote from the referenced URL:
"The basic rules for Unicode conformance for the tag characters are
exactly the same as for any other Unicode characters. A conformant
process is not required to interpret the tag characters. If it does
interpret them, it should interpret them according to the standard, i.e.
as spelled-out tags. If it does not interpret tag characters, it should
leave their values undisturbed and do whatever it does with any other
uninterpreted characters.
So for a non-TagAware Unicode application, any language tag characters
(or any other kind of tag expressed with Plane 14 tag characters)
encountered would be handled exactly as for uninterpreted Tibetan from
the BMP, uninterpreted Linear B from Plane 1, or uninterpreted Egyptian
hieroglyphics from private use space in Plane 15.
A TagAware but TagPhobic Unicode application can recognize the tag
character range in Plane 14 and choose to deliberately strip them out
completely to produce plain text with no tags."
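
For illustration only, here is a rough Python sketch of the "TagAware but
TagPhobic" behaviour described in the quote, plus reading a spelled-out tag
back. The function names (is_tag_char, strip_tags, spell_out,
make_language_tag) are made up for this example and come from neither the
report nor any Vorbis code. It assumes the Plane 14 tag block is
U+E0000..U+E007F, that U+E0001 is the LANGUAGE TAG introducer, and that tag
characters U+E0020..U+E007E mirror ASCII 0x20..0x7E.

```python
TAG_BLOCK_START = 0xE0000   # first code point of the Plane 14 tag block
TAG_BLOCK_END = 0xE007F     # last code point of the block (CANCEL TAG)
LANGUAGE_TAG = 0xE0001      # introducer for a language tag
TAG_OFFSET = 0xE0000        # tag character = ASCII value + this offset


def is_tag_char(ch):
    """True if ch lies anywhere in the Plane 14 tag block."""
    return TAG_BLOCK_START <= ord(ch) <= TAG_BLOCK_END


def strip_tags(text):
    """TagPhobic handling: drop every tag character, keep everything else."""
    return "".join(ch for ch in text if not is_tag_char(ch))


def spell_out(text):
    """Map the ASCII-mirroring tag characters in text back to plain ASCII.

    The introducer (U+E0001) and CANCEL TAG (U+E007F) fall outside the
    mirrored range and are skipped, as are all non-tag characters.
    """
    return "".join(
        chr(ord(ch) - TAG_OFFSET)
        for ch in text
        if 0xE0020 <= ord(ch) <= 0xE007E
    )


def make_language_tag(lang):
    """Build a tag sequence for lang, e.g. "fr" (hypothetical helper)."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language id must be printable ASCII")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(c)) for c in lang)


# A comment value with a French language tag in front of the visible text.
tagged = make_language_tag("fr") + "bonjour"
plain = strip_tags(tagged)    # text with the three tag characters removed
language = spell_out(tagged)  # the spelled-out tag content
```

Note that strip_tags only helps an application that already knows about the
block. An implementation written before the tags existed never runs this
code, so it does whatever it does with any other character it does not
recognize.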