[vorbis] tags in comment field - why?
rillian
rillian at telus.net
Sat Dec 29 20:07:16 PST 2001
On Thursday, December 27, 2001, at 05:01 PM, Glenn Maynard wrote:
> I said that ID3V2 fixed ID3V1's major limitations. I'm saying that we
> shouldn't use a format that has the major limitations the current
> proposals do, as to fix the limitations the format will need to be
> completely replaced.
Hmm. Maybe I missed some of this. The short answer here is that the
Vorbis tag format was intended to be quick and informal, and is clearly
labelled as such. As is clear from the volume of argument, doing
metadata properly is *hard* and a separate problem.
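
To show just how quick and informal the tag format is, here is a minimal
sketch of a reader in Python. It assumes the layout given in the Vorbis
comment spec (little-endian 32-bit lengths, a vendor string, then a list of
UTF-8 NAME=value entries) and that the packet-type byte and 'vorbis' magic
have already been stripped. The function names are illustrative only and
do not come from any library.

```python
import struct


def parse_comment_body(payload: bytes):
    """Parse a Vorbis comment header body into (vendor, [(NAME, value), ...]).

    Assumes the packet-type byte and the 'vorbis' magic were already removed,
    and ignores the trailing framing bit.
    """
    pos = 0
    (vendor_len,) = struct.unpack_from("<I", payload, pos)
    pos += 4
    vendor = payload[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len

    (count,) = struct.unpack_from("<I", payload, pos)
    pos += 4

    tags = []
    for _ in range(count):
        (length,) = struct.unpack_from("<I", payload, pos)
        pos += 4
        entry = payload[pos:pos + length].decode("utf-8")
        pos += length
        # Split on the first '=' only; the value may itself contain '='.
        name, _sep, value = entry.partition("=")
        # Field names are case-insensitive, so normalise them.
        tags.append((name.upper(), value))
    return vendor, tags


def build_comment_body(vendor: str, tags) -> bytes:
    """Inverse of parse_comment_body, handy for round-trip checks."""
    out = bytearray()
    raw_vendor = vendor.encode("utf-8")
    out += struct.pack("<I", len(raw_vendor)) + raw_vendor
    out += struct.pack("<I", len(tags))
    for name, value in tags:
        entry = f"{name}={value}".encode("utf-8")
        out += struct.pack("<I", len(entry)) + entry
    return bytes(out)
```

There is no typing, nesting, or per-field language information here, which
is roughly where the "quick and informal" description comes from.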
>> speak for Monty on the original decision but using XML is overkill.
>
> Why? XML is simpler than defining your own data format from the ground
> up.
We've always said we'd like to have a separate metadata format, one that
does its best to be all things to all people. Where that belongs is in
a separate logical Ogg bitstream, mixed in with the Vorbis data. I think
XML is great too, and we've had many arguments. What we really need is a
good sane implementation of something. :)
One route I've been pointing out for the past year is to just import the
MusicBrainz format. I think it needs some work from the design point of
view, but there's a ready implementation of the parser and an
established database to query.
Order switched for topicality:
> On Thu, Dec 27, 2001 at 01:56:45PM +0200, Beni Cherniavksy wrote:
>> Erh... Good point. That's a question to Unicode, though. Why did they
>> do it this way? I thought there is single glyph per unicode character
>> but
>
> It's a matter of HAN unification; I don't understand the issue quite
> well enough to explain it, but you need to know the language of the text
> to know which font to use.
My (only moderately informed) understanding is that various languages
share a set of ideographic characters. Their usage has diverged somewhat
since, and so there are differences in the details of how some
characters are rendered, even if they're derived from the same root.
Analogies for Roman script might be having the wrong diacritics for your
language on a bunch of the letters, or how someone used to blackletter
would have viewed a Bible set in italic type. You can figure out what it
says, but it hardly qualifies as proper text display. The references
Glenn gave explain this pretty well.
> http://www.unicode.org/unicode/faq/han_cjk.html
> http://www.cs.ruu.nl/~otfried/Mule/unihan.html
The Unicode people did things this way to try and reduce the number of
characters in the set. (They were originally trying--foolishly--to fit
everything into 64k code points, i.e. 16 bits.) And because character
variants that share a codepoint have similar meanings, it helps a lot
with, for example, parsing and translation. Of course what's a stylistic
difference and what's a distinct character is subjective, so one can
find plenty of inconsistencies.
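
A small Python sketch of what this means for tags, under some assumptions:
I use U+76F4, a commonly cited character that is drawn differently in
Chinese and Japanese typography, and a hypothetical TaggedText type with
language tags chosen for illustration. Nothing here is part of any
proposal. The point is only that the encoded string is identical in both
cases, so the language has to travel alongside the text.

```python
import unicodedata
from dataclasses import dataclass

BMP_SIZE = 2 ** 16  # the "64k" code points of the original 16-bit design


def describe(ch: str):
    """Return (code point label, Unicode name, fits-in-16-bits) for one character."""
    cp = ord(ch)
    return f"U+{cp:04X}", unicodedata.name(ch), cp < BMP_SIZE


def code_points(text: str):
    """List the code points of a string as U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in text]


@dataclass(frozen=True)
class TaggedText:
    """Hypothetical container: text plus the language needed to pick a font."""
    text: str
    lang: str


# One unified code point, two intended renderings.
ja = TaggedText("\u76f4", "ja")
zh = TaggedText("\u76f4", "zh")

same_encoding = code_points(ja.text) == code_points(zh.text)
same_meaning_for_display = ja == zh

print(describe("\u76f4"))
print(same_encoding, same_meaning_for_display)
```

The two values compare equal as strings and differ only in the language
tag, which is exactly the information a plain UTF-8 comment field does not
carry.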
FWIW,
-r