[vorbis-dev] comment field proposal

Mark B. Elrod elrod at liquidmetal.com
Fri May 12 14:36:17 PDT 2000



I personally think it is a mistake to place any metadata in the comment section. It will be analogous to id3v1 and id3v2 tags on the mp3 side... even though people are moving to id3v2 we have to write out both to make sure we are readable by most players.  Metadata should wait until it can be done correctly through the XML stream definition.

elrod

Monty wrote:

> Well, this is a bit stronger than a proposal; this is "what I plan to do unless
> people see obvious flaws I missed"...
>
> The text comment header is the second (of three) header packets that begin a Vorbis bitstream.  It is meant for short text comments, not arbitrary metadata; arbitrary metadata will be put in a metadata stream, likely an XML stream type.  We've discussed this at length, several times :-)
>
> The comment header is a list of eight-bit-clean vectors; the number of vectors is bounded to 2^32-1 and the length of each vector is limited to 2^32-1 bytes, since both are encoded in 32 bits.  The vector length is encoded; the vector is not null terminated.  In addition to the vector list, there is a single vector for vendor name (also 8 bit clean, length encoded in 32 bits).  Libvorbis currently sets the vendor string to "Xiphophorus libVorbis I 20000508".
>
> (note: although the vector space in the ogg format is 8 bit clean, libvorbis currently assumes during encoding that the comments submitted for encapsulation are C style strings)
>
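A minimal sketch of the layout described above, in Python for brevity. The byte order (little-endian), the placement of the vendor vector ahead of the vector count, and the names pack_comment_header / unpack_comment_header are assumptions made for illustration; the description only says that counts and lengths are encoded in 32 bits. The packet type and any trailing framing are left out.

```python
import struct


def pack_comment_header(vendor: bytes, comments: list[bytes]) -> bytes:
    """Serialize a vendor vector followed by a length-prefixed vector list.

    Assumed layout (not confirmed by the text above):
    u32 vendor length, vendor bytes, u32 vector count,
    then for each vector: u32 length, raw bytes.
    """
    parts = [struct.pack("<I", len(vendor)), vendor]
    parts.append(struct.pack("<I", len(comments)))
    for comment in comments:
        # Length is stored explicitly; nothing is null terminated.
        parts.append(struct.pack("<I", len(comment)))
        parts.append(comment)
    return b"".join(parts)


def unpack_comment_header(data: bytes) -> tuple[bytes, list[bytes]]:
    """Inverse of pack_comment_header; raises ValueError on truncated input."""
    offset = 0

    def read_u32() -> int:
        nonlocal offset
        if offset + 4 > len(data):
            raise ValueError("truncated length field")
        (value,) = struct.unpack_from("<I", data, offset)
        offset += 4
        return value

    def read_vector() -> bytes:
        nonlocal offset
        length = read_u32()
        if offset + length > len(data):
            raise ValueError("truncated vector")
        vector = data[offset:offset + length]
        offset += length
        return vector

    vendor = read_vector()
    count = read_u32()
    comments = [read_vector() for _ in range(count)]
    return vendor, comments
```

Because every vector carries its own length, embedded zero bytes survive a round trip, which is what "8 bit clean" buys over C style strings.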
> Libvorbis comments are 'unstructured', so it's time to impose a little
> convention before things get out of hand.  Given that the comments are meant
> for *simple*, *short* fields (think 'title', 'artist', etc), the structure
> should be simple.  I say we pattern this after a simple UNIX style environment
> array with common 'variable' names agreed upon ahead of time.
>
> That is, fields look like:
>
> comment[0]="ARTIST=me";
> comment[1]="TITLE=the sound of vorbis";
>
> For the sake of completeness, I'm proposing:
>
> A case-insensitive field name that may consist of ASCII 0x20 through 0x7D, 0x3D ('=') excluded.  ASCII 0x41 through 0x5A inclusive (A-Z) is to be considered equivalent to ASCII 0x61 through 0x7A inclusive (a-z).
>
> The field name is immediately followed by ascii 0x3D ('='); this equals sign is
> used to terminate the field name.
>
> 0x3D is followed by 8 bit clean field contents to the end of the field.
>
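The three rules above could be checked roughly as follows. This is a sketch only: parse_comment_field is a hypothetical helper name, folding names to upper case is an arbitrary choice (the proposal only says A-Z and a-z are equivalent), and whether an empty field name is legal is not addressed above, so it is accepted here.

```python
def parse_comment_field(field: bytes) -> tuple[str, bytes]:
    """Split one comment vector into (normalized field name, raw contents)."""
    # The first 0x3D ('=') terminates the name; later '=' bytes belong
    # to the contents, which are kept as raw 8 bit clean bytes.
    name, separator, contents = field.partition(b"=")
    if not separator:
        raise ValueError("no '=' terminating the field name")
    for byte in name:
        # Names are limited to ASCII 0x20 through 0x7D; 0x3D cannot
        # occur here because the split already consumed the first one.
        if not 0x20 <= byte <= 0x7D:
            raise ValueError(f"byte 0x{byte:02X} not allowed in a field name")
    # Treat a-z and A-Z as equivalent by folding to upper case.
    return name.decode("ascii").upper(), contents
```

With that helper, parse_comment_field(b"title=the sound of vorbis") yields the name "TITLE" and the contents b"the sound of vorbis", so a lookup by name does not depend on the case the encoder happened to use.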
> Implications: field names should not be 'internationalized'; this is a
> concession to simplicity, not an attempt to piss off the majority of the world
> that doesn't speak English.  Field *contents*, however, should be
> internationalizable... suggestions on the proper encoding for that?
>
> We have the length of the entirety of the field and restrictions on the field
> name so that the field name is bounded in a known way.  Thus we also have the
> length of the field contents.
>
> Individual 'vendors' may use non-standard field names within reason.  The
> proper use of comment fields should be clear through context at this point.
> Abuse will be discouraged.
>
> Now all we need is a list of 'conventional' field names.  A stream is not required to use any/all of these field names; they're suggested for interoperability.  The suggestions below are also biased toward contemporary music album usage; analogous use for non-music albums should be easy enough for people to figure out on their own...
>
> TRACK
> ALBUM
> ARTIST
> LABEL
> CONTENT
>
> (so there's the seed of a list.  Please submit obvious ones I've forgotten...)
>
> Monty
>
> --- >8 ----
> List archives:  http://www.xiph.org/archives/
> Ogg project homepage: http://www.xiph.org/ogg/



