[vorbis-dev] Metadata Schema

rillian rillian at telus.net
Thu May 31 11:51:26 PDT 2001



On Thursday, May 31, 2001, at 08:37, Jon Doda wrote:

> There's been talk off and on about an xml metadata format, but 
> nothing's ever really been done.  In the interest of giving it a kick in 
> the pants I've decided to sit down and write a preliminary Schema for 
> it.  I've got a few other things to get off my plate first, so I can't 
> start on it right away, but I did want to begin collecting information 
> on what people want to see in the format.  If you've got any great 
> ideas or burning desires for what you'd like to see in the format then 
> please share them.

Thanks for taking a crack at this. :-)

There are a couple of classes of things people have requested. Generally 
you can look at the id3 mess and see everything someone would like to 
have. In the interests of sanity, I'd divide it this way:

Kitchen-sink metadata--everything you might want to record about the 
file. Who played what, producers, engineers, who the band thanked, etc. 
MusicBrainz is a good place to start for this. The dublin core group is 
finally reaching some consensus on how to do dc metadata in rdf/xml as 
well, so that's a good place to look for format.

Lyrics, tablature, beat and chord markup, dynamics, and that sort of 
thing. In id3 (and MusicBrainz) these are just another data chunk, but 
in the context of ogg, I think they belong in separate substreams. The 
criterion I finally arrived at for the distinction is whether it's 
something that *might* have timecode information associated with it. And 
presumably the method of timecoding would be generalized over all these 
subtypes so encoders/servers can sync them without knowing the details 
of the particular schema.
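
A rough Python sketch of that idea, under the assumption that every 
substream packet carries a common timecode next to an opaque payload. 
The TimedPacket type, its field names, and the interleave() helper are 
hypothetical illustrations, not part of any ogg spec:

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class TimedPacket(NamedTuple):
    time_ms: int      # generic timecode, shared by every subtype
    stream: str       # which substream the packet belongs to
    payload: bytes    # schema-specific data, never inspected here


def interleave(*streams: Iterable[TimedPacket]) -> Iterator[TimedPacket]:
    """Merge already time-ordered substreams by timecode only."""
    return heapq.merge(*streams, key=lambda p: p.time_ms)


lyrics = [TimedPacket(0, "lyrics", b"<line>la</line>"),
          TimedPacket(1500, "lyrics", b"<line>la la</line>")]
chords = [TimedPacket(0, "chords", b"<chord>C</chord>"),
          TimedPacket(1000, "chords", b"<chord>G</chord>")]

for packet in interleave(lyrics, chords):
    print(packet.time_ms, packet.stream)
```

The merge looks only at time_ms, so a server could sync lyrics, chords, 
or any later subtype without parsing a single payload.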

People want to associate images, e.g. for slideshows or to include cover 
art. In the latter case, separate files on disk make more sense, but in 
a streaming context, embedding them is perfectly reasonable. My proposal 
has always been to use mng in a separate substream. It's an open, 
patent-free, 
finished spec that provides a superset of both png and jpeg, and allows 
timed animation like we'd want for slideshows. Plus, when we get to 
video, it's a good choice for dvd-style overlays.

Finally, we want some sort of table-of-contents format to help the 
player sort through the plethora of streams. This is especially 
important for video, where we might have multiple audio and video 
streams, graphic overlays, and so on in addition to the metadata. 
Everyone but me wants something easier to parse than xml for this.

One of the hard parts with the metadata is considering how to make 
things work for video as well. For example, you want to encapsulate the 
entire credits listing for a film, but they're quite varied, and there 
are quirky issues like order mattering.

Likewise, for lyrics, it would be great if that could be used as well 
for subtitles and karaoke. Some time ago, I proposed a transcript format 
(mostly by example, there's no formal schema) that tried to handle all 
that:

     http://www.xiph.org/archives/vorbis/200008/0082.html

As I say at the top of the post, the part I'm not really happy with is 
being able to use the same format for both subtitle display and as a 
complete, stand-alone document. Unfortunately, theater and screenplay 
formats have considerably more structure than a simple lyrics listing. 
Comments on this would be very welcome.

> As an aside, if it's in a separate stream it really ought to have an 
> official name.  In keeping with the general naming scheme for Ogg 
> codecs, and to honour a great, recently departed science fiction author 
> I'd like to propose the name "Prefect".

Appropriately obscure. :)

Note that the type detection is a bit trickier if you're going to use 
xml. My general thought was to do magic detection on the initial 
'<?xml', since that's the only part guaranteed to be at a given offset. 
Pass that to a general xml 'codec' that can read the 
doctype/dtd/schema/top-level tag and decide if/how to really parse it.
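
A minimal Python sketch of that two-stage detection, assuming the first 
packet of the substream holds the start of the document. The function 
names, the HANDLERS table, and the 'transcript' root tag are hypothetical, 
and only the top-level tag is inspected here, not the doctype or dtd:

```python
import xml.etree.ElementTree as ET

XML_MAGIC = b"<?xml"  # the only bytes guaranteed to sit at a fixed offset


def looks_like_xml(first_packet: bytes) -> bool:
    """Stage 1: cheap magic check on the first bytes of the stream."""
    return first_packet.startswith(XML_MAGIC)


def top_level_tag(document: bytes):
    """Stage 2: the generic xml 'codec' reads just far enough to see the root tag."""
    parser = ET.XMLPullParser(events=("start",))
    parser.feed(document)
    for _event, element in parser.read_events():
        return element.tag  # the first start event is the root element
    return None


# Hypothetical mapping from root tag to a schema-specific handler name.
HANDLERS = {"transcript": "lyrics/subtitle handler"}


def detect(first_packet: bytes):
    """Return a handler name, or None if this is not an xml stream at all."""
    if not looks_like_xml(first_packet):
        return None
    return HANDLERS.get(top_level_tag(first_packet), "unknown xml, skip")
```

A player that only knows the magic check can still recognize the stream 
as xml and skip it cleanly when the root tag is unfamiliar.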

Likewise with packetization, since xml isn't designed to do that.

I've been assured that one can write a small, limited parser for a 
specific schema very easily, but others have said, no, at least not 
while still allowing the full xml spec. Is that something you can comment 
on? I don't really know the spec well enough to have a separate opinion, 
but codesize is a big issue for a lot of player implementations.

Anyway, that's my usual summary,
  -ralph
