[ogg-dev] The use for an XML based metadata format

Daniel Aleksandersen aleksandersen+xiphlists at runbox.com
Sun Sep 9 14:11:26 PDT 2007

On 2007-09-09, Silvia wrote:
> Hi Daniel,
> > I really have to ask you: Have you even tried to describe media using
> > the existing solutions? I don't mean adding subtitles and editing
> > stuff. I mean really saying what an Ogg file contains. There are no
> > working wheels for this. Vorbis comments--used in FLAC too--can describe
> > content with a very limited set of field names (and badly enforced
> > standards).
> I have and I know they are. I am not saying your work is not needed. I
> am just trying to be careful how to integrate it. Don't get me wrong:
> I like what you do and it is necessary!

It is indeed necessary. I hope this format will be a huge leap in metadata 
descriptions for media content. Not only for music, but any media found in 

> All I did was propose a process that can lead to an integration into
> Ogg. Replacing vorbiscomment is one big request that requires a lot of
> software to change. So, you will need a lot of support from the
> community to go down that path. Only if everybody agrees that this is
> necessary will it happen. So, pushing you to the edge and forcing you
> to defend why is a constructive way for the community to learn. Don't
> give up. :-)

I just thought you were practising the Law of Jante (look it up on Wikipedia). 
Sorry. ^^

> > On 2007-09-09, Silvia wrote:
> > > Daniel,
> > >
> > > these are all good ideas and worth progressing. However, it may be
> > > better not to merge too many goals in one format (MPEG-7 did that and
> > > ended up as a big mess). So, I suggest to start by structuring the
> > > types of things you want - then finding out which parts belong where
> > > into existing formats such as vorbis comment, Skeleton and CMML, and
> > > only then start to develop a new format.
> > >
> > > For example: the relationships between the logical bitstreams is a
> > > very semantic description - it needs to be broad enough to enable
> > > different types of applications to do different things with it. E.g.
> > > a video editor will need to know that there are 3 audio channels in a
> > > file and how they overlap each other and also the video channel,
> > > while in contrast a speech recognizer might just want to be able to
> > > know about the one audio channel in there that is speech and a music
> > > player would be totally ignorant of the video channel. Just solving
> > > this generically would be a big feat. It would possibly need to find
> > > a place in skeleton.
> > >
> > > A similar argument goes for the encoding quality description and
> > > digital rights.
> >
> > The rights element was taken directly from the Atom 1.0 specification,
> > but with an added 'date' attribute to make this data stand out
> > from the text.
> Good. So the rights element is in pretty good shape. Now we need to
> figure out what the atomic element of attribution of the rights
> element should be and then put it in the right location. In today's
> age of mash-ups, should there be a rights element on a file section
> basis? Or just on a per-track basis? Atom does it only on a per-file
> basis IIRC.

As I have intended it, the rights element would appear on audio, video, 
text, and image elements. There is no limit to how many audio elements 
you have or how many rights children those elements have. So I guess this 
example would explain everything better:

<?xml ...><metadata ...>
<audio type="audio/flac+lossless" oggserial="somethingsomething">
        <rights date="2019">℗ 2019 Harvey the Wonder</rights></audio>
<audio type="audio/flac+lossless" oggserial="somethingelse">
        <rights date="2020">℗ 2020 Your Mother</rights></audio></metadata>
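
A minimal sketch of how a consumer might read per-stream rights from such a 
document, assuming Python's standard xml.etree.ElementTree. The element and 
attribute names follow the example above; the serial values, the rights 
holders, and the helper name rights_by_stream are placeholders made up for 
illustration, and nothing here is an agreed format.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed document shaped like the example above.
# Serial values and rights holders are placeholders.
DOC = """<metadata>
  <audio type="audio/flac+lossless" oggserial="serial-a">
    <rights date="2019">(P) 2019 Example Holder A</rights>
  </audio>
  <audio type="audio/flac+lossless" oggserial="serial-b">
    <rights date="2020">(P) 2020 Example Holder B</rights>
    <rights date="2021">(C) 2021 Example Holder C</rights>
  </audio>
</metadata>"""


def rights_by_stream(xml_text):
    """Map each audio element's oggserial to its (date, text) rights entries."""
    root = ET.fromstring(xml_text)
    result = {}
    for audio in root.findall("audio"):
        serial = audio.get("oggserial")
        result[serial] = [
            (r.get("date"), (r.text or "").strip())
            for r in audio.findall("rights")
        ]
    return result


if __name__ == "__main__":
    for serial, entries in rights_by_stream(DOC).items():
        for date, text in entries:
            print(serial, date, text)
```

Because rights hang off each audio element, a file assembled from several 
sources keeps a separate statement per stream, and one stream may carry more 
than one.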

> > > In contrast, the improved description of the content as in: artist,
> > > band name, title, organisation involved, and people involved are
> > > things that improve upon vorbiscomment and should probably be
> > > included there directly.
> >
> > No. I have thought *a lot* about this. You can improve Vorbis comment to
> > some extent. But it is way too limited. Try describing this with Vorbis
> > comments and you will see my point:
> > <person role="vocal instrument" uri="http://person.no/"
> > xml:lang="nb-NO">Person Peopleson</person>
> So we're looking at replacing vorbiscomment with an XML version?


To me, Vorbis comments are insufficient. Ask anyone playing in an ensemble 
if they would like recognition. Ask the record label holding the rights to 
a song. Ask the music geek that wants his music collection sorted by 
something other than artist, album, or release year. They would all prefer 
this to Vorbis comments.
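
To make the comparison concrete, here is a rough sketch, again assuming 
Python's standard xml.etree.ElementTree. The PERFORMER line is only an 
illustration of a flat NAME=value comment, and the helper name 
describe_person is made up for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# A flat comment carries one field name and one value, nothing more.
VORBIS_COMMENT = "PERFORMER=Person Peopleson"

# The person element quoted earlier in this thread, unchanged.
PERSON = ('<person role="vocal instrument" uri="http://person.no/" '
          'xml:lang="nb-NO">Person Peopleson</person>')


def describe_person(xml_text):
    """Return the name plus the structured attributes of a person element."""
    elem = ET.fromstring(xml_text)
    return {
        "name": elem.text,
        "role": elem.get("role"),
        "uri": elem.get("uri"),
        "lang": elem.get(XML_LANG),
    }


if __name__ == "__main__":
    field, _, value = VORBIS_COMMENT.partition("=")
    print(field, "->", value)        # only the name is available
    print(describe_person(PERSON))   # role, URI and language are available too
```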

> How intensively have you looked at improving/extending CMML? Subtitles
> for one would definitely be better off being added to CMML than any other
> place.

The idea is to describe the content in a way useful to software like media 
management suites (iTunes, iPhoto, Amarok, Banshee, ...) and players (VLC, 
KMPlayer, QuickTime, ...), to search engines (any kind of desktop search, 
Google, on the Gnutella network, ...), and so on. Basically: Pure metadata. 
I have no intention of making the format useful in other areas. However, I 
am not God and do not think I am either. If the format can be made useful in 
other areas, it should be made that way. But for now I am only interested 
in good metadata.

For subtitles, the goal would be to describe that the text with 
oggserial/oggid this and that is a subtitle; and describe who made it, what 
language it is written in, if it's friendly to deaf people, and so on.
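
As a rough sketch of what such a description could look like, assuming 
Python's standard xml.etree.ElementTree: only the text, person, oggserial and 
xml:lang names appear in this thread. The 'kind' and 'hearing-impaired' 
attributes, the 'subtitler' role and the helper name describe_subtitle are 
hypothetical and only there to make the example complete.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def describe_subtitle(oggserial, lang, author, hearing_impaired):
    """Build a text element describing one subtitle stream.

    'kind', 'hearing-impaired' and the 'subtitler' role are hypothetical
    names chosen for this sketch, not part of any proposal.
    """
    text = ET.Element("text", {
        "oggserial": oggserial,
        "kind": "subtitle",  # hypothetical attribute
        "hearing-impaired": "true" if hearing_impaired else "false",  # hypothetical
        XML_LANG: lang,
    })
    person = ET.SubElement(text, "person", {"role": "subtitler"})  # hypothetical role
    person.text = author
    return text


if __name__ == "__main__":
    elem = describe_subtitle("somethingsomething", "nb-NO",
                             "Person Peopleson", True)
    print(ET.tostring(elem, encoding="unicode"))
```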

I am asking for help with the development. Any contributions, suggestions, 
and thoughts are welcome. Even improving the markup and language on the 
wiki page would help!

I am probably not the right person to lead this. But I am apparently the 
only one motivated enough to do it. So if anyone else is unemployed at the 
moment... do help. Everyone else is free to contribute as well! ;-)
Daniel Aleksandersen
