[ogg-dev] The use for an XML based metadata format

Daniel Aleksandersen aleksandersen+xiphlists at runbox.com
Sun Sep 9 11:52:34 PDT 2007

Hi Silvia,

I really have to ask you: Have you even tried to describe media using the 
existing solutions? I don't mean adding subtitles and editing stuff. I mean 
really saying what an Ogg file contains. There are no working wheels for this. 
Vorbis comments--used in FLAC too--can describe content with only a very 
limited set of field names (and badly enforced standards).

I said from the start that I have no idea of what I am doing. This was an 
initiative to make a broad metadata format for describing content. I have 
no doubt others can do it better. This was just a suggestion to make the 
actual developer types on this list think about how it could be done.

More below.

On 2007-09-09, Silvia wrote:
> Daniel,
> these are all good ideas and worth progressing. However, it may be
> better not to merge too many goals in one format (MPEG-7 did that and
> ended up as a big mess). So, I suggest to start by structuring the
> types of things you want - then finding out which parts belong where
> into existing formats such as vorbis comment, Skeleton and CMML, and
> only then start to develop a new format.
> For example: the relationships between the logical bitstreams is a
> very semantic description - it needs to be broad enough to enable
> different types of applications to do different things with it. E.g. a
> video editor will need to know that there are 3 audio channels in a
> file and how they overlap each other and also the video channel, while
> in contrast a speech recognizer might just want to be able to know
> about the one audio channel in there that is speech and a music player
> would be totally ignorant of the video channel. Just solving this
> generically would be a big feat. It would possibly need to find a
> place in skeleton.
> A similar argument goes for the encoding quality description and digital
> rights.

The rights element was taken directly from the Atom 1.0 specification, but 
with an added ‘date’ attribute to make this data stand out from the 

> In contrast, the improved description of the content as in: artist,
> band name, title, organisation involved, and people involved are
> things that improve upon vorbiscomment and should probably be included
> there directly.

No. I have thought *a lot* about this. You can improve Vorbis comments to some 
extent. But they are way too limited. Try describing this with Vorbis comments 
and you will see my point:
<person role="vocal instrument" uri="http://person.no/" 
xml:lang="nb-NO">Person Peopleson</person>
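
To illustrate the point, here is a minimal sketch, assuming Python's standard 
xml.etree.ElementTree module. It parses the element above and then flattens 
it into Vorbis-comment-style KEY=value strings. PERFORMER is one of the field 
names suggested by the Vorbis comment documentation; PERFORMER_ROLE, 
PERFORMER_URI and PERFORMER_LANG are made-up names used only for this 
illustration, as are the helper functions.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang under the predefined XML namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

SNIPPET = (
    '<person role="vocal instrument" uri="http://person.no/" '
    'xml:lang="nb-NO">Person Peopleson</person>'
)


def parse_person(xml_text):
    """Return the name and attributes of a <person> element as a dict."""
    elem = ET.fromstring(xml_text)
    return {
        "name": elem.text,
        "role": elem.get("role"),
        "uri": elem.get("uri"),
        "lang": elem.get(XML_NS + "lang"),
    }


def flatten_to_comments(people):
    """Flatten people into a flat list of KEY=value strings.

    Only PERFORMER is a documented Vorbis comment field name; the other
    keys are hypothetical and no existing software would look for them.
    """
    comments = []
    for person in people:
        comments.append("PERFORMER=" + person["name"])
        comments.append("PERFORMER_ROLE=" + person["role"])
        comments.append("PERFORMER_URI=" + person["uri"])
        comments.append("PERFORMER_LANG=" + person["lang"])
    return comments


if __name__ == "__main__":
    person = parse_person(SNIPPET)
    for line in flatten_to_comments([person, person]):
        print(line)
```

With two or more people the flat list repeats the same keys, and the only 
thing tying a PERFORMER_URI to its PERFORMER is the order of the lines, which 
is a convention a reader would have to guess. In the XML form the attributes 
stay attached to the person they describe.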

> All I ask for is *not* to reinvent the wheel when there are already
> working, semi-complete metadata formats for Ogg that have been
> carefully prepared to fit with the existing Ogg framework. It would be
> a sheer nightmare to create another new one that does not fit with any
> of the existing ones and is not supported by any media application.
> OTOH, we could undertake this cleaning exercise also at the end of
> your process when you have all the fields together that you're after.
> We would then sit down and discuss where they are best suited, if
> you prefer that. This should be made clear though.

I have just tried working on the audio (music, speeches, farting sounds, 
whatever, ...) part. Think about subtitles in the text element and such. 
The person who wrote the text could actually be attributed. Media manager 
software could search for movies by subtitle author, for instance.
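
As a rough sketch of what such a search could look like, assume each file 
carries a small XML description in which a text element contains person 
elements. The media wrapper element, the role value "author", the file names 
and the person names below are all invented for this example and are not 
taken from any specification. The code uses Python's standard 
xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: file name -> XML description of that file.
CATALOGUE = {
    "movie-a.ogg": (
        "<media><text>"
        '<person role="author">Kari Nordmann</person>'
        "</text></media>"
    ),
    "movie-b.ogg": (
        "<media><text>"
        '<person role="author">Ola Nordmann</person>'
        "</text></media>"
    ),
    "song-c.ogg": (
        "<media><audio>"
        '<person role="vocal">Kari Nordmann</person>'
        "</audio></media>"
    ),
}


def files_by_subtitle_author(catalogue, author):
    """Return the sorted names of files whose text element credits `author`."""
    matches = []
    for filename, xml_text in catalogue.items():
        root = ET.fromstring(xml_text)
        # Only look at people attributed inside the <text> element.
        for person in root.findall("./text/person[@role='author']"):
            if (person.text or "").strip() == author:
                matches.append(filename)
                break
    return sorted(matches)


if __name__ == "__main__":
    print(files_by_subtitle_author(CATALOGUE, "Kari Nordmann"))
```

Because the person sits inside the text element, the search can tell a 
subtitle author apart from a vocalist with the same name in the audio part.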

I do apologise for my tone in this email. I think I got -a little- very 
upset by you not seeing how limited the existing solutions are.

> On 9/9/07, Daniel Aleksandersen <aleksandersen+xiphlists at runbox.com> wrote:
> > On 2007-09-09, Silvia wrote:
> > > Daniel,
> >
> > Hi Silvia,
> >
> > I realise I should have started with this. I got a little carried away
> > with my ideas. Apparently I am no good when it comes to sharing an
> > idea.
> >
> > Short answer: The format should describe media content and the relations
> > between them in an Ogg stream. Intended usage is media management and
> > sorting through search and media manager software.
> >
> > Long answer: See below.
> >
> > > before you step over everything that has been done before, we need to
> > > determine what exactly is the use case for your new specification.
> > >
> > > As far as metadata is concerned, we currently have:
> > >
> > > * vorbiscomment - this is a header at the beginning of a logical
> > > bitstream which has metadata that refers to the complete file; there
> > > is a specification, which has been public for a long time and is the
> > > de-facto standard that is (or should be) used by all software (see
> > > http://xiph.org/vorbis/doc/v-comment.html)
> > >
> > > * cmml - this is a logical bitstream for time-continuous textual
> > > annotations (metadata) for ogg files (see
> > > http://wiki.xiph.org/index.php/CMML)
> > >
> > > * skeleton - this is an extension to the ogg bitstream format, which
> > > has all the encapsulation-specific low-level metadata (see
> > > http://wiki.xiph.org/index.php/Ogg_Skeleton)
> > >
> > > All of these are supported by xiph and may need further
> > > work/extensions or potentially a replacement if they are not fit to
> > > provide what is required.
> > >
> > > Before throwing out more random specifications, could we please look
> > > at what you are trying to achieve with the new format? Can you tell
> > > us where the existing technologies are lacking?
> >
> > What I want is a format to give a detailed description of the content
> > in an Ogg stream. The usage would be improved searchability on local
> > machines (possibly even on the web and file sharing clients too) and
> > sorting in media management software such as Apple iPhoto, Amarok, and
> > WinAmp.
> >
> > Currently only Vorbis comments describe the content. What I aim to do is
> > replace Vorbis comments. Vorbis comments are limited to a few
> > field names for describing content. There is only a poorly developed
> > look-a-like standard for describing audio files; and all other media
> > formats are left alone. End users may indeed slap on additional field
> > names, but no media management software or search engine knows to look
> > for them.
> >
> > Another thing this format describes is relations between media in an
> > Ogg stream. See the audio:collection:artwork element for instance.
> > (Imagine an audio:lyrics element too.)
> >
> > This random specification was intended to start development for a real
> > metadata/content description format. This XML based thing I have put
> > together in a few hours might not be the best. But it does provide a
> > better way to detail describe
> >
> >
> > I have no doubt that others can do this better. But as no one seemed to
> > be working on a description format, I took it upon myself to start
> > working on *something*.
> >
> > Hope this clarifies things.
Daniel Aleksandersen
