[ogg-dev] The use for an XML based metadata format

Ralph Giles giles at xiph.org
Wed Sep 12 11:54:45 PDT 2007

On Mon, Sep 10, 2007 at 01:19:05PM +0100, Ian Malone wrote:

> as artist=(composer) Beethoven), I think cast lists for films present
> a similar problem.  There is consistency and indexability to be
> addressed (Ludvig van Beethoven; Beethoven, Ludvig van;
> Beethoven).

ID3 has a concept of "sort" tags, which provide a string for sorting 
purposes that differs from the (presumably full) name in the usual 
tag. TSOP="Beethoven", TCOM="Ludwig van Beethoven".
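
As a rough sketch of the sort-tag idea in Python: the record keys "name"
and "sort_name" are hypothetical stand-ins for a display tag and its sort
counterpart, not actual ID3 frame names, and nothing here parses ID3.

```python
# Hypothetical records: "name" is the display string, "sort_name" the
# optional sort string. Neither key is a real ID3 frame identifier.
composers = [
    {"name": "Ludwig van Beethoven", "sort_name": "Beethoven"},
    {"name": "Johann Sebastian Bach", "sort_name": "Bach"},
    {"name": "Anonymous"},  # no sort string: fall back to the display name
]

def sort_key(record):
    """Use the sort string when present, otherwise the display name."""
    return record.get("sort_name", record["name"]).casefold()

# Order by the sort string, but keep showing the full display name.
ordered = [r["name"] for r in sorted(composers, key=sort_key)]
print(ordered)
```

The display string is never rewritten; only the ordering changes, which is
why two separate tags are needed.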

If you want something more precise, you have to link to unique 
identifiers for a particular artist, like a musicbrainz id. I
gather that's not what you're interested in here?

> Finally complex relationships are even harder to
> handle such as specifying a resource's relationship to the rest
> of a collection.

Stepping back a bit, there are three levels of metadata models.

1) The first is just untyped attributes. This includes folksonomy[1]
   tag systems like flickr tags, as well as older systems like 
   "keywords" or "PACS numbers"[2] used for subject indexing by 
   some scientific communities.

   "Beethoven", "Moonlight", "Piano"

2) The second level adds typing to the attributes. This covers all 
   (key, value) pair schemes, including Vorbis comments, EXIF and 
   PNG metadata, and XML attributes.

   Composer="Beethoven", Title="Moonlight Sonata"

3) The third level is what is usually called the RDF model, where
   the metadata is described by a graph. The nodes are items, and
   the labelled edges describe relationships between nodes.

   Audio title is Moonlight Sonata.
   Moonlight Sonata composed by Ludwig van Beethoven.
   Ludwig van Beethoven has a short name Beethoven.
   Moonlight Sonata was composed in 1801.
   Audio performed by Arthur Rubinstein.
   Arthur Rubinstein born in 1887.
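
To make the difference between the three levels concrete, here is a
minimal Python sketch using the same example data. The predicate names
("composed_by", "short_name", and so on) and the helper functions are
made up for illustration; they are not taken from any RDF vocabulary or
library.

```python
# Level 1: untyped attributes, a bare set of tags.
tags = {"Beethoven", "Moonlight", "Piano"}

# Level 2: typed attributes, i.e. (key, value) pairs.
pairs = {"Composer": "Beethoven", "Title": "Moonlight Sonata"}

# Level 3: a graph stored as (subject, predicate, object) triples.
# Predicate names are invented for this sketch.
triples = [
    ("Audio", "title", "Moonlight Sonata"),
    ("Moonlight Sonata", "composed_by", "Ludwig van Beethoven"),
    ("Ludwig van Beethoven", "short_name", "Beethoven"),
    ("Moonlight Sonata", "composed_in", "1801"),
    ("Audio", "performed_by", "Arthur Rubinstein"),
    ("Arthur Rubinstein", "born_in", "1887"),
]

def objects(subject, predicate):
    """Return every object linked from subject by the given predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

def composer_short_name(item):
    """Follow title -> composed_by -> short_name across the graph."""
    names = []
    for title in objects(item, "title"):
        for composer in objects(title, "composed_by"):
            names.extend(objects(composer, "short_name"))
    return names

print(composer_short_name("Audio"))
```

Levels 1 and 2 can only say things about the audio itself. The graph can
also say things about the things it mentions (the composer's short name,
the performer's birth year), at the cost of needing a traversal to answer
a question.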

I think we need to decide which of these models is implied by our
requirements. Once we know what we have to encode, it will be easier
to settle the encoding issues.

There are other axes, such as whether the categories are ad hoc, as in 
flickr tags, or reference an absolute collection like musicbrainz ids.
