[vorbis-dev] Metadata streams

rillian at telus.net rillian at telus.net
Fri Apr 13 15:05:30 PDT 2001

On Thursday, April 12, 2001, at 06:30 , Martin C. Martin wrote:

> On Wed, 11 Apr 2001 rillian at telus.net wrote:
>> What sorts of metadata applications are you interested in? I've tried to
>> collect what requirements I can, but most people don't have much to
>> say...
> Our project is making a mod for Unreal Tournament to take out the
> shooting and turn it into a 3D virtual nightclub.  Think of it as chat
> with live audio, dancing, powerups, etc.  Or think of it as a social MUD
> with a 3D interface and music.
> The dancing will be similar to Dance Dance Revolution, Parappa the
> Rapper, Space Channel 5, etc.  Basically, you need to press certain
> buttons on the beats of the music.  So, we need to know where the bars
> begin and end.  Extracting that from music is an unsolved problem (in
> general), so it'll have to be extracted beforehand, with a user's help,
> and stored in the file.  We'll also need to know which parts are chorus,
> verse, and bridges without a beat.
> In addition, we might add some sort of lighting information, controls
> for a scripted band, special effects, a stage show, etc.  Since the
> types of info may evolve, it needs to all be backward compatible, and
> there could potentially be a large number of very-low-datarate "streams"
> here.
> Our plan is to have two logical streams, one for the vorbis data, one
> for everything else.  In the "everything else" stream, the first four
> bytes of every packet will be a "type" field.  That way, we don't need a
> separate stream_state for every type (a hassle), and we only need to
> incur one page header for all the extra info combined, not every
> individual type.
> Is that the best way now?  Will that still be the best way in Vorbis
> 2.0?

As long as you use the ogg framing for your substreams, everything's 
fine. We don't intend to be prescriptive about what kind of data can go 
in ogg substreams...it's a general mechanism.

'Best way' is subject to interpretation, though. :) Part of my xml 
proposal was that we abstract the timecode labeling so all sorts of 
synchronized data could share a fair bit of infrastructure (that's 
pretty much the point of xml). We'd try to develop some standard schemas 
for popular uses, and you could write your own for specific purposes. 
Parsing a binary format may be more efficient for something like your 
game, though.
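
For illustration, here is a minimal sketch in C of the type-tagged packet 
layout you describe, where the first four bytes of each packet in the 
"everything else" stream name its type. The tags "BEAT" and "LITE", the 
function names and the return codes are all hypothetical, made up for this 
example; nothing here is part of any Ogg or Vorbis spec. The resulting 
bytes would simply be the packet body you hand to the ogg framing layer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for this sketch only. */
enum { PKT_MALFORMED = -1, PKT_SKIPPED = 0, PKT_BEAT = 1, PKT_LIGHT = 2 };

/* Prepend a 4-byte type tag to a payload.
 * Returns the total packet length, or 0 if `out` is too small. */
static size_t tag_packet(const char *type, const unsigned char *payload,
                         size_t len, unsigned char *out, size_t cap)
{
    assert(type != NULL && out != NULL);
    if (cap < 4 || len > cap - 4)
        return 0;
    memcpy(out, type, 4);              /* type field: first four bytes */
    if (len > 0)
        memcpy(out + 4, payload, len); /* payload follows the tag */
    return len + 4;
}

/* Look at the type field of a received packet.
 * Unknown types are reported as skipped rather than as errors, so an
 * older player can ignore packet types added later. */
static int classify_packet(const unsigned char *pkt, size_t len)
{
    if (len < 4)
        return PKT_MALFORMED;          /* too short to hold a tag */
    if (memcmp(pkt, "BEAT", 4) == 0)   /* assumed tag for beat markup */
        return PKT_BEAT;
    if (memcmp(pkt, "LITE", 4) == 0)   /* assumed tag for lighting cues */
        return PKT_LIGHT;
    return PKT_SKIPPED;
}
```

A reader would call classify_packet() on each packet pulled from the 
second logical stream and dispatch on the result. Skipping unknown tags 
is one simple way to get the backward compatibility you mention.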

I'd envisioned breaking things up along general lines, like one 
substream per language for lyrics, one for lighting cues, one for beat 
markup. But that kind of modularity is only important if you want a 
broad base of players to support your format.

BTW, my 'transcript' proposal for lyrics and subtitles supports song 
structure markup, but not beat/note-level stuff. The rough draft is here:


My $0.02

