[vorbis] xml stream formats

Mon Jul 10 17:41:08 PDT 2000

On Mon, 10 Jul 2000 robert at moon.eorbit.net wrote:

> XML (RDF?) at the beginning of the stream that covers all the time-coded
> metadata for the entire stream?

My idea was continuous XML broken up into packets and interleaved with the
other data. The Ogg bitstream routines take care of all that for us. This
generally means the xml must be parsed progressively; I don't actually
know if existing libraries do that. In practice, the text data is so small
relative to audio/video data that we can just squirt it in at the
beginning if that's easier.

> > A "chunk" format with external timecode information. The chunks may or may
> > not include xml or another kind of markup. (I'd go with plain utf-8 text
> > here, myself.)
> 
> And these chunks would be interleaved into the actual stream so that
> they get transmitted as the stream gets transmitted?

Yes. This also makes it less complicated to "splice" in the middle of an
ogg stream since each chunk would be self-contained.

> If I understand your summaries correctly, I really like the latter one
> for its flexibility, but I really don't like having to come up with a
> completely separate infrastructure for them.

As I see it, the second is the less flexible. There's not much point in
adding infrastructure for the chunk option. The point is for it to be
trivial to implement. Each chunk would basically be:

struct chunk { int length; int starttime; int endtime; char text[]; /* length - 2*sizeof(int) bytes */ };

You read those into a linked list, maybe sort/sanitycheck the timecodes,
and print out the text at the appropriate time. If we want formatting,
additional structure, or anything else more complicated, we might as well
build on xml.
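
To make that concrete, here is a rough sketch of the reader in C. Nothing
here is a settled format. It assumes the three header fields are 32-bit
big-endian integers, that length counts the two timecodes plus the text but
not the length field itself, and that a chunk covers the half-open interval
from starttime up to endtime. The names chunk_parse, chunk_insert, chunk_at
and chunk_free are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One text chunk. Assumed wire layout (not a settled format):
   u32 length | u32 starttime | u32 endtime | text, all big-endian,
   where length = 8 + number of text bytes. */
typedef struct chunk {
    uint32_t start, end;   /* timecodes; units are left unspecified */
    char *text;            /* NUL-terminated copy of the UTF-8 text */
    struct chunk *next;
} chunk;

static uint32_t read_u32(const unsigned char *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse one chunk from buf. Returns the number of bytes consumed, or 0 if
   the buffer is truncated or the chunk fails the sanity check. */
size_t chunk_parse(const unsigned char *buf, size_t avail, chunk **out) {
    assert(buf != NULL && out != NULL);
    if (avail < 12) return 0;                    /* header incomplete */
    uint32_t length = read_u32(buf);
    if (length < 8 || length > avail - 4) return 0;
    uint32_t start = read_u32(buf + 4), end = read_u32(buf + 8);
    if (end < start) return 0;                   /* timecode sanity check */
    size_t tlen = length - 8;
    chunk *c = malloc(sizeof *c);
    if (!c) return 0;
    c->text = malloc(tlen + 1);
    if (!c->text) { free(c); return 0; }
    memcpy(c->text, buf + 12, tlen);
    c->text[tlen] = '\0';
    c->start = start;
    c->end = end;
    c->next = NULL;
    *out = c;
    return 4 + (size_t)length;
}

/* Insert c into the list, keeping the list sorted by start time. */
void chunk_insert(chunk **head, chunk *c) {
    while (*head && (*head)->start <= c->start) head = &(*head)->next;
    c->next = *head;
    *head = c;
}

/* Return the first chunk whose interval [start, end) contains time t. */
const chunk *chunk_at(const chunk *head, uint32_t t) {
    for (; head; head = head->next)
        if (head->start <= t && t < head->end) return head;
    return NULL;
}

/* Release every chunk in the list. */
void chunk_free(chunk *head) {
    while (head) {
        chunk *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}
```

Because the insert keeps the list ordered, chunks that arrive out of order
still come out sorted, and the player only has to call chunk_at with the
current time to find the text to print.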

> > I've also been assured that by making some reasonable limitations and
> > assuming a particular dtd/schema we can write a parser for our RDF
> > metadata in one or two thousand lines of C. I don't think there would be a
> > significant space issue in that case.
> 
> It can be done in far less than 2000 lines of C. However, that approach
> opens a gigantic can of worms as far as compatibility is concerned. An
> RDF parser has the same requirements as an XML parser, and any parser we
> could create in a few hundred/thousand lines of C would be incompatible
> with XML.

Sorry, I don't understand your point here. How are the complexity-reducing
limitations I suggested a problem? (assuming they're upwards compatible to
full RDF?)

> > [IMDB]

> I don't think they make the data available -- it would be nice if they
> did. It's just a large amount of work that needs to get duplicated if you
> can't get the data...

ftp://ftp.imdb.com/pub/interfaces/

Unfortunately, the license permits only verbatim distribution of the
flat files over the internet. We'd need special permission to use their
data in something like cdindex. 

Of course, as you suggest, we can duplicate the work under a different 
license.

You can read their license here: http://us.imdb.com/Copyright

There's also a history document. For extra credit, compare and contrast
with cddb.org. At least they say going commercial was a difficult
decision. http://us.imdb.com/Help/Oweek/history

Cheers,
 -r

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.