[vorbis] xml stream formats

robert at moon.eorbit.net
Tue Jul 11 11:36:26 PDT 2000



On 10 Jul, Ralph Giles wrote:
> On Mon, 10 Jul 2000 robert at moon.eorbit.net wrote:
> 
>> XML (RDF?) at the beginning of the stream that covers all the time-coded
>> metadata for the entire stream?
> 
> My idea was continuous XML broken up into packets and interleaved with the
> other data. The Ogg bitstream routines take care of all that for us. This
> generally means the xml must be parsed progressively; I don't actually
> know if existing libraries do that. In practice, the text data is so small
> relative to audio/video data that we can just squirt it in at the
> beginning if that's easier.

Both expat and libxml support progressive parsing -- no problem there.
However, this approach has a bit more overhead than I care for:

1) The parser object needs to stick around for the duration of playing
 back an ogg stream. XML parsers are not trivial pieces of code to have
 hanging around -- the end result is a memory footprint that is larger
 than necessary.
2) XML/RDF markup for a section of time-coded data introduces more data
 overhead. This is small when compared to the ogg stream itself, but
 markup that takes up about as much space as the metadata it wraps
 (around 100% overhead) is tough to swallow.
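
To make the progressive parsing point concrete, here is a minimal sketch
using Python's standard binding to expat. The element and attribute names
(meta, cue, t) and the packet boundaries are made up for illustration;
nothing here is a proposed ogg format. Note that the parser object has to
stay alive between packets, which is the footprint concern in 1).

```python
import xml.parsers.expat


def parse_progressively(packets):
    """Feed XML to expat one packet at a time and collect the events seen."""
    events = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = (
        lambda name, attrs: events.append(("start", name, dict(attrs))))
    parser.CharacterDataHandler = (
        lambda text: events.append(("text", text)))

    # The same parser object is reused for every packet; it carries the
    # state for tags and attributes that are split across packets.
    for packet in packets:
        parser.Parse(packet, False)
    parser.Parse(b"", True)  # tell expat the document is complete
    return events


# Hypothetical metadata cut into three packets at arbitrary byte offsets.
packets = [b'<meta><cue t="0', b'.0">Intro</cu', b'e></meta>']
events = parse_progressively(packets)
```

Character data may arrive in more than one callback, so a real consumer
would join the text pieces before using them.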

>> > A "chunk" format with external timecode information. The chunks may or may
>> > not include xml or another kind of markup. (I'd go with plain utf-8 text
>> > here, myself.)
>> 
>> And these chunks would be interleaved into the actual stream so that
>> they get transmitted as the stream gets transmitted?
> 
> Yes. This also makes it less complicated to "splice" in the middle of an
> ogg stream since each chunk would be self-contained.

Ok, makes sense. What if the header XML/RDF specifies the format and
context for the chunks? The RDF could specify what kind of time coded
information is in the chunks, how big the chunks are, where to find
time-stamp info, etc. This should add very little parsing overhead
during playback.
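
As a rough sketch of what I mean: if the header declared a fixed chunk
layout once, reading each chunk would need no XML parsing at all. The
layout below (8-byte timestamp in milliseconds, 2-byte payload length,
UTF-8 text) and the function names are purely hypothetical, not anything
we have agreed on.

```python
import struct

# Hypothetical layout that the header XML/RDF would describe once:
# big-endian 64-bit timestamp (ms), 16-bit payload length, then UTF-8 text.
CHUNK_HEADER = struct.Struct(">QH")


def pack_chunk(time_ms, text):
    """Build one self-contained time-coded chunk."""
    payload = text.encode("utf-8")
    return CHUNK_HEADER.pack(time_ms, len(payload)) + payload


def unpack_chunk(data):
    """Read the timestamp and text back out of a chunk."""
    time_ms, length = CHUNK_HEADER.unpack_from(data, 0)
    start = CHUNK_HEADER.size
    return time_ms, data[start:start + length].decode("utf-8")


chunk = pack_chunk(61500, "Chorus")
decoded = unpack_chunk(chunk)
```

Each chunk carries its own timestamp and length, so a splice in the
middle of the stream does not depend on earlier chunks.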

>> It can be done in far less than 2000 lines of C. However, that approach
>> opens a gigantic can of worms as far as compatibility is concerned. An
>> RDF parser has the same requirements as an XML parser, and any parser we
>> could create in a few hundred/thousand lines of C would be incompatible
>> with XML.
> 
> Sorry, I don't understand your point here. How are the complexity-reducing
> limitations I suggested a problem? (assuming they're upwards compatible to
> full RDF?)

Any parser that we write in 2000 lines of C will not really be XML/RDF
compatible -- any complexity reductions will throw us out of full XML
compliance. And anything less than full compliance is not really
acceptable. Let me throw out an example here:

1) We create a 2000 line wonder XML/RDF parser and it becomes the
 'standard' for ogg implementations. One of the complexity reducing
 limitations we introduce is supporting only UTF-8.
2) Company XYZ creates some tool that works with ogg metadata, but they
 do not use our parser. Instead they choose to use a fully compliant XML
 parser.
3) They kick out files that are completely compliant with XML, but let's
 say the XML data is encoded in UTF-16, rather than UTF-8.
4) Existing ogg players cannot play these streams because our parser
 cannot handle UTF-16.
5) The community will consider the files created by XYZ company to be
 non-compliant, because standard ogg implementations choke on it.
 However, XYZ company did nothing wrong...
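
Steps 3) and 4) can be sketched in a few lines of Python. The UTF-8-only
function is just a stand-in for a cut-down parser, and the document and
element names are invented for the example; the compliant path hands the
raw bytes to expat (through ElementTree) and lets it work out the encoding.

```python
import xml.etree.ElementTree as ET


def utf8_only_title(data):
    """Stand-in for a reduced parser that assumes its input is UTF-8."""
    text = data.decode("utf-8")  # raises UnicodeDecodeError on UTF-16 input
    return ET.fromstring(text).findtext("title")


def compliant_title(data):
    """Pass raw bytes to a full parser, which detects the encoding itself."""
    return ET.fromstring(data).findtext("title")


# A perfectly legal XML document, as company XYZ's tool might write it.
doc = '<?xml version="1.0" encoding="UTF-16"?><meta><title>Song</title></meta>'
utf16_bytes = doc.encode("utf-16")  # Python prepends the byte order mark

try:
    utf8_only_title(utf16_bytes)
    reduced_parser_failed = False
except UnicodeDecodeError:
    reduced_parser_failed = True

title = compliant_title(utf16_bytes)
```

The same bytes are rejected by the reduced reader and accepted by the
compliant one, which is exactly the situation XYZ's users would land in.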

Unless we can be 100% compliant with any of the standards that we use,
we shouldn't use them. So the question then becomes this: is XML too
large/heavy/whatever for ogg? 

> ftp://ftp.imdb.com/pub/interfaces/
> 
> Unfortunately, the license permits only verbatim distribution of the
> flat files over the internet. We'd need special permission to use their
> data in something like cdindex. 

I doubt that will happen -- we would be in direct competition with them,
and their little disclaimer is to prevent competition in the first
place.

> Of course, as you suggest, we can duplicate the work under a different 
> license.

I'm never fond of duplicating work. :-(

--ruaok         Freezerburn! All else is only icing. -- Soul Coughing

Robert Kaye -- robert at moon.eorbit.net  http://moon.eorbit.net/~robert

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.


