[vorbis] xml stream formats

Michael Smith msmith at labyrinth.net.au
Thu Jul 6 05:02:00 PDT 2000



>> Let me say this again: You CANNOT, in general, stream XML. 
>
>I think what Monty meant here is that the metadata has to be stuck in a
>logical bitstream and optionally sent over a net connection. For static
>metadata, it would be ok to stick it all at the beginning of the
>file/stream and wait until it's all across before parsing.

Yep, that's fine, I understood that. However, if you want to be able to
stream the metadata over a net connection (which is what Monty was saying
was necessary), you cannot use XML. You can use something like XML, but not
quite (I'll elaborate on why this is BADBADBADBAD below). You can also use
a very strictly defined DTD and then have a pretty hacky approach to
sending pseudo-headers and 'footers'. I don't like this approach, but it's
the ONLY way to stream XML.

>
>> There are two ways in which you can do it: either make the DTD you're using
>> VERY limited, and then do some really ugly horrid hacky stuff at the start
>> and end, or do something that ISN'T quite XML, though it might be close.
>> 
>> The second solution is definitely unacceptable.
>
>Can you be more specific here?

Basically, if we were to use something similar to XML, but not XML, we
would have to do two things: NOT call it XML, and write a parser from
scratch for what we want. The latter is too much to expect of player
authors.

>
>> People, please think about this really hard before deciding that XML (or
>> anything XML based) is the way to go. It's a neat technology, but it's NOT
>> suited to everything. I rather think that this is one of the cases where it
>> isn't suitable. 
>
>I feel like we may be having pronoun trouble here. I at least, haven't
>been expecting to be able to stream xml with an arbitrary DTD. As I
>understand it, Ogg bitstreams have a head, a body, and a tail. What's
>wrong with constructing the DTD for the so-called timecoded data so that
>it neatly divides into those same three sections and parsing it
>incrementally?

Well, it's not possible to do it quite like that. What you can do is have a
very simple DTD, which allows no tag nesting beyond the second level. The
first level is the root element around the entire contents of the XML
'file'. It is required, but it essentially exists only to convince the XML
parser that we're sending it XML. You can't just tell the XML parser to
ignore this: an XML parser is REQUIRED to stop parsing at a syntactical
error. You then send the start of the file separately when a client
connects (or whatever), and the end of the file separately at the end of
the stream.
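
To make the pseudo-header/footer idea concrete, here is a minimal sketch,
assuming Python's standard xml.etree.ElementTree.XMLPullParser as the
incremental parser. The <stream> root and the <lyric> elements with a "t"
attribute are made-up names for illustration only, not part of any
proposed DTD.

```python
import xml.etree.ElementTree as ET


def stream_elements(chunks):
    """Yield (tag, attributes, text) for each completed second-level element.

    `chunks` is any iterable of text fragments. The first fragment must
    open the root element (the pseudo-header) and the last must close it
    (the pseudo-footer), otherwise the parser reports an error.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0

    def drain():
        nonlocal depth
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                if depth == 1:  # a direct child of the root just closed
                    yield elem.tag, dict(elem.attrib), elem.text

    for chunk in chunks:
        parser.feed(chunk)
        yield from drain()
    parser.close()  # raises ET.ParseError if the footer never arrived
    yield from drain()


# Hypothetical wire data: header, two timecoded items, footer.
chunks = [
    "<stream>",                      # pseudo-header, sent on connect
    '<lyric t="0.0">Hello</lyric>',
    '<lyric t="2.5">wor',            # items may be split across packets
    "ld</lyric>",
    "</stream>",                     # pseudo-footer, sent at end of stream
]
for item in stream_elements(chunks):
    print(item)
```

Dropping the final "</stream>" chunk makes parser.close() raise
ET.ParseError, and sending the <lyric> elements with no root at all fails
as soon as the second one is seen. That is the constraint described above:
the wrapper element carries no information, but the parser insists on it.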

>
>If the DTD-subspace limitation means it's not really xml, that's fine. The
>main thing that's attractive to me is being able to mark up the text, and
>build on the familiarity of the <tag> model. I think our design
>requirement is just that the encoder needs to be able to split the
>timecoded xml in such a way that it can be (efficiently) parsed in time
>for synchronized display. What am I missing here?

No, we can design a very limited DTD and use that. That's perfectly legal.
However, it WOULD have to be very simple, or we lose streamability. At that
point, the tags are simply named delimiters between sections of text. This
isn't, in the sense you meant, 'marking up the text'. The <tag> model is
still there, but crippled to the point of uselessness. 

So, basically we're then at the point where we have to bring a full XML
parser into any player that wishes to use the metadata (they aren't huge,
but a full XML parser is surprisingly complex, and correspondingly large),
yet we aren't doing ANY of the stuff that XML is actually useful for.

Seriously think about this, then tell me what advantages this has over a
binary (or even textual, it doesn't really matter) format naming a type of
content (this is the 'tag'), a length of data, and then the actual data.
There might be some - I cannot think of any myself, however. Note that this
solution also makes including binary data (if we ever wanted it) MUCH
easier - CDATA sections aren't a whole lot of fun.
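
As a rough sketch of what such a type/length/data format could look like,
assuming a made-up layout of a 4-byte type tag followed by a 32-bit
big-endian length (neither is specified anywhere in this thread), a reader
that copes with data arriving in arbitrary pieces is only a few lines:

```python
import struct

# Assumed chunk header: 4-byte type tag, then unsigned 32-bit payload length.
HEADER = struct.Struct(">4sI")


def pack_chunk(tag: bytes, payload: bytes) -> bytes:
    """Serialize one chunk as tag + length + payload."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    return HEADER.pack(tag, len(payload)) + payload


class ChunkReader:
    """Accumulates incoming bytes and returns chunks as they complete."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes):
        self._buf += data
        complete = []
        while len(self._buf) >= HEADER.size:
            tag, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # payload not fully received yet
            complete.append((tag, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return complete


# Hypothetical tags: a text item and a binary item with XML-hostile bytes.
wire = pack_chunk(b"LYRC", b"Hello") + pack_chunk(b"IMG ", b"\x00\xff<>&")
reader = ChunkReader()
print(reader.feed(wire[:10]))   # first chunk is still incomplete
print(reader.feed(wire[10:]))   # both chunks are now available
```

The payload is opaque bytes, so nothing needs escaping, and a reader can
join the stream at any chunk boundary without a wrapper element.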

The more I think about this, the more I see XML as a 'buzzword solution'
that doesn't really do what we want to do. Sure, we can force it into being
usable for our purposes, but it isn't a natural fit. I may well be missing
something, and I'd certainly appreciate having any such things pointed out.

Michael



