[vorbis] xml stream formats

Ralph Giles giles at snow.ashlu.bc.ca
Wed Jul 5 16:03:33 PDT 2000



On Wed, 5 Jul 2000 robert at moon.eorbit.net wrote:

> Sorry for not responding sooner -- my team (FreeAmp) got laid off a
> couple weeks back and I've been trying to salvage what's left.

Sorry to hear that.

>      >We also talked about streaming issues a bit. Jack explained that
>      >is going to (does?) insert the three pages of the vorbis header when a new 
>      >client connects in the middle of a song. This is necessary to set up the 
>      >decoder, but we get the comment page more-or-less for free. Something 
>      >similar would have to be done with the timecoded xml streams, since 
>      >well-formed xml has a header, and there will probably be a small amount of 
>      >metadata associated with each stream: language, who translated it, 
> 
>      Ouch. Reading this made me remember something else that I hadn't thought of 
>      in this context previously - it is NOT possible to stream well-formed XML, 
>      in general. By limiting yourself in certain ways, you can get away with 
>      just sending the start of the 'file' (as you've suggested here), then 
>      streaming - but then you have some subset of XML, rather than XML. 
> 
>      Maybe we have to go back and think about this - is XML really what we need? 
>      In fact, if we have separate streams for most stuff, XML really isn't the 
>      most suitable solution, since it's intrinsically tree-structured. If we 
>      have separate streams, isn't each one going to be basically just a sequence 
>      of (whatever)? A lyrics stream might have a series of lines, each keyed to 
>      a time, for example. 
> 
> I just checked out RDF (Resource Description Framework) which intends to
> describe resources available on the net. I could see us using RDF as the
> format inside the metadata stream, instead of using a full blown XML
> DTD. I haven't used RDF -- Jack, it looks like you've used it. Do you
> have any feedback on using RDF? 

I don't have much experience with RDF, but looking at the buzzword list,
it sounds like it fits at least our stream-identification metadata
requirement. I'll go read the spec and see what I think about the rest.

http://www.w3.org/RDF/

> In any case, using a complete RDF chunk to describe the time-coded
> information seems like a great amount of overhead. Both XML and RDF seem
> like inappropriate tools for use with time-coded information that is
> in-line with the stream. However, XML and RDF are the best tools that I
> can think of for maintaining the non-time-coded-metadata information.

I agree about the non-time-coded metadata. The problem I'm having with not
using it for time-coded data is that I've not seen a better alternative.
I think it's reasonable to constrain the dtd/schema/stream to the point
where you don't need a generalized parser, but still retain a lot of
flexibility in terms of possible text markup and (compatible) future
revisions.

As a counterexample, if I didn't want to use xmlish, I could take a page
from the vorbis comment header and define a stream type consisting of a
series of length-encoded vectors, each with a presentation start and
presentation stop timestamp (as unsigned 32 or 64-bit integers) and a
string of characters. I'd stick this after a header that (a) identifies
this as a time-encoded text stream (b) gives a ticks-to-realtime-seconds
conversion rate for the timestamps and (c) includes some comment metadata
exactly like the vorbis stream.
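
To make that concrete, here is a rough Python sketch of such a stream. None
of this comes from a spec: the "timetext" magic string, the field widths
(64-bit timestamps, 32-bit lengths), the little-endian byte order and the
function names are all assumptions made up for illustration.

```python
import struct

MAGIC = b"timetext"  # hypothetical stream identifier, not from any spec


def pack_header(ticks_per_second, comments):
    """Header: magic, ticks-per-second rate, then length-prefixed comments."""
    out = [MAGIC, struct.pack("<II", ticks_per_second, len(comments))]
    for comment in comments:
        data = comment.encode("utf-8")
        out.append(struct.pack("<I", len(data)) + data)
    return b"".join(out)


def pack_entry(start, stop, text):
    """One entry: start tick, stop tick, byte length, UTF-8 text."""
    data = text.encode("utf-8")
    return struct.pack("<QQI", start, stop, len(data)) + data


def unpack_stream(buf):
    """Return (rate, comments, entries) from a complete in-memory buffer."""
    if buf[:len(MAGIC)] != MAGIC:
        raise ValueError("not a time-encoded text stream")
    pos = len(MAGIC)
    rate, count = struct.unpack_from("<II", buf, pos)
    pos += 8
    comments = []
    for _ in range(count):
        (n,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        comments.append(buf[pos:pos + n].decode("utf-8"))
        pos += n
    entries = []
    while pos < len(buf):
        start, stop, n = struct.unpack_from("<QQI", buf, pos)
        pos += 20  # 8 + 8 + 4 bytes, no padding with "<"
        text = buf[pos:pos + n].decode("utf-8")
        pos += n
        # Convert ticks to seconds using the rate from the header.
        entries.append((start / rate, stop / rate, text))
    return rate, comments, entries
```

Packing a header with pack_header(1000, ["LANGUAGE=en"]) followed by a few
pack_entry() calls and feeding the result to unpack_stream() gives back the
comments and a list of (start_seconds, stop_seconds, text) tuples.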

Certainly this is easier to parse, taking a screen of code instead of
several pages. But you've also given up the ability to mark the text up in
any way. What about style changes, like <emphasis> or <strong> or
<foreignphrase>? What if my ogg stream is a public reading of a novel? I'd
sure like to keep the original formatting from the text. How do I insert
scene numbers, or script pagenumbers? I think it's worth the effort to
keep markup capabilities.
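
For comparison, a constrained xmlish line with inline markup might be handled
like the sketch below. The <line> element and its start/stop attributes are
hypothetical names invented for this example, not part of any proposed dtd,
and a general-purpose parser from the Python standard library stands in for
whatever restricted parser we would actually write.

```python
import xml.etree.ElementTree as ET


def parse_line(fragment):
    """Parse one hypothetical <line start=".." stop=".."> element."""
    elem = ET.fromstring(fragment)
    start = float(elem.get("start"))
    stop = float(elem.get("stop"))
    # Plain text for players that ignore markup.
    plain = "".join(elem.itertext())
    # (tag, text) pairs for players that want to render the markup.
    marked = [(child.tag, "".join(child.itertext()))
              for child in elem.iter() if child is not elem]
    return start, stop, plain, marked


fragment = ('<line start="12.0" stop="15.5">'
            'It was a <emphasis>dark</emphasis> and stormy night.</line>')
start, stop, plain, marked = parse_line(fragment)
```

A simple player can display only the plain string, while a fancier one can
use the tag list to apply styles, which is the flexibility the binary format
gives up.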

You could imagine allowing markup in the text, but binary representation
for the timecodes, but we can already use the page boundaries to do much
the same thing. And I think the space difference of a text-based encoding
isn't significant compared to the audio and video data. (and if it were,
we could compress it)

> In any case, to answer Ralph's original question, the current TrackInfo
> dtd does not take video into account, but it should be easy to extend
> for use with video as well. But, it sounds as if we need to answer a few
> other questions before we delve into the details of creating the overall
> metadata solution.

Agreed. I was mostly asking for your position in terms of designing the
next iteration of the cdindex db to handle more-than-cd metadata. 

Thanks for your response,
 -ralph


--
giles at ashlu.bc.ca
I went and took an nap and when I came back, my simulation had finished an
impossible number of iterations.




