[vorbis] xml stream formats

Sun Jun 18 02:07:05 PDT 2000

On Sun, 18 Jun 2000, Michael Smith wrote:

> 
> >We also talked about streaming issues a bit. Jack explained that icecast
> >is going to (does?) insert the three pages of the vorbis header when a new
> >client connects in the middle of a song. This is necessary to set up the
> >decoder, but we get the comment page more-or-less for free. Something
> >similar would have to be done with the timecoded xml streams, since
> >well-formed xml has a header, and there will probably be a small amount of
> >metadata associated with each stream: language, who translated it,
> 
> Ouch. Reading this made me remember something else that I hadn't thought of
> in this context previously - it is NOT possible to stream well-formed XML,
> in general. By limiting yourself in certain ways, you can get away with
> just sending the start of the 'file' (as you've suggested here), then
> streaming - but then you have some subset of XML, rather than XML. 

Depends on the DTD, doesn't it? But yes, you'd have to start streaming at
a point outside any nesting. I'd been implicitly assuming we'd only break
pages there anyway.
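
To make the mid-stream join concrete, here is a minimal Python sketch. The
element names (<stream>, <event start="...">), the header contents and the
function names are assumptions made up for illustration, not an agreed
format. A client that connects late gets the header first, then only whole
top-level events, so it never starts inside a nested element.

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical stream header: XML declaration plus an open root element
# carrying the per-stream metadata (here just a language).
HEADER = '<?xml version="1.0"?>\n<stream lang="en">\n'


def event_xml(start, text):
    """Serialize one non-nested event keyed to a start time in seconds."""
    return '<event start=%s>%s</event>\n' % (
        quoteattr('%.2f' % start), escape(text))


def join_midstream(events, now):
    """Return what a client connecting at time `now` would be sent.

    `events` is a list of (start_seconds, text) pairs. The header is
    replayed, then every event that starts at or after `now`.
    """
    body = ''.join(event_xml(s, t) for s, t in events if s >= now)
    return HEADER + body
```

Note that the root element is never closed while the stream is live, so
what the client sees only becomes a well-formed document once the stream
ends.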

> Maybe we have to go back and think about this - is XML really what we need?
> In fact, if we have separate streams for most stuff, XML really isn't the
> most suitable solution, since it's intrinsically tree-structured. If we
> have separate streams, isn't each one going to be basically just a sequence
> of (whatever)? A lyrics stream might have a series of lines, each keyed to
> a time, for example.

I guess I like XML because it's a familiar model, very easy to edit by
hand, and very flexible. Didn't someone once say that the only reason
everybody thinks XML is tree-based is because that's the DOM data structure
used in most of the parser libraries? I was thinking about a linear
time series, not a (hierarchically) structured document.

I think if we disallow nesting of the <event> tags, we'll be fine. The
question is, do we lose anything by that? The only occasions where I can
see nesting being useful are: karaoke, where you want to display a line at
a time but highlight each word; subtitles for an instructional language
video, for the same reason; and music markup, where you might want to nest
beat marks inside chord changes inside "chorus" and "bridge" groupings.
The traditional SGML way of dealing with this seems to be
"<event1>foo<event2>bar buz</event2>baf</event1>" but I've always found
that ugly.
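
As a rough illustration of reading such a stream linearly rather than
through a DOM tree, here is a sketch using Python's event-driven xml.sax
parser. The <event start="..."> layout and the handler and function names
are assumptions for the example only. The handler collects events as they
arrive and refuses nested <event> tags.

```python
import xml.sax


class FlatEventHandler(xml.sax.ContentHandler):
    """Collect (start_seconds, text) pairs from non-nested <event> tags."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._start = None   # start time of the event currently open
        self._text = []

    def startElement(self, name, attrs):
        if name != 'event':
            return
        if self._start is not None:
            raise ValueError('nested <event> is not allowed')
        self._start = float(attrs['start'])
        self._text = []

    def characters(self, content):
        if self._start is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == 'event':
            self.events.append((self._start, ''.join(self._text)))
            self._start = None


def parse_chunks(chunks):
    """Feed chunks to the parser as they arrive; return events seen so far.

    The parser is deliberately never closed, because a live stream has no
    closing root tag yet.
    """
    handler = FlatEventHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)
    return handler.events
```

Nothing here ever builds a tree: each event is handed over as soon as its
closing tag has been parsed, which is the linear time series model.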

I'm open to alternative proposals. Byte vectors (like in the vorbis
comment field) each with a length and a presentation timestamp would be
even less flexible.
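
For comparison, the byte vector alternative could look roughly like the
sketch below. The record layout (a 64-bit timestamp in milliseconds
followed by a 32-bit little-endian length, then the payload) and the
function names are assumptions chosen for the example; only the idea of
length-prefixed vectors comes from the vorbis comment field.

```python
import struct

# Assumed record header: 64-bit presentation timestamp in milliseconds,
# then a 32-bit payload length, both little-endian and unpadded.
_RECORD_HEADER = struct.Struct('<QI')


def pack_record(timestamp_ms, payload):
    """Serialize one timestamped byte vector."""
    return _RECORD_HEADER.pack(timestamp_ms, len(payload)) + payload


def unpack_records(buf):
    """Split a buffer of back-to-back records into (timestamp, payload)."""
    records = []
    offset = 0
    while offset < len(buf):
        timestamp_ms, length = _RECORD_HEADER.unpack_from(buf, offset)
        offset += _RECORD_HEADER.size
        records.append((timestamp_ms, buf[offset:offset + length]))
        offset += length
    return records
```

This is trivial to stream and to join mid-way, but the payload is opaque:
any structure inside it has to be defined somewhere else.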

> Am I missing something? Is there a way around this?
> 
> Michael
