[vorbis] Split Ogg Files

Tue Aug 7 11:06:50 PDT 2001

Michael Smith wrote:

> 
> Actually, that's exactly the right way to do it. Some players might treat
> it a bit strangely (time display, seeking, etc. might be a bit 'weird' in
> some players, though most of them are fine), but they should all play it
> correctly.

This is something I've just been wondering about. I'm playing around
with using the ogg_* and vorbis_* functions directly instead of ov_*,
and I'm not sure how to handle multiple streams. Right now, it's
probably safe to ignore streams entirely, and just get pages out of
ogg_sync and push them all up to the vorbis decoder. But, in the
future, there might be multiple concurrent audio streams in a file,
right? Even now, we might see chained streams in a single file.

Is there any consensus on what the pseudocode _should_ look like for a
player? As far as whether to check serialnos of streams, which ones to
play, etc.?

I had been planning on checking the serialno of the first logical
bitstream in the Ogg file, and dropping pages which belong to any other
bitstreams. That way the codec won't fall over the first time it sees a
multimedia Ogg file. To handle chaining, should the codec just start
playing the first logical bitstream which follows the current
bitstream? But that can cause problems in other places.
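
The serialno check described above can be sketched without libogg at
all. What follows is a minimal Python sketch, assuming the whole file is
already in memory as bytes. It reads the fixed 27-byte page header from
the Ogg framing layout and keeps only pages whose serial number matches
the first page seen. The names iter_pages, pages_for_first_stream and
make_page are made up for illustration, and the sketch does not verify
page CRCs, which a real player would want to do.

```python
import struct

# Fixed part of an Ogg page header (27 bytes, little-endian):
# capture pattern, version, header type, granule position,
# serial number, page sequence number, CRC, segment count.
PAGE_HEADER = struct.Struct("<4sBBqIIIB")


def iter_pages(data):
    """Yield (serialno, header_type, granulepos, body) for each page."""
    pos = 0
    while pos + PAGE_HEADER.size <= len(data):
        magic, _ver, htype, granule, serial, _seq, _crc, nsegs = (
            PAGE_HEADER.unpack_from(data, pos))
        if magic != b"OggS":
            pos += 1  # out of sync: scan forward for the next capture pattern
            continue
        table = pos + PAGE_HEADER.size
        lacing = data[table:table + nsegs]
        body = table + nsegs
        end = body + sum(lacing)
        if len(lacing) < nsegs or end > len(data):
            break  # truncated page at the end of the buffer
        yield serial, htype, granule, data[body:end]
        pos = end


def pages_for_first_stream(data):
    """Keep only pages whose serialno matches the first page in the file."""
    wanted = None
    for page in iter_pages(data):
        if wanted is None:
            wanted = page[0]
        if page[0] == wanted:
            yield page


def make_page(serial, htype, granule, body):
    """Hypothetical helper: build a page for the demo (CRC left at zero)."""
    lacing = [255] * (len(body) // 255) + [len(body) % 255]
    header = PAGE_HEADER.pack(b"OggS", 0, htype, granule, serial, 0, 0,
                              len(lacing))
    return header + bytes(lacing) + body


if __name__ == "__main__":
    data = (make_page(1, 0x02, 0, b"audio header")
            + make_page(2, 0x02, 0, b"some other stream")
            + make_page(1, 0x00, 4096, b"audio data"))
    for serial, htype, granule, body in pages_for_first_stream(data):
        print(serial, granule, body)
```

Note that this only covers the "drop everything else" half of the
question. It says nothing about what to do when the chosen stream ends
and a chained one begins.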

For example, I'm working on a player which will cross-fade between
songs when it's playing in 'shuffle' mode. To do this, it needs to look
at the current location and length of the playing song. When it gets
near the end, it queues up the next song and begins fading it in. If
the first song keeps playing, and in fact has its length suddenly
change, or plays for long past its length, or has its playing time
suddenly run backwards, my player is likely to get confused.

I think it will be necessary for a codec to distinguish between a
seekable (and hence finite) datasource such as a local file, and a
non-seekable (and potentially infinite) datasource such as an icecast
stream.

For a non-seekable stream, it seems pretty straightforward. There
isn't too much you can do but just play each Vorbis stream as it
comes along. Our pseudocode might look like this:

While (! EOF(physical bitstream))
 -Look for the start of a logical Vorbis stream
 -Play pages from that stream as long as it lasts, ignoring
  any pages from other logical bitstreams that might be there.
 -Always report the current time in terms of the granulepos
  field of the current stream
 -Don't report, or make up, a duration value since we can't know
  when the stream will end (unless we can get this from metadata
  somewhere)
 -When the logical stream ends, start looking for another Vorbis
  stream to play
For a seekable stream, there are many more choices. They basically all
revolve around whether we should handle things like the duration,
current time, and seeking in terms of the current logical bitstream, or
the whole physical bitstream. It's certainly possible to find all
chained bitstreams at the beginning of playback and sum up their
durations, then keep a running tally of the current time. But such
efforts will be I/O intensive, since the entire file will have to be
scanned in order to determine the length. To do otherwise, though,
means that the current time and duration values are worthless for
anything except displaying to the user. Trying to use them to schedule
a cross-fade or lay out a CD-R for recording will result in all sorts
of problems.
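
To make the "sum up the durations and keep a running tally" idea
concrete, here is a small Python sketch. It assumes the expensive scan
has already been done and has produced, for each chained link, the final
granulepos and the sample rate. It also assumes each link's granulepos
starts at zero, which need not hold for every file. The function names
are made up for illustration.

```python
def link_durations(links):
    """Seconds per link, given (final_granulepos, sample_rate) pairs."""
    return [samples / rate for samples, rate in links]


def total_duration(links):
    """Duration of the whole physical bitstream, in seconds."""
    return sum(link_durations(links))


def global_time(links, index, granulepos):
    """Running time in the whole file while playing link `index`."""
    before = sum(link_durations(links[:index]))
    return before + granulepos / links[index][1]


if __name__ == "__main__":
    # Two chained links with different sample rates.
    links = [(441000, 44100), (240000, 48000)]
    print(total_duration(links))
    print(global_time(links, 1, 96000))
```

With values like these, a cross-fader can schedule against a length
that does not change when playback crosses a link boundary.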

Should seeking work within the whole physical bitstream, or only within
the current logical bitstream? Both are possible, but the logical
bitstream option will be much more efficient. On the other hand, only
allowing seeking within a logical bitstream will mean that you can't
skip to parts of a song which are contained in other chained
bitstreams. This would result in weird 8-Track-like limitations in
songs.
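
If seeking is done across the whole physical bitstream, the player
first has to turn a position in the whole file into a link plus an
offset inside that link. A minimal Python sketch of that mapping
follows, assuming the per-link durations in seconds are already known.
The name locate is hypothetical, and the actual seek within the chosen
link is left out.

```python
from bisect import bisect_right
from itertools import accumulate


def locate(durations, t):
    """Map time `t` in the whole file to (link index, seconds into link)."""
    if not durations:
        raise ValueError("no links to seek in")
    ends = list(accumulate(durations))  # end time of each link
    t = min(max(t, 0.0), ends[-1])      # clamp to the file
    # First link whose end lies after t; the last link also owns its end.
    index = min(bisect_right(ends, t), len(durations) - 1)
    start = ends[index] - durations[index]
    return index, t - start


if __name__ == "__main__":
    durations = [10.0, 5.0, 20.0]
    print(locate(durations, 12.0))
```

Seeking within the current logical bitstream only is the special case
where the index is never allowed to change.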

I'll be honest in that I haven't looked too closely at what the other
codecs are doing, and perhaps I'm making a mountain out of a molehill,
but whether I handle seeking/duration/time etc. in terms of a single
logical bitstream or a whole physical bitstream will have a significant
impact on the design of the codec I'm working on. It would be nice to
know what the "right" way is before I get too far into it.

I think what I'm looking for is a consensus on:
A) chained bitstreams should be an expected type of file, and
   should be well-supported. Duration, playing time, seeking, etc.
   should all work as though the chained streams were a single
   stream.

B) chained bitstreams should be an expected type of file, but
   duration, seeking, etc. should only be applied within the
   current logical bitstream. This may cause problems with some
   applications.

C) chained bitstreams are not a well-supported type of file, and
   it's OK to mishandle them by (for example) only playing the
   first logical stream and ending. This of course would not apply
   to a non-seekable stream.

Am I out in left field here? Opinion appreciated.

-David Mitchell

> 
> Michael
>
