[vorbis-dev] streaming metadata requirements

Jack Moffitt jack at icecast.org
Tue Jun 12 07:48:10 PDT 2001



> - In the case of commercials mentioned earlier, it's more logical to
> "paste" the once-encoded commercials than encode them every time (unless
> you crossfade them).

I think you've missed a bit of the details.  I'll clarify.

There are basically two types of streams you can create currently.

SLT - Simulated Live Transport - this kind of stream is where you
concatenate a lot of preencoded files and send them in realtime, as if
you were generating them live.

Live - live streaming is where you encode one stream, on the fly, in
realtime.  A real radio show, broadcast from line-in on your soundcard,
etc.

Now, let's look at the Live case first.

Say you're broadcasting a DJ set live.  It's 3 hours long.  Now you have
someone next to the DJ who is entering the track information into the
stream as it happens.  Currently there is no way to do this without
ending the current stream, sending new codebooks and new headers, and
then starting the stream again.

This isn't so bad, except that if the data entry is delayed, the stream
boundaries won't have any real significance.  But the slight delay in
data is probably perfectly fine.

Not to mention that we've wasted a bit of bandwidth that was never
really necessary.  And before everyone tells me this is nothing, it can
add up.  We're in unicast, remember :)

Now for SLT.

This is the normal operation that most people use.  They have a bunch of
OGGs lying around and put them on random.  They all get concatenated
and sent in realtime.  Between each OGG is a sizeable set of codebooks,
plus the comments.  For the most part this meets a good portion of title
streaming needs.
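The SLT loop described above can be sketched roughly as follows.  This
is a toy illustration, not icecast internals: the constant-bitrate
pacing and the chunk size are assumptions, and a real server would pace
off the Ogg granule positions instead.

```python
import time

def stream_slt(ogg_paths, bitrate_bps=64_000, chunk_bytes=4096):
    """Concatenate pre-encoded Ogg files and pace them out in realtime.

    Each file carries its own headers and codebooks, so every file
    boundary resends them -- exactly the overhead discussed here.
    """
    chunk_seconds = chunk_bytes * 8 / bitrate_bps
    for path in ogg_paths:
        with open(path, "rb") as f:
            while chunk := f.read(chunk_bytes):
                yield chunk                # headers + codebooks included
                time.sleep(chunk_seconds)  # crude realtime pacing
```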

The problem with the commercials is that they are tiny: 15 seconds long,
maybe.  At low bitrates, they are similar in size to the codebooks
themselves.  Or at least the codebooks are a _significant_ portion of
the OGG.  It's also not unlikely to have 4+ commercials in a row, 15 or
30 seconds long.  So a low-bitrate listener is probably going to get
disconnected, because there's no way to pull the codebooks that fast
without pausing the audio.
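A quick back-of-envelope check of that claim.  The numbers below (a
24 kbps stream, ~4 KB of identification/comment/setup headers) are
illustrative assumptions, not measurements; actual header sizes vary per
encode.

```python
def header_overhead(bitrate_bps, clip_seconds, header_bytes):
    """Fraction of a clip's total bytes taken up by the setup headers."""
    audio_bytes = bitrate_bps * clip_seconds / 8
    return header_bytes / (audio_bytes + header_bytes)

# Assumed numbers: a 15 s commercial at 24 kbps with ~4 KB of headers.
frac = header_overhead(24_000, 15, 4_096)
```

With these assumptions the headers are on the order of 8% of every
commercial sent, and four commercials in a row means pulling that
overhead four times back to back.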

The second problem is that we might want to send some more data that
isn't appropriate for the comment headers, such as a URL that is
relevant.

So in live streaming, we want to be able to resend comments without
stream initialization.  And I think some metadata format would be good
for this.

For SLT, we really need some way to say "cache the last set of books"
or we'll have to reencode the whole thing.  If we reencode the whole
thing, we again need the metadata spec :)
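For concreteness, an in-band metadata update could be framed as a small
self-describing packet interleaved into the stream.  Everything here is
hypothetical: the `META` marker, the length-prefixed framing, and the
key/value encoding are made up for illustration and are not part of any
Ogg or Vorbis spec.

```python
import json
import struct

MAGIC = b"META"  # hypothetical marker; no such packet exists in Vorbis I

def pack_metadata(fields: dict) -> bytes:
    """Frame a metadata update (title, URL, ...) for mid-stream delivery."""
    body = json.dumps(fields).encode("utf-8")
    return MAGIC + struct.pack(">I", len(body)) + body

def unpack_metadata(packet: bytes) -> dict:
    """Recover the fields from a framed metadata packet."""
    assert packet[:4] == MAGIC
    (length,) = struct.unpack(">I", packet[4:8])
    return json.loads(packet[8:8 + length].decode("utf-8"))
```

The point is only that such an update is a few dozen bytes, versus
kilobytes of codebooks for a full stream restart.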

> The streams should be separate, on the disk at least.  I think the
> streaming bandwidth waste can be solved by adding a sort of "codebook
> reuse compression" at the streaming layer that restores the chain parts to
> be independent on the receiving side, for either decoding or saving the
> stream.  The server would compare every new codebook to the (remembered)
> previous one and tell the client "insert here the last codebook you
> remember" instead of resending it.
...
> I don't think it should be obligatory for a decoder to support this - only
> if it informs the server that it's capable of it.  
...
> Better yet, define this as a generic extension to the Ogg layer (which is
> anyway intended for streaming): mark some pages/packets/byte ranges (we
> don't know what parts of Tarkin's and other future codecs' data will be
> likely to repeat) as "remember me" and add "include remembered part X
> here" notation.  

I think this idea is fine, but it is unclear whether this is in Monty's
vision for Ogg or not.  This has been discussed often, although probably
never with this kind of detail.  The last time Monty and I talked about
this was BEFORE I had started testing vorbis streaming, and certainly
before I had written most of icecast2 :)
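A minimal sketch of the quoted "remember me / insert remembered part X"
idea, for discussion.  The message tags and the digest-as-reference
scheme are my own assumptions about how it might work, not a proposal
for actual wire format.

```python
import hashlib

class CodebookCache:
    """Server side: send each codebook blob once, then a short reference."""

    def __init__(self):
        self.sent = set()

    def encode(self, codebook: bytes) -> bytes:
        digest = hashlib.sha1(codebook).digest()
        if digest in self.sent:
            return b"REF" + digest    # "insert remembered part X here"
        self.sent.add(digest)
        return b"RAW" + codebook      # "remember me"

class CodebookClient:
    """Client side: remember raw blobs, resolve references to them."""

    def __init__(self):
        self.remembered = {}

    def decode(self, msg: bytes) -> bytes:
        if msg[:3] == b"REF":
            return self.remembered[msg[3:]]
        blob = msg[3:]
        self.remembered[hashlib.sha1(blob).digest()] = blob
        return blob
```

As the quoted mail says, a client would have to advertise this
capability, since a decoder that doesn't cache sees only opaque
references the second time around.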

In either case, I'm a little hesitant to break up a live stream into
separate streams, given the codebook overhead, and for SLT streams I
think that overhead is just too big in some cases.

Sometimes I can avoid codebook changes by using a metadata stream,
because what I really want is metadata, not a new stream :)  Other
times, the codebook overhead is just going to be a given, metadata
stream or not.

In the normal cases though, resending codebooks hasn't been a big
problem.  Although if you're paying bandwidth bills for a few hundred
thousand listeners per month with an average listening time of 90
minutes (a medium-sized operation), the codebooks can add up.
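Rough arithmetic for that scenario, using the figures from the paragraph
above plus two assumed ones (average track length and per-boundary
header size), which are illustrative only:

```python
def monthly_codebook_bytes(sessions, minutes_per_session,
                           minutes_per_track, codebook_bytes):
    """Rough monthly overhead from resending codebooks at track boundaries."""
    tracks_per_session = minutes_per_session / minutes_per_track
    return sessions * tracks_per_session * codebook_bytes

# 200k listener-sessions/month and 90 min average are from the text;
# 4 min tracks and ~4 KB of headers per boundary are assumptions.
total = monthly_codebook_bytes(200_000, 90, 4, 4_096)
```

Under these assumptions that's on the order of 18 GB a month spent
purely on resent codebooks, which in 2001 unicast bandwidth terms is not
nothing.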

Flames, comments?

jack.

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.


