[vorbis-dev] Re: RT & codebooks [was streaming metadata requirements]

Wed Jun 13 05:01:19 PDT 2001

On Tue, 12 Jun 2001, Kevin Marks wrote:

> At 8:48 am -0600 12/6/01, Jack Moffitt wrote:
> >SLT - Simulated Live Transport - this kind of stream is where you
> >concatenate a lot of preencoded files, and send them in realtime as if
> >you were generating them live.
>
> [...]
>
> >This is the normal operation that most people use.  They have a bunch of
> >OGGs lying around, and then put it on random.  They all get concatenated
> >and sent in realtime.  Between each OGG is a sizeable set of codebooks,
> >and the comments.  For the most part this meets a good portion of title
> >streaming needs.
> >
> >The problem with the commercials is, they are tiny.  15 seconds long
> >maybe.  At low bitrates, they are similarly sized to the codebooks
> >themselves.  Or at least the codebooks are a _significant_ portion of
> >the OGG.  It's also not unlikely to have 4+ commercials in a row, 15 or
> >30 seconds long.  So for a low bitrate person, you're probably going to
> >get disconnected, because there's no way you can pull the codebooks that
> >fast without pausing the audio.
>
Consider somebody doing a lot of sample-precise editing (vcut -> cat), for
example to cut out commercials from a live stream ;-).  He will end up
with a file containing just as many headers.  There is more than one way
to create the same problem :-)
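
To put a rough number on "the codebooks are a significant portion of the
OGG", here is a back-of-the-envelope sketch.  The header size and bitrate
used in the example call are made-up illustrative values, not measurements
of any real encoder, and the function name is my own.

```python
def header_overhead_fraction(header_bytes, bitrate_bps, duration_s):
    """Fraction of a segment's total bytes taken up by its headers.

    header_bytes: size of the codebooks + comments (hypothetical value)
    bitrate_bps:  nominal audio bitrate in bits per second
    duration_s:   length of the segment in seconds
    """
    audio_bytes = bitrate_bps * duration_s / 8
    return header_bytes / (header_bytes + audio_bytes)


# Hypothetical numbers: a 15-second clip at 16 kbit/s carries 30000 bytes
# of audio, so headers of the same size would make up half of the segment.
fraction = header_overhead_fraction(30000, 16000, 15)
```

The same headers in front of a 4-minute song at the same bitrate are only
a few percent, which is why the problem shows up with short commercials
and with heavily cut-and-concatenated files.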

> [...]
>
> >For SLTA, we really need some way to say "cache the last set of books"
> >or we'll have to reencode the whole thing.  If we reencode the whole
> >thing, we again need the metadata spec :)
>
> [...]
>
> >In either case, I'm a little hesitant to break up a live stream into
> >separate streams, considering the codebook overhead for some cases, and
> >for SLT streams, I think the codebook overhead is too big in some cases.
> >
> >Sometimes I can avoid codebook changes by using a metadata stream,
> >because what I really want is metadata, not a new stream :)  Other
> >times, the codebook overhead is just going to be a given, metadata
> >stream or not.
> >
I think the start of a new stream is more of an internal matter of the
format; it shouldn't be tied to a logical change.  Otherwise I can't do
sample-precise editing of a file without altering its logical meaning
(was one stream, became many).  The logical boundaries need metadata
anyway, probably not CHAPTER comments but a streamed metadata format.

> >In the normal cases though, resending codebooks hasn't been a big
> >problem.  Although if you're paying  bandwidth bills on a few hundred
> >thousand listeners per month with an avg. listening time of 90 minutes
> >(medium sized operation), the codebooks can add up.
>
> OK, I've said this before and I'll say it again.
>
> In practice, many vorbis files will be using standard codebooks. In
> previous posts, you have said that optimising codebooks for a file
> gives maybe a 1% advantage.
>
> It makes no sense to add this much overhead to a stream, and it will
> prevent anyone joining a stream in the middle if they have to wait for
> codebooks to come by.
>
It makes a lot of sense for a standalone Ogg file sitting on your disk or
your portable player.  My point is that a file should be self-contained,
as it is now, but a streaming protocol can employ some compression that is
expanded back into a self-contained stream at the receiving end.

> The way JPEG coped with a very similar problem in RTP (sending the
> quantization and Huffman tables) was to set aside a byte in each
> packet to define which one is being used. They then defined 0-127 to
> be predefined ones, and 128-255 to be dynamic ones, and added a way
> to specify the dynamic ones inline at regular intervals.
>
> This makes far more sense than just caching one set. You can define a
> few standard codebooks (by bitrate and content presumably - some
> tuned for classical music, some for metal, some for speech) and give
> them numbers 0-127 in the RTP spec. Others get numbers assigned above
> 128 and are cached by the client or fetched in or out of band as you
> decide.
>
The 128 slots will become outdated after N releases of oggenc with retuned
codebooks, for large enough N.  Consider also Foo Inc.'s encoder, which
might come with different codebooks: Xiph.org will have used up the 128
range by then, and Foo Inc. will be deprived of the ability to standardise
their codebooks.  It seems a bad idea to me to freeze the space of
standard codebooks, now or ever.
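
For reference, the one-byte split described in the quoted message (0-127
predefined, 128-255 dynamic) amounts to something like the sketch below.
The names are mine and purely illustrative; this is not taken from any RTP
payload spec, only from the description above.

```python
# Ranges as described in the quoted message; the constant names are
# hypothetical, chosen only for this illustration.
PREDEFINED_IDS = range(0, 128)    # fixed forever once the spec is published
DYNAMIC_IDS = range(128, 256)     # defined inline by the sender


def classify_table_id(table_id):
    """Say which half of the one-byte ID space a table ID falls into."""
    if table_id in PREDEFINED_IDS:
        return "predefined"
    if table_id in DYNAMIC_IDS:
        return "dynamic"
    raise ValueError("table ID must fit in one byte: %r" % (table_id,))
```

Once all 128 predefined slots are assigned, every later codebook set has
to go through the dynamic half, whoever produced it.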

> when you have a choice of going from 0 to 1 things cached and from 0
> to many, many is about the same work, but provides larger benefits.
>
In my original message (the one Jack replied to) I suggested caching more
than one.  It's quite simple within one connection, where the server can
agree with the client on IDs of codebooks.  For caching across streams,
you need to define global IDs.  I'd use a URL that points to the codebook
as the ID; the actual data can be served out-of-band, in-band by request,
from that URL, or in whatever other way (and be cached by the client to
the degree it sees fit).
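
A minimal sketch of what I have in mind on the client side, assuming short
numeric IDs agreed per connection and a URL as the global ID.  The class
and method names are invented for illustration, and how the bytes actually
arrive (in-band, by request, from the URL) is deliberately left out.

```python
class CodebookCache:
    """Client-side codebook cache keyed by a global ID (here, a URL)."""

    def __init__(self):
        self._by_url = {}        # URL -> header bytes, kept across connections
        self._session_ids = {}   # short per-connection ID -> URL

    def start_connection(self):
        """Forget the short IDs; they only mean something within one connection."""
        self._session_ids = {}

    def announce(self, session_id, url, data=None):
        """Server binds a short ID to a URL, optionally sending the data in-band."""
        self._session_ids[session_id] = url
        if data is not None:
            self._by_url[url] = data

    def lookup(self, session_id):
        """Return the cached codebook bytes, or None if they must be fetched."""
        url = self._session_ids[session_id]
        return self._by_url.get(url)


cache = CodebookCache()
cache.start_connection()
cache.announce(1, "http://example.org/books/a", b"BOOKS-A")  # example URL only
first = cache.lookup(1)      # data was sent in-band, so it is available

cache.start_connection()     # new connection, new short IDs
cache.announce(7, "http://example.org/books/a")
second = cache.lookup(7)     # same URL, served from the cache
```

A client is free to keep `_by_url` as small as it likes; a miss just means
asking for the data again.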

Still, this should have a stream-only lifespan IMHO, and it should only be
employed if the client says it is capable of this (and the server is).
Disk-based players, dumb clients and dumb servers should work unmodified.
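
The fallback rule could look roughly like this on the server side.  It is
only a sketch under my own assumptions: the message tags "ref" and "full"
and the function name are hypothetical, and the capability flags stand for
whatever negotiation the streaming protocol ends up using.

```python
def next_header_message(url, full_headers, sent_this_connection,
                        client_capable, server_capable):
    """Decide what to send at a stream boundary.

    Returns ("ref", url) only when both ends support codebook caching and
    these headers were already sent on this connection; otherwise returns
    ("full", full_headers), which is what a dumb client expects anyway.
    """
    if client_capable and server_capable and url in sent_this_connection:
        return ("ref", url)
    sent_this_connection.add(url)
    return ("full", full_headers)


sent = set()                       # reset for every new connection
url = "http://example.org/books/a" # example URL only
a = next_header_message(url, b"HDR", sent, True, True)    # first time: full
b = next_header_message(url, b"HDR", sent, True, True)    # repeat: reference
c = next_header_message(url, b"HDR", sent, False, True)   # dumb client: full
```

Because the set of sent headers lives only as long as the connection, a
client joining later always gets one full copy first.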

-- 
Beni Cherniavsky <cben at crosswinds.net>
                 (also scben at t2,cben at tx in Technion)
