[xiph-rtp] Chaining

Sat Aug 27 12:15:41 PDT 2005

On Fri, Aug 26, 2005 at 11:27:57PM +0200, Luca Barbato wrote:

> Probably not everybody would like that proposal, but for simplicity's
> sake it should be weighed alongside the others.
> 
> 1 Each vorbis-rtp stream will map to just ONE vorbis stream, which
> means that chaining in rtp isn't allowed at all.
> 
> 2 Once the first chained vorbis ends the server will just reset the
> session parameters, if it uses that model, or completely renegotiate
> the connection.

The server can also just transcode. We expect RTP transmission from a
chained Ogg stream to be something of an edge case. A lot of stations
will be encoding either directly from a live production feed (only
one stream in any case) or from a batch encode, for which ensuring
codebook uniformity isn't such a big issue. That just leaves casual
users with a heterogeneous Ogg collection on disk.
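
For what it's worth, a server can tell whether a file on disk is
chained from the Ogg page headers alone, since every link starts with a
page carrying the beginning-of-stream flag. A rough sketch, assuming an
audio-only file with no multiplexed streams (where several BOS pages at
the start would not mean chaining) and doing no CRC checking; the
function names are made up for illustration:

```python
import struct

BOS = 0x02  # "beginning of stream" bit in the page header_type byte


def iter_pages(data: bytes):
    """Yield (header_type, serial) for each Ogg page in data."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost Ogg page sync at byte %d" % pos)
        header_type = data[pos + 5]
        serial = struct.unpack_from("<I", data, pos + 14)[0]
        nsegs = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + nsegs]
        # Page size = 27-byte fixed header + lacing table + payload.
        pos += 27 + nsegs + sum(lacing)
        yield header_type, serial


def chain_links(data: bytes):
    """Return the serial number of each link, in file order."""
    return [serial for flags, serial in iter_pages(data) if flags & BOS]
```

More than one entry in the result means the file is chained, so the
server would have to transcode it or signal new parameters at each
boundary.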

Can you explain a bit more about how the server would send the new
session parameters? Is it possible to have that work and keep gapless
playback?

> If possible I'd like to have a list of planned uses for vorbis-rtp (and
> theora-rtp), so it could be possible to find out where that simple
> approach won't work and a complex solution is required.

Not supporting chaining was in fact the original suggestion, made 
initially by Jack about a year ago. If we'd done that we could have
been all finished six months ago. :)

To recap:

As I said above, I don't find support of transmission of chained Ogg
streams per se all that compelling. The whole playlist-based icecast
pseudo-stream isn't worth the pain; it's better to solve the issue
by making the source simplify the stream than to have the server 
and decoders support all this complexity.

For me the two persuasive arguments were:

1. Adaptive bitrate switching. In a unicast RTP setting, the server
can use packet loss statistics to dynamically adjust the bitrate
sent to individual clients. In the case of configurable codecs like
Vorbis and Theora, this means being able to change the codebooks,
and even things like samplerate/framerate and image size (though
the player should rescale to avoid popping in the latter case).

Aaron essentially told us this was a requirement for Real, since
it's already a feature with their native codecs (though they
have fixed codebooks, so chaining isn't painful). That's why
we went with chaining support.
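
To make the bitrate-switching case concrete, here is a minimal sketch
of the server-side decision. Everything in it is an assumption for
illustration: the three-step ladder, the loss thresholds, and the names
are not taken from any draft or from Real's implementation. The point
is only that each step down or up the ladder implies a new set of codec
headers, which is exactly a chain boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    name: str          # label for one set of codec headers
    bitrate_kbps: int


# Hypothetical ladder, ordered from lowest to highest bitrate.
LADDER = [Encoding("low", 48), Encoding("mid", 96), Encoding("high", 160)]


def next_level(level, loss_fraction, down_at=0.05, up_at=0.01):
    """Pick the ladder index to use after one loss report."""
    if loss_fraction > down_at:
        return max(level - 1, 0)                # too lossy: step down
    if loss_fraction < up_at:
        return min(level + 1, len(LADDER) - 1)  # clean: step up
    return level                                # in between: hold


def run(reports, level=len(LADDER) - 1):
    """Yield (encoding, headers_changed) for each loss report."""
    for loss in reports:
        new_level = next_level(level, loss)
        changed = new_level != level
        level = new_level
        yield LADDER[level], changed
```

Whenever headers_changed is true, the client needs the new codebooks
before it can decode the next packet, so the payload format has to
carry that switch somehow.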

2. Video resampling is much more expensive and artefact-prone
than audio resampling, so at least in the medium term, it is 
attractive to be able to use chaining in Theora to support
interleaving of, e.g., film and video without having to do format
conversion. This isn't compelling on its own, but makes reason
1 less lonely. :)

Anyway, that was the reasoning behind the decision. I don't
see any particular reason to revisit it unless you have either
a much simpler method to achieve equivalent results, or a
good argument why Real's requirements aren't worth addressing.

 -r