[xiph-rtp] Difficulties with several RTP streams for a Vorbis stream.

Ramón García ramon.garcia.f at gmail.com
Sun Oct 31 09:48:24 PST 2004


DirectShow, GStreamer and Java Media Framework are multimedia
processing systems based on graphs of processors. Each processor
receives frames in one format and/or outputs frames in another format.
These processors communicate through links, so that the input of one
processor is the output of another. The stream of data that goes from
one processor to another is intended to represent an independent
stream, that is, one that can be displayed or listened to by itself.
Some file formats store several streams of data inside one file. In
this case a demuxer demultiplexes the file into several streams.
Conversely, one sometimes wants to store several streams in the same
file (or network stream). In this case a mux multiplexes several
incoming streams into one output stream.

In DirectShow and Java Media Framework, RTP support is composed of two
processors. The first, the RTP receiver, simply gets RTP packets and
outputs them. The second is specific to each payload type and
translates the packets into frames in the format expected by other
processors: for instance, it produces MP3 data in the format expected
by the MP3 decoders of DirectShow.
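
To make the two-processor split concrete, here is a minimal sketch in
Python. It is not the DirectShow or Java Media Framework API: the names
RtpPacket, parse_rtp and PassthroughDepayloader are hypothetical. The
generic stage only splits the fixed RTP header (as laid out in RFC
3550) from the payload; the payload-specific stage is reduced to a
pass-through so that the division of labour is visible.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpPacket:
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Generic stage: split the RTP header from the payload."""
    if len(datagram) < 12:
        raise ValueError("shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    if b0 & 0x30:
        # Padding (0x20) and header extension (0x10) are left out of
        # this sketch rather than handled incorrectly.
        raise ValueError("padding/extension not handled in this sketch")
    csrc_count = b0 & 0x0F
    offset = 12 + 4 * csrc_count  # each CSRC entry is 32 bits
    if len(datagram) < offset:
        raise ValueError("truncated CSRC list")
    return RtpPacket(b1 & 0x7F, seq, ts, ssrc, datagram[offset:])


class PassthroughDepayloader:
    """Payload-specific stage (hypothetical): a real one would rebuild
    codec frames; this one hands the payload on unchanged."""

    def __init__(self, payload_type: int):
        self.payload_type = payload_type

    def push(self, packet: RtpPacket):
        if packet.payload_type != self.payload_type:
            return None  # not ours; another depayloader would take it
        return packet.payload
```

A receiver would call parse_rtp on each datagram and pass the result to
the depayloader registered for that payload type.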

In every case I know of, an RTP session SSRC corresponds to one stream
of the multimedia architecture, because that is the simplest
approach.

Tor-Einar's proposal would require two streams and a mechanism for
linking them; perhaps the SDP file would have an attribute on the data
session that points to the stream of the codebook session. It is
surprising for any multimedia framework that one RTP session contains
data that is necessary for decoding another.

One can get exactly the same effect with fewer surprises for
implementors. The RTP data stream contains packets, and each packet
carries a field that says which codebook is needed to decode it. This
could be a 16-bit checksum of the codebook. Clients are expected to
have codebooks stored on disk. If a client does not have a particular
codebook, it can send a request to the server through an RTCP APP
packet with the identifier of the codebook. In that case, the server
can include the requested codebook in the next RTP data packet.
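
The client side of this scheme can be sketched as follows. Several
details are assumptions of the sketch, not part of the proposal: the
checksum is taken to be the low 16 bits of CRC-32, the identifier is
placed in the first two bytes of the payload in network byte order, and
CodebookCache is a hypothetical in-memory stand-in for the on-disk
store. Sending the RTCP APP request is not shown; the "pending" set
only records which identifiers the client would ask for.

```python
import struct
import zlib


def codebook_id(codebook: bytes) -> int:
    # Assumption: low 16 bits of CRC-32. The proposal only asks for
    # "a 16 bit checksum" and does not name an algorithm.
    return zlib.crc32(codebook) & 0xFFFF


def pack_payload(ident: int, audio: bytes) -> bytes:
    # Assumed layout: 16-bit identifier first, then the audio data.
    return struct.pack("!H", ident) + audio


def unpack_payload(payload: bytes):
    (ident,) = struct.unpack("!H", payload[:2])
    return ident, payload[2:]


class CodebookCache:
    """Hypothetical stand-in for the codebooks a client keeps on disk."""

    def __init__(self):
        self._books = {}
        self.pending = set()  # identifiers to request from the server

    def add(self, codebook: bytes) -> int:
        ident = codebook_id(codebook)
        self._books[ident] = codebook
        self.pending.discard(ident)
        return ident

    def lookup(self, ident: int):
        """Return the codebook, or None after noting it must be requested."""
        book = self._books.get(ident)
        if book is None:
            self.pending.add(ident)
        return book
```

When lookup returns None, the client would hold or drop the packet,
request the identifier, and call add once the server delivers the
codebook in a later data packet.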

Compared to using a URL, this approach is scalable in a multicast
scenario: one codebook delivery is useful for all clients. It adds 16
bits to the payload. Taking into account that the RTP header is at
least 96 bits (64 bits + a 32-bit SSRC), or 128 bits with one CSRC,
this is not significant overhead. Are 16 bits enough? The probability
of collision is approximately

n*(n - 1)/2/2^16

n = number of codebooks.

As the number of codebooks is on the order of 10, that probability is
small enough.
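
As a quick check of the arithmetic, the sketch below evaluates the
formula above and compares it with the exact birthday probability. It
assumes that checksums are uniformly distributed and independent, which
a real checksum only approximates. The function names are made up for
this example.

```python
def approx_collision(n: int, bits: int = 16) -> float:
    """The estimate from the text: n*(n - 1)/2/2^bits."""
    return n * (n - 1) / 2 / 2**bits


def exact_collision(n: int, bits: int = 16) -> float:
    """Exact probability that at least two of n uniform values coincide."""
    space = 2**bits
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (space - i) / space
    return 1.0 - p_distinct


if __name__ == "__main__":
    for n in (10, 20, 50):
        print(n, approx_collision(n), exact_collision(n))
```

For n = 10 the estimate is 45/65536, roughly 0.07%, and the exact value
is marginally smaller.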

