[xiph-rtp] Lots of proposals
Tor-Einar Jarnbjo
Tor-Einar at Jarnbjo.de
Sun Sep 4 16:35:13 PDT 2005
David Barrett wrote:
> Ok, I think we've covered all the bases we're going to cover. I'm
> going to attempt a summary of my position -- Tor, will you please do
> the same? (ie, please don't respond point-by-point to my position,
> just summarize yours)
Ok, let me first explain why I think inline codebook delivery, with or
without client acknowledgement, is one of the worst methods we have
discussed so far:
- You can't make a decent implementation of it for multicast. The only
possibility for inline codebook delivery to work with multicast would be
to continuously transmit the codebook data and hence either waste an
unacceptable amount of bandwidth or introduce an unacceptable delay at
the beginning of a stream while the client waits for a complete codebook
set to be received. Even if multicast transmissions are not commonly
used today, more and more ISPs are at least starting to experiment with
multicast, and it is the only feasible way to avoid bandwidth
bottlenecks as the internet is used more as a transport medium for
audio and video streaming. For unicast scenarios, Ogg/Vorbis over HTTP
is already used quite a lot. As long as the Vorbis codec itself has a
relatively high latency and is not designed for low-latency "real time"
streaming situations, unicast Vorbis over RTP won't bring much advantage
over Ogg/Vorbis/HTTP. I would expect that an RFC for Vorbis over RTP
which only allows unicast would be largely neglected and not very useful.
- Inline transmission with client acknowledgement will not work in
unidirectional network environments. Although such environments are not
very likely for unicast, they will be for multicast, as there may be
situations where the client simply joins an ongoing session without the
server's knowledge.
- Even in unicast situations, the delay when starting a stream may be
unacceptable. The specification does not limit the size of the codebook
header, but even if you do the math on the codebook sizes commonly used
by current encoders, the codebook transmission will take several
seconds at least. The server would have to stream the codebook at the
same rate as the audio stream, potentially letting the client wait
unnecessarily long for the transmission to complete. To stay below the
network MTU, we can assume that a "common size" codebook would be split
into something around 5 RTP packets. In a network with 2% packet loss,
there is a chance of 9.6% that at least one of these packets will not
arrive at the client. Hence, it should at least be mandatory for the
server to send the codebook twice _before_ starting to stream audio at
all, to minimize the chance that the entire stream is undecodable, and
this raises the delay before playback can begin accordingly.
- I would expect most use cases for Vorbis over RTP to be web radios and
music "on demand" services. Designing the RFC so that it only fits
situations where multidirectional streams are required (e.g. the "client"
must also be able to transmit its codebook to the "server") is a major
mistake, as that will probably rarely be needed.
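The packet loss figure in the third point above can be checked with a
few lines of Python. This is only a sketch: it assumes that packet
losses are independent of each other, and the function name and its
parameters are made up for illustration. With 5 packets and 2% loss it
gives roughly 9.6% for a single transmission and roughly 0.2% if the
whole codebook is sent twice.

```python
def p_incomplete(loss_rate: float, packets: int, repeats: int = 1) -> float:
    """Probability that at least one codebook packet never arrives.

    Assumes independent packet loss (an illustrative simplification).
    `repeats` is how many times the complete codebook is transmitted;
    a packet is only missing if every copy of it was lost.
    """
    p_packet_missing = loss_rate ** repeats
    return 1.0 - (1.0 - p_packet_missing) ** packets


if __name__ == "__main__":
    # 5 RTP packets per codebook, 2% packet loss, sent once or twice.
    for repeats in (1, 2):
        print(f"sent {repeats}x: {p_incomplete(0.02, 5, repeats):.2%}")
```

Real networks tend to lose packets in bursts, so the figure for the
repeated transmission is probably optimistic.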
I am by no means strongly advocating any particular alternative, but a
few other reasonable delivery methods have been discussed:
- URI reference to the codebook in the SDP. In this case I would suggest
making HTTP and whatever protocol is used to set up the RTP stream
mandatory, e.g. HTTP and RTSP for an RTSP server, or HTTP and SIP for a
SIP client or registrar. On the server side, I would assume that in
most cases it would be feasible to make use of an existing HTTP server to
support HTTP delivery. On the client side, it would not be much effort
to either implement enough of the HTTP protocol or make use of available
HTTP client libraries to fetch the codebook. If HTTP for some reason is
not feasible, the other protocol may be used.
- Agree on a fixed set of codebooks for RTP. Codebook optimizers have
been shown to save only a few percent of the file size for streams
created by the reference encoder, so I am not really convinced that
dynamic codebooks are very useful. This may of course be because the
actual stream data created by the reference encoder fits the fixed
codebooks well, or vice versa. A drawback of this would be that the
decoder software size increases. I've not had time to check the complete
size of all codebooks used by the reference encoder, but as a
comparison, the static codebooks used by WMA could be stored in around
25kB. As pointed out in a response to my question on this subject on the
AVT mailing list, it would be easily feasible for a transmitter to
re-encode a local "unsupported" Vorbis stream using a supported
codebook.
Tor