[xiph-rtp] Lots of proposals

Fri Sep 2 03:09:06 PDT 2005

David Barrett wrote:

> None taken, but I disagree it's at all unusual: it's the classic 
> VoIP/videoconferencing case.

It is, but as I already pointed out: Vorbis _is not_ suitable for low 
latency use cases like VoIP or video conferencing, so there is no point 
in considering this when writing the RFC. The Vorbis codec itself has a 
relatively high latency; it does not cope well with packet loss, which 
forces a longer receive buffer to compensate for transmission jitter; 
and it is tuned for music, not for speech compression. For the same 
reasons, no one is seriously using MP3, WMA or similar for telephony or 
conferencing. I'm not sure about this, but I expect Theora to have many 
of the same limitations, and I'm not convinced that it's suitable for 
video conferencing either.

> All of it is going P2P (think Skype) using the same NAT/firewall 
> penetration techniques I'm using.  The primary standards are SIP and 
> RTP (not RTSP), and all the networking is UDP.  If Theora and Speex 
> are to be considered for this entire industry, this problem needs 
> solving in one way or another, and the two options you present are 
> cumbersome.

You already realized this yourself, but my suggestions would work 
perfectly with SIP as well as with RTSP. The only requirement for the 
extra RTP stream to work is that the client is able to set up an RTP 
stream and the server is able to transmit one. If they weren't, they 
wouldn't be able to set up the audio stream either, so I can't think of 
any situation where it won't work. In a SIP situation, the client would 
"dial" the main audio stream, getting a reference to the codebook stream 
in the SDP. It could then dial the codebook address and get the codebook 
while, e.g., letting the main address ring.
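
As a rough illustration of the flow described above, here is a minimal 
Python sketch of a client reading an SDP description that announces two 
RTP streams and picking out the codebook one. Everything specific in it 
is an assumption made for illustration: the "x-role" attribute, the 
"x-vorbis-codebook" payload name, the addresses and the port numbers are 
hypothetical and are not defined by any RFC or draft.

```python
# Hypothetical SDP: the "x-role" attribute, the "x-vorbis-codebook"
# payload name and both port numbers are invented for illustration.
EXAMPLE_SDP = """\
v=0
o=- 0 0 IN IP4 192.0.2.1
s=Example Vorbis session
c=IN IP4 192.0.2.1
t=0 0
m=audio 5004 RTP/AVP 96
a=rtpmap:96 vorbis/44100/2
a=x-role:media
m=audio 5006 RTP/AVP 97
a=rtpmap:97 x-vorbis-codebook/44100
a=x-role:codebook
"""


def parse_media_sections(sdp):
    """Return one dict per m= line with its port and a= attributes."""
    sections = []
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            media, port, proto, *formats = line[2:].split()
            sections.append({"media": media, "port": int(port),
                             "proto": proto, "formats": formats,
                             "attrs": {}})
        elif line.startswith("a=") and sections:
            # Attach media-level attributes to the most recent m= line.
            name, _, value = line[2:].partition(":")
            sections[-1]["attrs"][name] = value
    return sections


def find_stream(sections, role):
    """Return the first section whose (hypothetical) x-role matches."""
    for section in sections:
        if section["attrs"].get("x-role") == role:
            return section
    return None


sections = parse_media_sections(EXAMPLE_SDP)
codebook = find_stream(sections, "codebook")
media = find_stream(sections, "media")
print("fetch codebooks from port", codebook["port"])
print("then play media from port", media["port"])
```

The point of the sketch is only that the client needs nothing beyond 
what it already needs for the audio stream: it parses the same SDP and 
sets up one more RTP stream.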

> (I'm not using RTSP, but an equivalent method would work with SIP.) 
> Yes, that would work.  But consider the two options:
>
> 1) Establish two RTP streams, deliver codebooks over one, media over 
> another, kill codebook stream when received.
> 2) Establish one RTP stream, deliver codebooks and media on it, send 
> codebook ACK in RTCP profile when received.
>
> Both "work" in the sense that you can write a program to do it in each 
> way.  Both have comparable performance characteristics.  But neither 
> is plainly superior.  I assume #1 fits your architecture better.  I 
> can guarantee, #2 fits mine.

As I already pointed out, I don't have any architecture to target, and 
I am trying to speak generally and not about a specific use case. OTOH, 
if you look at the purposes for which music streaming is actually being 
used at the moment, you are mainly limited to real-time radio broadcasts 
and specific songs on demand. I at least assume that multicast will 
become more popular in the future; at the moment it is simply not 
supported by most ISPs or in the Internet backbone. The difference I see 
between the two options you describe is that 1 will work for multicast, 
for unicast and without client feedback (and is the only really scalable 
solution discussed so far), while 2 will only work for unicast with 
client feedback. Especially for multicast scenarios, it is important 
that the mechanism works without client feedback.

> I'm merely advocating that we are not smart enough to pick the 
> end-all-be-all solution for codebook delivery.  This discussion alone 
> (not to mention the many similar discussions that have preceded this) 
> is proof enough for me that there are strong opinions and reasonable 
> arguments in favor of competing options.

Yes, if we could just spend time being constructive instead of wasting 
time discussing fringe problems, we might have gotten much further. As 
long as people make definite statements based on "random values" and 
end up with conclusions far off any realistic values (Luca Barbato: 
0.4 vs >8 s for codebook delivery), and as long as you are trying to 
make the mandatory part of the RFC fit a use case for which Vorbis was 
never designed, we won't get anywhere.

> Whether we mandate supporting all, mandate supporting some, or leave 
> everything optional to the developer -- somebody will be dissatisfied. 
> And at the end of the day, I vastly prefer an RFC that errs on the 
> side of mandating too little rather than too much. 

Yes, I may agree on that, but we obviously disagree on exactly what to 
mandate. Mandating only in-band RTP delivery of the codebook makes it, 
IMHO, a completely unusable RFC, and I expect the IETF to share that 
opinion.

Tor