[xiph-rtp] P2P Theora Header delivery; why not SDP?

Tue May 10 21:09:19 PDT 2005

On Tue, May 10, 2005 at 07:58:21PM -0700, David Barrett wrote:
> Hi, I'm a long fan of Theora, and am starting an RTP implementation with:
> 
> http://svn.xiph.org/branches/theora-mmx/doc/draft-kerr-avt-theora-rtp-00.txt

Hi! Always good to have feedback from another implementer.

Note that most of the details of that draft have been superseded. 
Unfortunately no one has fed the new design back into a new draft. 
If you're curious, you can trawl through the recent discussions
on the vorbis mapping; all the same issues apply and we intend to make 
the drafts as similar as possible.

> Am I correct in understanding there are only two ways to deliver
> configuration headers?
> 
> 1) "in-band" using RTP
> 2) "out-of-band" by downloading from a URI specified in the SDP using TCP

Well, "out-of-band" can be however you want, but there are a couple of 
proposals for how to do the TCP reference in the SDP.

> If so, how would you recommend implementing this in a P2P setting in light
> of lossy UDP transmission and NAT piercing?  I see the following
> complications:
> 
> - With respect to #1, UDP is lossy, and RTP has no standard retransmission
> technique.  Furthermore, due to NAT-piercing issues, the initial RTP packets
> have the highest probability of being lost (because your NAT will block my
> RTP packets until you "punch a hole" by sending a packet back to me).

Right, this isn't going to work reliably.

> - With respect to #2, TCP cannot pierce NATs near to the same degree as UDP.
> Thus option #2 limits the range of deployment to those clients between which
> TCP connections can be established.

If you have a p2p infrastructure, can you use that to achieve lossless 
out-of-band transmission? Some sort of send+ack over UDP, or an 
outgoing channel to a non-NAT node like you'd use for file transfer?
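
To make the send+ack idea concrete, here is a minimal stop-and-wait 
sketch. It is only an illustration: the names (send_reliably, Receiver) 
are made up, and the lossy link is simulated in-process rather than 
using real UDP sockets.

```python
import random


def send_reliably(data, chunk_size, lossy_send, max_tries=50):
    """Stop-and-wait: resend each numbered chunk until it is acknowledged."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    for seq, chunk in enumerate(chunks):
        for _ in range(max_tries):
            if lossy_send(seq, chunk):  # True means an ack came back
                break
        else:
            raise TimeoutError("chunk %d never acknowledged" % seq)
    return len(chunks)


class Receiver:
    """Hypothetical peer behind a link that drops packets and acks."""

    def __init__(self, loss_rate, seed=0):
        self.parts = {}
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def deliver(self, seq, chunk):
        if self.rng.random() < self.loss_rate:
            return False  # data packet lost, nothing stored
        self.parts[seq] = chunk  # a duplicate just overwrites itself
        if self.rng.random() < self.loss_rate:
            return False  # ack lost, sender will retransmit
        return True

    def assemble(self):
        return b"".join(self.parts[i] for i in sorted(self.parts))
```

Because chunks are keyed by sequence number, a retransmission after a 
lost ack is harmless; the receiver simply stores the same bytes again.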

> C) Use (B), but pre-populate it with a large library of headers from which
> clients can index in a read-only fashion.  This is better, but only works if
> header-generation is deterministic (ie, headers generated with the same
> settings are exactly the same).  I assume it is -- can anyone confirm this?

It's entirely up to the encoder. The current reference implementation 
uses a fixed setup for all inputs. This is the same as the VP3 decode 
config, so if you control the clients well enough, you could just 
standardize on that, and add other, better fixed general sets as they 
become available.

> In this way, I can just generate my headers locally, and use its CRC32 to
> index into the server's library.

Right. We've abandoned the CRC32 because of the risk of collisions 
causing random failures. The new draft will still have a setup id
in the RTP payload header, but it is only 16 bits. The idea is that this
is an arbitrary mapping to either in-band header packets with the 
same id, or something arranged out-of-band, e.g. with the SDP.
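
On the receiving side, that mapping could be as simple as a table keyed 
by the 16-bit id. A rough sketch follows; SetupTable and its methods are 
hypothetical names, not anything from the draft or from libtheora.

```python
class SetupTable:
    """Maps a 16-bit setup id to the setup header bytes it stands for."""

    def __init__(self):
        self._setups = {}

    def register(self, setup_id, setup_packet):
        # The id is only a label; entries may come from in-band header
        # packets or from something arranged out-of-band.
        if not 0 <= setup_id <= 0xFFFF:
            raise ValueError("setup id must fit in 16 bits")
        self._setups[setup_id] = bytes(setup_packet)

    def lookup(self, setup_id):
        # None means the decoder has not yet received this configuration.
        return self._setups.get(setup_id)
```

A decoder would call lookup() with the id from each payload header and 
hold or drop data packets until the matching setup has been registered.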

So, for example, you could put a longer (MD5 or SHA1) hash of the 
setup packet in the SDP to indicate to the decoder which one you
used, and then hardwire a set into the clients, so it's the only
one used and no one has to fetch anything.
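
As a sketch of how a client might do that lookup: the "setup-sha1" fmtp 
parameter name below is invented for illustration (no draft defines it), 
and the library contents are placeholders.

```python
import hashlib


def setup_digest(setup_packet):
    """Hex SHA-1 of the raw setup header packet."""
    return hashlib.sha1(setup_packet).hexdigest()


def build_library(setup_packets):
    """Index the hardwired setup packets by their digest."""
    return {setup_digest(p): p for p in setup_packets}


def parse_fmtp(fmtp_line):
    """Split 'a=fmtp:98 key=value; key=value' into a dict."""
    _, _, params = fmtp_line.partition(" ")
    out = {}
    for item in params.split(";"):
        key, _, value = item.strip().partition("=")
        if key:
            out[key] = value
    return out


def select_setup(fmtp_line, library):
    """Return the known setup packet named by the SDP, or None."""
    # "setup-sha1" is a hypothetical parameter name.
    digest = parse_fmtp(fmtp_line).get("setup-sha1", "").lower()
    return library.get(digest)
```

A None result means the sender used a setup the client does not have 
built in, so it would have to fall back to fetching the headers.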

I guess this is something you'd want to be able to negotiate, so that in 
the future heterogeneous clients could choose the best common header.
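
One simple way such a negotiation could work, assuming each side lists 
the digests of the setups it knows and the offerer lists them in order 
of preference (choose_common_setup is a made-up name):

```python
def choose_common_setup(offered, supported):
    """Pick the offerer's most preferred setup that the answerer also has.

    offered:   digests in the offerer's preference order, best first
    supported: digests the answerer has built in
    Returns the chosen digest, or None if there is no common setup.
    """
    supported = set(supported)
    for digest in offered:
        if digest in supported:
            return digest
    return None
```

With no common entry the function returns None, and the peers would 
need some other way to deliver the headers.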

> What I would prefer (and actually expected but was surprised not to find)
> would be a third option where the inputs into the header-generation process
> are simply specified in the SDP itself (on the assumption that
> header-generation is deterministic from these, and can be computed locally).

That's not possible, except in the sense described above. The whole 
point is for future encoders to be able to make better choices by 
reconfiguring the decoder. This has been very successful with vorbis.

> For example:
> 
> c=IN IP4/6
> m=video  RTP/AVP 98
> a=rtpmap:98 theora/90000
> a=fmtp:98 sampling=YCbCr-4:2:2; width=1280; height=720;
> header=<URI of configuration header>
> a=theora: <frame_width>x<frame_height>;
> 	<offset_x>,<offset_y>;
> 	<width>x<height>;
> 	<fps_numerator>/<fps_denominator>;
> 	<aspect_numerator>/<aspect_denominator>;
> 	<colorspace>;
> 	<target_bitrate>;
> 	<quality>;

Note that, of the following, keyframe_frequency_force is the only
one that actually appears in the info header; the rest are 
(confusingly) part of the encoder config api.

> 	<dropframes_p>;
> 	<quick_p>;
> 	<keyframe_auto_p>;
> 	<keyframe_frequency>;
> 	<keyframe_frequency_force>;
> 	<keyframe_mindistance>;
> 	<keyframe_data_target_bitrate>;
> 	<keyframe_auto_threshold>;
> 	<noise_sensitivity>

Anyway, something to chew on I hope. I'd like to hear what you think of 
the new draft as an implementor.

Cheers,
 -r