[xiph-rtp] P2P Theora Header delivery; why not SDP?

Tue May 10 21:47:50 PDT 2005

Hi David,

I'll try to address as many of your concerns as I can.

On Tue, May 10, 2005 at 07:58:21PM -0700, David Barrett wrote:
> Hi, I'm a long fan of Theora, and am starting an RTP implementation with:
> 
> http://svn.xiph.org/branches/theora-mmx/doc/draft-kerr-avt-theora-rtp-00.txt
> 
> Am I correct in understanding there are only two ways to deliver
> configuration headers?
> 
> 1) "in-band" using RTP
> 2) "out-of-band" by downloading from a URI specified in the SDP using TCP
> 

Yes. The second method should be reworded to say something along the lines of
"out-of-band delivery". The URI-via-SDP mechanism is really only one way to
do things; there are many other possibilities. All you need is a reliable way
to transmit the ident and codebook information between the two endpoints.
How that is done doesn't really matter. The SDP work that has been discussed
on the list mainly targets the RTSP + RTP use case.

> 
> If so, how would you recommend implementing this in a P2P setting in light
> of lossy UDP transmission and NAT piercing?  I see the following
> complications:
> 
> - With respect to #1, UDP is lossy, and RTP has no standard retransmission
> technique.  

While it is true that there isn't an IETF standard for retransmission yet,
a draft is in the works.

http://www.ietf.org/internet-drafts/draft-ietf-avt-rtp-retransmission-11.txt

You could also use RFC 3611 reports to signal lost packets and have the
sender retransmit them.

> Furthermore, due to NAT-piercing issues, the initial RTP packets
> have the highest probability of being lost (because your NAT will block my
> RTP packets until you "punch a hole" by sending a packet back to me).

What NAT scenario are you intending to support? Can both peers be behind a
NAT? If so, something like STUN is usually used to deal with the NAT
traversal problems. Your description of the NAT traversal problem confuses me.
It is usually the client behind the NAT that "punches the hole". This
should be done BEFORE media is sent so that you don't lose anything. You're
basically telling the NAT how to route packets from the outside to you. If
both peers are behind a NAT you'll need to do something like what STUN does,
because there is no way to know for sure what your port will be on the other
side of the NAT.

> 
> - With respect to #2, TCP cannot pierce NATs near to the same degree as UDP.
> Thus option #2 limits the range of deployment to those clients between which
> TCP connections can be established.

Like I said above, #2 does not necessarily imply TCP. It does imply reliable
delivery between the two endpoints. I'm assuming that you already have some sort
of protocol to communicate between the peers, and that it is reliable in some
form. If that is the case, then you can transmit the codebook and ident info
over that. The URL model that has been discussed on the mailing list basically
just allows you to specify where the codebook and ident info is located. If
both peers are behind NATs then they would likely have to post their codebook
and ident info to a server outside their NATs. Then they could use HTTP to
retrieve each other's info.

One other possibility, if you are using a fixed set of idents and codebooks, is
to use some sort of offer/answer model. The ident info is small enough that
it could be sent as it is. For the codebooks you could just send an MD5 hash
of each codebook. Once the peers agree on the codebook to use, you're done. You
don't need to send the codebook because the negotiation of MD5 hashes told each
peer which codebook to use.
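
To make the offer/answer idea concrete, here is a minimal Python sketch. It
is only an illustration: the function names and the placeholder byte strings
are made up for this example, and real codebooks would be the actual Theora
setup headers. It assumes each peer already holds a fixed set of codebooks.

```python
import hashlib
from typing import List, Optional


def codebook_id(codebook: bytes) -> str:
    """Name a codebook by the MD5 hex digest of its bytes."""
    return hashlib.md5(codebook).hexdigest()


def make_offer(local_codebooks: List[bytes]) -> List[str]:
    """Offer: the hashes of every codebook we hold, in preference order."""
    return [codebook_id(cb) for cb in local_codebooks]


def make_answer(offer: List[str], local_codebooks: List[bytes]) -> Optional[str]:
    """Answer: the first offered hash we also hold, or None if none match."""
    known = {codebook_id(cb) for cb in local_codebooks}
    for digest in offer:
        if digest in known:
            return digest
    return None


# Placeholder byte strings stand in for real codebooks (hypothetical data).
alice_codebooks = [b"codebook-A", b"codebook-B"]
bob_codebooks = [b"codebook-B", b"codebook-C"]

offer = make_offer(alice_codebooks)
answer = make_answer(offer, bob_codebooks)
print("agreed on:", answer)
```

Only the digests cross the wire. If the answer is None, the peers share no
codebook and would have to fall back to actually transferring one.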

> 
> 
> As such, the following semi-compliant techniques come to mind:
> 
> A) Use "in-band" transmission and keep resending the headers until the first
> RTCP packet has been received (effectively using to ACK the session).  The
> spec says "Clients MUST be capable of dealing with periodic re-transmission
> of the configuration headers", so this should work in theory, but it
> certainly isn't intended.

This solution is fine. Reception of an RTCP packet doesn't imply that the
codebook was received though.  You would either need to send RFC 3611
packets so that you could signal what packets have been received, or just
periodically send the info throughout the duration of the stream.
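
As a rough sketch of how those two options could be combined, the Python
below resends the headers quickly until the receiver has explicitly confirmed
them, then drops back to a slow periodic resend for the rest of the stream.
The class name, the intervals, and the idea of a single "acked" flag are all
assumptions made for illustration; how the acknowledgement is actually carried
is left open.

```python
from typing import Optional


class HeaderResendPolicy:
    """Decide when the configuration headers should be sent again."""

    def __init__(self, fast_interval: float = 0.5, slow_interval: float = 10.0):
        self.fast_interval = fast_interval  # seconds between sends before an ack
        self.slow_interval = slow_interval  # seconds between sends after an ack
        self.acked = False
        self.last_sent: Optional[float] = None

    def mark_acked(self) -> None:
        """Call when the receiver has confirmed it got the headers."""
        self.acked = True

    def mark_sent(self, now: float) -> None:
        """Call each time the headers are actually transmitted."""
        self.last_sent = now

    def should_send(self, now: float) -> bool:
        """True if the headers are due for (re)transmission at time `now`."""
        if self.last_sent is None:
            return True  # never sent yet
        interval = self.slow_interval if self.acked else self.fast_interval
        return now - self.last_sent >= interval


policy = HeaderResendPolicy()
for t in (0.0, 0.2, 0.5, 1.0):
    if policy.should_send(t):
        policy.mark_sent(t)
        print("send headers at", t)
```

Note that the ack flag is only set by an explicit confirmation, not by the
arrival of any RTCP packet, for the reason given above.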

> 
> B) Use "out-of-band" transmission with the help of a third party (publish
> the headers to some TCP-enabled third party, and download from there).  This
> adds centralization to an otherwise decentralized problem, along with its
> resultant complications (scalability, authentication, etc.).

This would only be needed in the worst case scenario where both peers were 
behind a NAT and their NATs assigned outside ports based on 
<src IP, src port, dest IP, dest port> tuples. In that case you have to use
a third party anyways. If you use STUN, then you don't have to worry about 
this because it takes care of establishing the link between the peers and
it already establishes an outside third party, the STUN server. 

> 
> C) Use (B), but pre-populate it with a large library of headers from which
> clients can index in a read-only fashion.  This is better, but only works if
> header-generation is deterministic (ie, headers generated with the same
> settings are exactly the same).  I assume it is -- can anyone confirm this?
> In this way, I can just generate my headers locally, and use its CRC32 to
> index into the server's library.

I think it would be good to have a central codebook server. This way you
could have a well known source for codebooks. It could be mirrored and
existing HTTP proxy and caching infrastructure could help with scalability.
Personally I'd use MD5 or SHA for the codebook hash just to be extra sure that
the IDs are unique. 
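
For illustration, a hash-indexed lookup could be sketched in Python like this.
The URL layout and the base URL are purely hypothetical (no such server
exists), and SHA-256 is my choice here as one member of the SHA family. The
useful property is that the client can check what it downloaded against the
key it asked for.

```python
import hashlib


def codebook_key(codebook: bytes) -> str:
    """Key a codebook by the SHA-256 hex digest of its bytes."""
    return hashlib.sha256(codebook).hexdigest()


def codebook_url(base_url: str, codebook: bytes) -> str:
    """Build the (hypothetical) URL under which this codebook would be stored."""
    return f"{base_url.rstrip('/')}/{codebook_key(codebook)}"


def verify_download(expected_key: str, body: bytes) -> bool:
    """Check that a downloaded body really is the codebook that was requested."""
    return codebook_key(body) == expected_key


# Placeholder bytes stand in for a real codebook; the host is made up.
local_codebook = b"placeholder codebook bytes"
url = codebook_url("http://codebooks.example.org/theora/", local_codebook)
print(url)
```

Because the key is derived from the content, mirrors and caches can serve the
data without being trusted: a corrupted or substituted body fails
verify_download().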

> 
> 
> A and C seem like the best option to me so far, but both still rather suck
> (A is unreliable, and C imposes a central solution into the mix).  
> 
> What I would prefer (and actually expected but was surprised not to find)
> would be a third option where the inputs into the header-generation process
> are simply specified in the SDP itself (on the assumption that
> header-generation is deterministic from these, and can be computed locally).

I'm not sure if you are asking whether a finite set of parameters will dictate
what the codebook will be. Right now in the reference code the same codebook
is used no matter what the encoding parameters are. This will likely not
be true in the future. Different encoders may accept different parameters
for encoding and make codebook selections based on different criteria. That
is why the codebook needs to be sent in the first place. It IS the minimal
information needed to tell the other side how the video is encoded.

> For example:
> 
> c=IN IP4/6
> m=video  RTP/AVP 98
> a=rtpmap:98 theora/90000
> a=fmtp:98 sampling=YCbCr-4:2:2; width=1280; height=720;
> header=<URI of configuration header>
> a=theora: <frame_width>x<frame_height>;
> 	<offset_x>,<offset_y>;
> 	<width>x<height>;
> 	<fps_numerator>/<fps_denominator>;
> 	<aspect_numerator>/<aspect_denominator>;
> 	<colorspace>;
> 	<target_bitrate>;
> 	<quality>;
> 	<dropframes_p>;
> 	<quick_p>;
> 	<keyframe_auto_p>;
> 	<keyframe_frequency>;
> 	<keyframe_frequency_force>;
> 	<keyframe_mindistance>;
> 	<keyframe_data_target_bitrate>;
> 	<keyframe_auto_threshold>;
> 	<noise_sensitivity>
> 
> When spelled out in prose it looks like a lot of data, but in practice it'd
> actually look something like:
> 
> c=IN IP4/6
> m=video  RTP/AVP 98
> a=rtpmap:98 theora/90000
> a=fmtp:98 sampling=YCbCr-4:2:2; width=1280; height=720;
> header=<URI of configuration header>
> a=theora: 96x64; 0,0; 96x64; 15/1; 0; 45000; 0; 0; 1; 1; 64; 64; 8; 67500;
> 80; 2
> 
> Is this possible?  Looking back over the [xiph-rtp] list I see a lot of
> discussion about static, cached, and downloadable codebooks, but I don't see
> where SDP is mentioned as an option.  Has this option already been
> considered and discounted?

You could do this for the ident info because that data is small, but the
codebook itself would be too large to put in the SDP. That is why we have been
discussing the downloading mechanisms. If I were to put the ident info in
the SDP I'd just Base64 or hex encode the ident packet. That would make
it more compact and eliminate a translation step.
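
A quick Python sketch of what I mean, with the caveat that the "ident="
parameter name is invented for this example and is not part of the draft, and
that the packet below is arbitrary placeholder bytes rather than a real ident
packet:

```python
import base64
import binascii


def ident_fmtp_line(payload_type: int, ident_packet: bytes,
                    encoding: str = "base64") -> str:
    """Build an a=fmtp line carrying the ident packet as text."""
    if encoding == "base64":
        text = base64.b64encode(ident_packet).decode("ascii")
    elif encoding == "hex":
        text = binascii.hexlify(ident_packet).decode("ascii")
    else:
        raise ValueError("encoding must be 'base64' or 'hex'")
    # "ident" is a hypothetical parameter name used only in this sketch.
    return f"a=fmtp:{payload_type} ident={text}"


def parse_ident(line: str, encoding: str = "base64") -> bytes:
    """Recover the raw ident packet bytes from a line built above."""
    text = line.split("ident=", 1)[1]
    if encoding == "base64":
        return base64.b64decode(text)
    if encoding == "hex":
        return binascii.unhexlify(text)
    raise ValueError("encoding must be 'base64' or 'hex'")


# Arbitrary placeholder bytes, NOT a real Theora ident packet.
packet = bytes(range(42))

b64_line = ident_fmtp_line(98, packet, "base64")
hex_line = ident_fmtp_line(98, packet, "hex")
print(b64_line)
print(len(b64_line), len(hex_line))
```

The receiver gets back the exact packet bytes and can hand them straight to
the decoder, which is the translation step that a field-by-field a=theora
line would otherwise require.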

I hope this provided answers for most of your concerns.

Aaron

> 
> -david