[xiph-rtp] P2P Theora Header delivery; why not SDP?

Tue May 10 22:32:07 PDT 2005

Thanks, you and Aaron have given me a good start.  Perhaps I'm confused on
terminology.  I'll define:

- Codebook: The header data generated by the encoder, and required by the
decoder.  In other words, the sum of the Ogg packets output by:
	theora_encode_header( )
	theora_encode_comment( )
	theora_encode_tables( )

- Codebook parameters: The parameters that the encoder uses to generate the
codebook.  In other words, the input to:
	theora_encode_init( )

I was assuming that for a given set of "codebook parameters", there is
exactly one "codebook".  Thus I proposed that rather than send over the
codebook (which is big), just send the codebook parameters (which are small)
and generate the codebook "just in time" before decoding.

However, I think you're saying this assumption is wrong, and that there are
many possible codebooks from the same set of codebook parameters.  (I.e., the
"codebook parameters" are directly tied to a specific version of the
encoding engine; different versions or different encoders might have
different codebook parameters, or use them in different ways.)  If that's
the case, then my proposal obviously won't work.
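
A toy sketch of the failure mode described above.  Nothing here is libtheora:
toy_generate_headers() and its "v1"/"v2" table sets are hypothetical stand-ins,
assuming only that the setup tables are chosen by the encoder rather than
derived from the parameters.

```python
import hashlib

# Hypothetical table sets; a real encoder chooses its own setup tables.
TOY_TABLES = {
    "v1": bytes(range(64)),
    "v2": bytes(range(64, 128)),
}


def toy_generate_headers(params: dict, encoder_version: str) -> bytes:
    """Stand-in for header generation: parameter fields plus encoder-chosen tables."""
    info = ";".join(f"{key}={params[key]}" for key in sorted(params)).encode()
    return info + b"|" + TOY_TABLES[encoder_version]


params = {"width": 1280, "height": 720, "quality": 32}

h1 = toy_generate_headers(params, "v1")
h1_again = toy_generate_headers(params, "v1")
h2 = toy_generate_headers(params, "v2")

# Same parameters, same encoder: identical headers.
print(hashlib.sha1(h1).hexdigest() == hashlib.sha1(h1_again).hexdigest())  # True
# Same parameters, different encoder: different headers.
print(hashlib.sha1(h1).hexdigest() == hashlib.sha1(h2).hexdigest())  # False
```

In this model a decoder that regenerates headers from the parameters alone
only gets the right bytes if it happens to run the same encoder version.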

Is this correct?

-david

> -----Original Message-----
> From: Ralph Giles [mailto:giles at xiph.org]
> Sent: Tuesday, May 10, 2005 9:09 PM
> To: David Barrett
> Cc: xiph-rtp at xiph.org
> Subject: Re: [xiph-rtp] P2P Theora Header delivery; why not SDP?
> 
> On Tue, May 10, 2005 at 07:58:21PM -0700, David Barrett wrote:
> > Hi, I'm a longtime fan of Theora, and am starting an RTP implementation
> > with:
> >
> > http://svn.xiph.org/branches/theora-mmx/doc/draft-kerr-avt-theora-rtp-00.txt
> 
> Hi! Always good to have feedback from another implementer.
> 
> Note that most of the details of that draft have been superseded.
> Unfortunately no one has fed the new design back into creating a new
> draft. If you're curious, you can troll through the recent discussions
> on the vorbis mapping; all the same issues apply and we intend to make
> the drafts as similar as possible.
> 
> > Am I correct in understanding there are only two ways to deliver
> > configuration headers?
> >
> > 1) "in-band" using RTP
> > 2) "out-of-band" by downloading from a URI specified in the SDP using
> > TCP
> 
> Well, "out-of-band" can be however you want, but there are a couple of
> proposals for how to do the TCP reference in the SDP.
> 
> > If so, how would you recommend implementing this in a P2P setting in
> > light of lossy UDP transmission and NAT piercing?  I see the following
> > complications:
> >
> > - With respect to #1, UDP is lossy, and RTP has no standard
> > retransmission technique.  Furthermore, due to NAT-piercing issues, the
> > initial RTP packets have the highest probability of being lost (because
> > your NAT will block my RTP packets until you "punch a hole" by sending a
> > packet back to me).
> 
> Right, this isn't going to work reliably.
> 
> > - With respect to #2, TCP cannot pierce NATs to nearly the same degree
> > as UDP.  Thus option #2 limits the range of deployment to those clients
> > between which TCP connections can be established.
> 
> If you have a p2p infrastructure, can you use that to achieve lossless
> out-of-band transmission? Some sort of send+ack over UDP, or an
> outgoing channel to a non-NAT node, like you'd use for file transfer?
> 
> > C) Use (B), but pre-populate it with a large library of headers from
> > which clients can index in a read-only fashion.  This is better, but
> > only works if header-generation is deterministic (i.e., headers
> > generated with the same settings are exactly the same).  I assume it
> > is -- can anyone confirm this?
> 
> It's entirely up to the encoder. The current reference implementation
> uses a fixed setup for all inputs. This is the same as the VP3 decode
> config, so if you control the clients well enough, you could just
> standardize on that, and add other, better fixed general sets as they
> become available.
> 
> > In this way, I can just generate my headers locally, and use their
> > CRC32 to index into the server's library.
> 
> Right. We've abandoned the CRC32 because of the risk of collisions
> causing random failures. The new draft will still have a setup id
> in the RTP payload header, but it is only 16 bits. The idea is that
> this is an arbitrary mapping to either in-band header packets with the
> same id, or something arranged out-of-band, e.g. with the SDP.
> 
> So, for example, you could put a longer (MD5 or SHA1) hash of the
> setup packet in the SDP to indicate to the decoder which one you
> used, and then hardwire a set into the clients, so it's the only
> one used and no one has to fetch anything.
> 
> I guess this is something you'd want to be able to negotiate, so that in
> the future heterogeneous clients could choose the best common header.
> 
> > What I would prefer (and actually expected but was surprised not to
> > find) would be a third option where the inputs into the
> > header-generation process are simply specified in the SDP itself (on the
> > assumption that header-generation is deterministic from these, and can
> > be computed locally).
> 
> That's not possible, except in the sense described above. The whole
> point is for future encoders to be able to make better choices by
> reconfiguring the decoder. This has been very successful with vorbis.
> 
> > For example:
> >
> > c=IN IP4/6
> > m=video  RTP/AVP 98
> > a=rtpmap:98 theora/90000
> > a=fmtp:98 sampling=YCbCr-4:2:2; width=1280; height=720;
> > header=<URI of configuration header>
> > a=theora: <frame_width>x<frame_height>;
> > 	<offset_x>,<offset_y>;
> > 	<width>x<height>;
> > 	<fps_numerator>/<fps_denominator>;
> > 	<aspect_numerator>/<aspect_denominator>;
> > 	<colorspace>;
> > 	<target_bitrate>;
> > 	<quality>;
> 
> Note, of the following the keyframe_frequency_force is the only
> one that actually appears in the info header; the rest are
> (confusingly) part of the encoder config api.
> 
> > 	<dropframes_p>;
> > 	<quick_p>;
> > 	<keyframe_auto_p>;
> > 	<keyframe_frequency>;
> > 	<keyframe_frequency_force>;
> > 	<keyframe_mindistance>;
> > 	<keyframe_data_target_bitrate>;
> > 	<keyframe_auto_threshold>;
> > 	<noise_sensitivity>
> 
> Anyway, something to chew on I hope. I'd like to hear what you think of
> the new draft as an implementor.
> 
> Cheers,
>  -r
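
The quoted suggestion (a long hash of the setup packet in the SDP, a hardwired
set of setups in the clients, and a 16-bit setup id in the RTP payload header)
could be sketched roughly as follows.  This is only an illustration under
stated assumptions: SetupLibrary, Session, and their methods are hypothetical
names, SHA-1 is picked from the "MD5 or SHA1" mentioned above, and no actual
draft wire format or SDP syntax is implied.

```python
import hashlib
from typing import Dict, Optional


class SetupLibrary:
    """Hardwired setup headers, indexed by the SHA-1 hex digest of their bytes."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, bytes] = {}

    def add(self, setup: bytes) -> str:
        digest = hashlib.sha1(setup).hexdigest()
        self._by_digest[digest] = setup
        return digest

    def get(self, digest: str) -> Optional[bytes]:
        return self._by_digest.get(digest)


class Session:
    """Per-session mapping from a 16-bit setup id to a setup header."""

    def __init__(self, library: SetupLibrary) -> None:
        self._library = library
        self._by_ident: Dict[int, bytes] = {}

    def bind(self, ident: int, digest: str) -> bool:
        """Bind an id to the setup named by a digest learned out-of-band."""
        if not 0 <= ident <= 0xFFFF:
            raise ValueError("setup id must fit in 16 bits")
        setup = self._library.get(digest)
        if setup is None:
            return False  # unknown setup: would have to be fetched some other way
        self._by_ident[ident] = setup
        return True

    def setup_for(self, ident: int) -> Optional[bytes]:
        return self._by_ident.get(ident)


library = SetupLibrary()
digest = library.add(b"placeholder setup header bytes")

session = Session(library)
print(session.bind(1, digest))  # True
print(session.setup_for(1) == b"placeholder setup header bytes")  # True
print(session.bind(2, "0" * 40))  # False
```

The 16-bit id is only meaningful within a session; the long hash is what
identifies the setup itself, so a collision in the short id space cannot
silently select the wrong tables.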