[xiph-rtp] P2P Theora Header delivery; why not SDP?

Tue May 10 23:54:23 PDT 2005

On Tue, May 10, 2005 at 10:32:07PM -0700, David Barrett wrote:
> Thanks, you and Aaron have given me a good start.  Perhaps I'm confused on
> terminology.  I'll define:
> 
> - Codebook: The header data generated by the encoder, and required by the
> decoder.  In other words, the sum of the Ogg packets output by:
> 	theora_encode_header( )
> 	theora_encode_comment( )
> 	theora_encode_tables( )
> 
> - Codebook parameters: The parameters that the encoder uses to generate the
> codebook.  In other words, the input to:
> 	theora_encode_init( )
> 
> I was assuming that for a given set of "codebook parameters", there is
> exactly one "codebook".  Thus I proposed that rather than send over the
> codebook (which is big), just send the codebook parameters (which are small)
> and generate the codebook "just in time" before decoding.
> 
> However, I think you're saying this assumption is wrong, and that there are
> many possible codebooks from the same set of codebook parameters.  (Ie, the
> "codebook parameters" are directly tied to a specific version of the
> encoding engine; different versions or different encoders might have
> different codebook parameters, or use them in different ways.)  If that's
> the case, then my proposal obviously won't work.
> 
> Is this correct?

That is correct. 

To clarify a bit, there are three standard headers in a Theora 
bitstream. Each is a separate packet. We refer to:

1. the 'info' or 'ident' header, the output of theora_encode_header()
2. the 'comment' or 'metadata' header, the output of theora_encode_comment()
3. the 'setup' or 'codebook' header, the output of theora_encode_tables()

These are sometimes referred to collectively as 'the codebooks', but 
this is imprecise, since only the third actually carries codebooks. The 
spec also allows additional optional headers (like an ICC profile), but 
these must be ignorable and so don't concern us here.
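
For what it's worth, a receiver can tell the three packets apart 
without fully decoding them. A minimal sketch, assuming the framing in 
the Theora spec as I understand it: each header packet begins with a 
type byte (0x80, 0x81, 0x82) followed by the ASCII string "theora", and 
data packets have the high bit of the first byte clear. The enum and 
the function name are hypothetical, not part of libtheora.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; libtheora defines none of these. */
enum theora_packet_kind {
    PKT_INFO,         /* 0x80: 'info' or 'ident' header */
    PKT_COMMENT,      /* 0x81: 'comment' or 'metadata' header */
    PKT_SETUP,        /* 0x82: 'setup' or 'codebook' header */
    PKT_OTHER_HEADER, /* some other header type; must be ignorable */
    PKT_DATA,         /* video data packet */
    PKT_INVALID       /* empty buffer, or header without the "theora" tag */
};

/* Classify a raw packet by looking only at its first seven bytes. */
enum theora_packet_kind classify_packet(const unsigned char *buf, size_t len)
{
    if (buf == NULL || len == 0)
        return PKT_INVALID;   /* nothing to inspect; not classified here */

    if ((buf[0] & 0x80) == 0)
        return PKT_DATA;      /* high bit clear: not a header packet */

    if (len < 7 || memcmp(buf + 1, "theora", 6) != 0)
        return PKT_INVALID;   /* header bit set but the tag is missing */

    switch (buf[0]) {
    case 0x80: return PKT_INFO;
    case 0x81: return PKT_COMMENT;
    case 0x82: return PKT_SETUP;
    default:   return PKT_OTHER_HEADER;
    }
}
```

A transport could use a check like this to decide which packets need 
reliable delivery and which can be dropped or regenerated.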

Of these, 1 and 3 are actually required to properly decode data packets. 
The comment header is required for completeness, but the client can 
construct an empty (or custom) packet if necessary and substitute. None 
of the current implementations actually require it.
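
To illustrate the substitution, here is a rough sketch of how a client 
might build an empty comment packet itself. It assumes the layout from 
my reading of the spec: the byte 0x81, the string "theora", a 32-bit 
vendor string length stored least significant byte first, the vendor 
string, and a 32-bit comment count (zero here), with no trailing 
framing bit. The function name is made up; check the layout against 
the spec before relying on it.

```c
#include <stddef.h>
#include <string.h>

/* Store a 32-bit value least significant byte first. */
static void put_le32(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}

/*
 * Hypothetical helper, not a libtheora call: write a comment header
 * carrying only a vendor string and zero user comments.
 * Returns the packet size in bytes, or 0 if 'out' is too small.
 */
size_t build_empty_comment_header(unsigned char *out, size_t cap,
                                  const char *vendor)
{
    size_t vlen = vendor ? strlen(vendor) : 0;
    size_t need = 7 + 4 + vlen + 4;  /* tag + length + vendor + count */

    if (out == NULL || cap < need)
        return 0;

    out[0] = 0x81;                         /* comment header type byte */
    memcpy(out + 1, "theora", 6);          /* codec tag */
    put_le32(out + 7, (unsigned long)vlen);
    if (vlen > 0)
        memcpy(out + 11, vendor, vlen);
    put_le32(out + 11 + vlen, 0);          /* no user comments */
    return need;
}
```

The result would be handed to the decoder in place of the second header 
packet, between the info and setup headers.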

So only two of the three have to be transmitted reliably. Header 1 is 
what you were thinking of as the input to theora_encode_init(). It is 
very small, and as you and Aaron suggest can be included directly in 
the SDP. The idea of this header is the same as with SDP: to identify 
the stream as Theora, and give the externally interesting parameters 
like frame size and rate. The third header is much larger and contains 
data for the configurable parts of the decoder: quantizers, Huffman 
tables and so on. Each can be changed independently of the other, and 
an encoder or decoder needs both to function properly.

So yes, while the reference implementation uses the same setup header 
for all input, and only varies the info header, other encoders can 
generate different setup headers based not just on the elements of the 
info header, but also on the content itself.

HTH,
 -r