[xiph-rtp] Theora RTP payload format

Ralph Giles giles at xiph.org
Mon Apr 18 13:50:13 PDT 2005


On Mon, Apr 18, 2005 at 03:13:19PM -0400, Steve Kann wrote:

> In one particular use case, (off-line encoding to .ogg files), all this 
> isn't much of a headache. But for use-cases like this, and perhaps for 
> many others, this is quite a headache. For example, If I had all this 
> working with h.263 (or h.264), and I wanted to switch to theora, it 
> would be quite a job, because compared to the design of most video 
> codecs, theora is a square peg when you might have a round hole..

Yes, this is all about the configuration header, which is different from 
the way most other codecs are designed. (Or, as Aaron points 
out, the configuration being more than frame size and rate.) The Vorbis 
audio codec has all the same issues.

Our motivation here was the longevity of the baseline jpeg image format,
still an excellent choice 15 years after it was first developed. We 
didn't think we could be equally well tuned in our first release, so
we designed the format so encoders could have maximum flexibility 
without having to upgrade the installed base of decoders.

It may be that we've bet wrong. Much of the world hasn't seemed to mind 
upgrading repeatedly for each new incompatible iteration of the 'Windows 
Media' format, or "AAC" or even "MPEG-4 video", perhaps because OS 
vendors are shipping the upgrades as a normal part of their systems. So 
longevity (of the codec, not the brand) hasn't been an issue in the last 
couple of years.

> Absolutely, it would be much easier to do, if I could just use the 
> theora implementation with fixed codebooks, and not have to worry about 
> any of this stuff. If VP3 codebooks were an option, that would be 
> excellent.

So we could use one of the 8 reserved bits in my 32-bit aligned payload 
header proposal to mark something like this. I remain unconvinced of the 
value though.

With regard to instructing the encoder on what decoder setup to use, 
Derf's experimental encoder already supports this, and it's on the list 
for the revised reference encoder API. So while you can't do this now 
without some hacking, you should be able to in the future, including
configuring the encoder from a set of codebooks pulled from another 
stream.

 -r

