[xiph-rtp] Theora RTP payload format
Ralph Giles
giles at xiph.org
Mon Apr 18 14:52:16 PDT 2005
On Mon, Apr 18, 2005 at 01:50:13PM -0700, Ralph Giles wrote:
> On Mon, Apr 18, 2005 at 03:13:19PM -0400, Steve Kann wrote:
>
> > In one particular use case, (off-line encoding to .ogg files), all this
> > isn't much of a headache. But for use-cases like this, and perhaps for
> > many others, this is quite a headache. For example, If I had all this
> > working with h.263 (or h.264), and I wanted to switch to theora, it
> > would be quite a job, because compared to the design of most video
> > codecs, theora is a square peg when you might have a round hole..
>
> Yes, this is all about the configuration header, which is different from
> the way most other codecs are designed.
Just to be clear, the flexibility of the vorbis setup headers has
served us very well. The irony of that statement is that Linux
distributions are the only significant OS vendors shipping our codecs as
a matter of course. The fact that a beta3 decoder release can play
files from aoTuVb4 with better quality at half the bitrate is a
significant achievement.
So yes, the flexibility means more work at the front end, and yes the
CRC32-as-ident proposal would have traded the explicit chainid mapping
table for an implicit one. We've generally found dealing with the setup
overhead isn't as complex as you're expecting. The idea is that doing a
little more work up front is easier than having to mass-upgrade your
installed base in two years.
It's nice when it's easy to get things 'just working' quickly, but it's
also nice to do things right. You were already talking about negotiating
a common frame size and rate, and the RTP server mixing the streams
together, which I understand affects the SSRC and CSRC RTP header
fields, only switching on keyframes and so on, all of which requires at
least a little bit of codec knowledge. And theora, at least, is designed
so things like header and keyframe packet detection can be done easily
without a full decode. (Just by looking at the first byte for those
cases.)
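
As a rough illustration of the first-byte check, here is a minimal Python
sketch. It assumes the usual Theora packet layout: header packets have the
high bit of the first byte set (0x80 ident, 0x81 comment, 0x82 setup), and
in data packets the next bit is 0 for a keyframe and 1 for an inter frame.
The function name and the returned labels are made up for this example and
are not part of any library.

```python
def classify_theora_packet(packet: bytes) -> str:
    """Classify a raw Theora packet by looking only at its first byte.

    Hypothetical helper for illustration; no decoding is performed.
    """
    if len(packet) == 0:
        # A zero-length data packet repeats the previous frame,
        # so it is never a keyframe.
        return "inter"

    first = packet[0]
    if first & 0x80:
        # High bit set: one of the three header packets.
        headers = {0x80: "ident-header",
                   0x81: "comment-header",
                   0x82: "setup-header"}
        return headers.get(first, "unknown-header")

    # Data packet: bit 0x40 is the frame type (0 = intra, 1 = inter).
    if first & 0x40:
        return "inter"
    return "keyframe"
```

A relay that only switches sources on keyframes could call this on each
incoming packet and wait for "keyframe" before switching, without ever
running the decoder.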
Our concern with defining profiles, like the 'VP3' bit I suggested, has
always been that it encourages non-interoperable implementations that only
support that profile. "Profiles are useless" has been a common lesson of
many specification designs. They make committee decisions easier, but in
the end you either implement the de facto standard or you don't. Those are
the main reasons I remain unconvinced.
Note also that while the chain id lets you multiplex streams from
encoders using a different setup, you don't have to do it that way.
Your application might be better served by mandating that everyone
use the same profile and then not worry about chaining at all. That's
more like the situation you have with fixed setup codecs.
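
To make the two options concrete, here is a small Python sketch of an
explicit chain id table. Everything in it is hypothetical: the class and
method names, the use of raw bytes for the setup data, and the idea that
the chain id is a small integer are assumptions for illustration, not
something the draft defines. The crc32_ident helper only gestures at the
CRC32-as-ident idea mentioned above, using zlib's CRC-32 as a stand-in for
whatever checksum a real proposal would specify.

```python
import zlib


class SetupTable:
    """Explicit mapping from chain id to the setup data for that chain.

    Hypothetical receiver-side structure, for illustration only.
    """

    def __init__(self):
        self._setups = {}

    def register(self, chain_id: int, setup: bytes) -> None:
        # Record the setup data announced for this chain id.
        self._setups[chain_id] = setup

    def lookup(self, chain_id: int):
        # Returns None when no setup is known for this chain yet,
        # in which case the packet cannot be decoded.
        return self._setups.get(chain_id)


def crc32_ident(setup: bytes) -> int:
    """Derive an implicit identifier from the setup data itself.

    zlib.crc32 is an assumption here, standing in for the checksum
    a CRC32-as-ident scheme would actually use.
    """
    return zlib.crc32(setup) & 0xFFFFFFFF
```

An application that mandates one shared setup for everyone ends up with a
table holding a single entry, so the lookup always returns the same setup
and chaining never has to be handled at all.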
I hope that explains the design reasoning a bit better, and why we've
been resistant to things like static codebook sets. We do very much
appreciate your opinion and contribution to the design discussion and
are very willing to help you figure out what needs to be done to make
your implementation work well.
-r