[ogg-dev] New Ogg Dirac mapping draft

Mon Aug 11 20:03:35 PDT 2008

David Flynn has proposed a new Ogg Dirac mapping. The draft is here:

   http://davidf.woaf.net/dirac-mapping-ogg.pdf

This is a much bigger break from other codecs than my draft (at  
http://wiki.xiph.org/index.php/OggDirac). We talked a bit about it on  
IRC today. Below is my summary; hopefully David can correct anything  
I got wrong or stated misleadingly. Comments?

There are two main differences from the earlier proposal:

* The granulepos is split into three fields instead of two, with the  
extra field encoding the reordering offset.

* The mapping requires a page flush after every frame data packet.

The first allows the actual presentation time of the corresponding  
packet to be determined, while in my scheme a group of reordered  
frames all get the same granulepos.

The second assigns a granulepos to every *packet* instead of every  
*page* as is usual, so the granulepos can be used in practice to  
calculate a presentation timestamp for every frame.

An offset to a restart point for restarting after seek is included as  
in my draft.

Pros:

The demuxer doesn't have to crack data packets or maintain state to  
figure out the presentation timestamps. Demux code is simpler.

Both presentation and decode timestamps are readily available from a  
simple look at the granulepos on each packet out of libogg.

The encoding is clever, so the frame number calculation by adding the  
two halves according to the skeleton granule shift still works.
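
For reference, the "adding the two halves" calculation can be sketched
as below. This is only the Theora-style two-field split; the function
names and the shift value of 6 are illustrative assumptions, and how
David's draft packs its third field inside this layout is not shown
here.

```python
def pack_granulepos(keyframe: int, offset: int, granule_shift: int) -> int:
    """Pack a two-field granulepos: keyframe number in the high bits,
    frames since that keyframe in the low granule_shift bits."""
    if offset >= (1 << granule_shift):
        raise ValueError("offset does not fit in the low field")
    return (keyframe << granule_shift) | offset


def frame_number(granulepos: int, granule_shift: int) -> int:
    """Recover a frame number by adding the two halves of the granulepos."""
    if granulepos < 0:
        raise ValueError("negative granulepos carries no position")
    high = granulepos >> granule_shift
    low = granulepos & ((1 << granule_shift) - 1)
    return high + low


# Hypothetical values: keyframe 100, 5 frames later, shift of 6 bits.
gp = pack_granulepos(100, 5, 6)
print(gp, frame_number(gp, 6))
```

A tool that only knows the granule shift from the skeleton header can
run this sum without understanding the codec, which is why keeping it
valid matters.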

Cons:

Restart after seek still requires new code; that part of skeleton  
doesn't work.

Muxing overhead for one-page-per-packet is excessive for small  
packets. 3% (vs 0.7%) for 230 kbps video, up to 10% for 50 kbps. This  
isn't going to work for cell phone video.
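
A rough way to reproduce overhead numbers of this kind, as a sketch
only: it assumes 25 fps, one constant-size packet per frame, and one
packet per page. The 27-byte fixed page header and one lacing value
per 255 bytes of payload come from the Ogg page format; the function
names are made up for illustration.

```python
PAGE_HEADER_BYTES = 27  # fixed part of an Ogg page header


def lacing_values(packet_bytes: int) -> int:
    """Segment table entries for one packet: one per full 255 bytes,
    plus a final entry below 255 that terminates the packet."""
    return packet_bytes // 255 + 1


def one_page_per_packet_overhead(bitrate_bps: int, fps: float) -> float:
    """Page header bytes as a fraction of the packet payload.
    Assumes every frame is one packet of the average size."""
    packet_bytes = int(bitrate_bps / 8 / fps)
    header_bytes = PAGE_HEADER_BYTES + lacing_values(packet_bytes)
    return header_bytes / packet_bytes


for kbps in (230, 50):
    ratio = one_page_per_packet_overhead(kbps * 1000, 25)
    print(f"{kbps} kbps: {ratio:.1%}")
```

Under those assumptions this gives about 2.8% at 230 kbps and about
11% at 50 kbps, the same ballpark as the figures above. Real streams
have variable frame sizes, so treat it as an estimate.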

Many Ogg tools assume they can repaginate, and probably won't get the  
one-page-per-packet stuff right. This leads to the usual argument  
that the demuxer has to be able to reconstruct the timestamps anyway,  
if it's going to be liberal in what it accepts, so the demuxer isn't  
actually simpler.

The granulepos will no longer be numerically non-decreasing, so  
implementations that make this assumption will break.

Summary:

Adding a new codec has always required code changes to the muxer in  
Ogg. The question is whether this is a better precedent for future  
codecs with future-predicted data. My draft tried to be minimally  
different from previous practice: same granulepos logic as Theora,  
custom timestamp generation like all the other codecs. David's draft  
requires new seek as well as timestamp generation code, but moves  
some of the complexity for the latter from simplistic to  
sophisticated implementations.

The idea of one-packet-per-page isn't unprecedented. CSIRO did that  
for their mobile video version of Theora (and reported the overhead  
was a real problem). We also talked about long pages without spanning  
packets at FOMS in January. There's a buffering issue with packets  
that are both packed and spanning, and the overhead can actually be  
lower for large (>8K) packets. I'd almost rather see us take this  
route, with a new Ogg page type, if the Dirac developers want a  
timestamp per frame, but that certainly doesn't minimize disruption.

  -r