[ogg-dev] Header packet multiplicity

Tue Feb 12 12:24:08 PST 2008

On 12-Feb-08, at 4:49 AM, ogg.k.ogg.k at googlemail.com wrote:

> Has anyone thoughts on whether multiple header packets are a good idea ?
> I currently have a header per "type" of data, and I'm now at 9 headers.
> The original idea was to make it harder to hit a maximum limit, but I've
> since realized that the 64 KB (ish) limit is on pages, rather than on
> packets, so this is moot.
>
> I've recently slightly changed the format of my headers after reading
> the wiki about RTP encapsulation, and the way headers may have to be
> concatenated in some containers. In my particular case, all header
> packets are critical (apart from the Vorbis comments packet, actually),
> and data packets aren't (though losing one would lose one text event).
>
> In this light, does anyone know of results (statistical or otherwise)
> indicating whether having multiple smaller packets is better or not
> than a consolidated single packet ?

It's nice if the first header is small, because it lets you find the  
beginning-of-stream (bos) packets for all the streams quickly,  
ideally within a single read buffer, so you can configure the decoder  
pipeline. Currently the bos headers we have for codecs in Ogg are all  
less than 100 bytes.
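
As a rough illustration of why small bos packets help, here is a minimal Python sketch that walks the pages at the start of a buffer, stops at the first non-bos page, and guesses each stream's codec from the first bytes of its first packet. The function name and the magic table are my own for illustration; the magic strings are written from memory of the individual mapping specs and should be checked against them. It does no CRC checking or resync, and assumes the buffer starts on a page boundary.

```python
import struct

# First-packet magic prefixes for a few codecs. These are assumptions
# recalled from the codec mapping specs; verify before relying on them.
CODEC_MAGIC = {
    b"\x01vorbis": "vorbis",
    b"\x80theora": "theora",
    b"Speex   ": "speex",
    b"\x7fFLAC": "flac",
    b"BBCD": "dirac",
}

BOS_FLAG = 0x02        # header_type bit marking a beginning-of-stream page
PAGE_HEADER_LEN = 27   # fixed part of the page header, before the segment table


def identify_bos_streams(buf):
    """Return {serial_number: codec_name} for the bos pages at the start of buf."""
    streams = {}
    pos = 0
    while pos + PAGE_HEADER_LEN <= len(buf) and buf[pos:pos + 4] == b"OggS":
        header_type = buf[pos + 5]
        serial = struct.unpack_from("<I", buf, pos + 14)[0]
        nsegs = buf[pos + 26]
        seg_table = buf[pos + PAGE_HEADER_LEN:pos + PAGE_HEADER_LEN + nsegs]
        if len(seg_table) < nsegs:
            break  # segment table runs past the end of the buffer
        if not header_type & BOS_FLAG:
            break  # all bos pages come first, so nothing more to find
        body_start = pos + PAGE_HEADER_LEN + nsegs
        body_len = sum(seg_table)
        body = buf[body_start:body_start + body_len]
        streams[serial] = next(
            (name for magic, name in CODEC_MAGIC.items() if body.startswith(magic)),
            "unknown",
        )
        pos = body_start + body_len
    return streams
```

If every bos packet is under 100 bytes or so, a single modest read is normally enough for a loop like this to see every stream in the file.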

Otherwise, I don't suppose it matters much. The other main  
recommendation is that all the headers go together at the beginning,  
with a page flush after the last one, so it's easy to (a) find them  
and (b) copy and paste them when you're doing editing.

> Additionally, is there some kind of guidelines about header packet
> design, and the constraints or recommendations about it ? I'm thinking
> about parsing those packets by a parser that doesn't know their format,
> from, eg, http://wiki.xiph.org/index.php/Oggless, that specify a way to
> skip those headers without actually understanding them.

So I'd say:

* Ogg requires some way to identify the codec from the first packet.

* Lots of frameworks either don't expect headers at all, or treat  
them as a single blob, so they may have to pack multiple headers  
together. Having fixed or internally described packet lengths can  
simplify this.

* Ideally it should be easy to figure out how to interpret the Ogg
granulepos by grabbing a few bits out of the header packets. For
example, the Theora frame rate can be read directly out of the first
header packet, but we messed up in that the shift (radix) that
divides the granulepos field isn't byte-aligned in the header. With
Vorbis, one can likewise read the sample rate directly, which is enough
for seeking, but recalculating per-packet timestamps requires parsing
the mode table out of the third header packet, which is a serious
pain. Likewise, Dirac has only one header, but it takes a couple of
pages of code to get the frame rate out of it because it's all
variable-length coded with a hierarchy of defaults from tables and
explicitly coded overrides.

* Likewise, and regardless of whether you're using Ogg, it's nice if  
other parameters that might be relevant to setting up a pipeline are  
easy to extract without a full parser: frame size for video codecs,  
number of channels for audio codecs, that sort of thing.
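
To make the "grab a few bits" point concrete, here is a minimal Python sketch that pulls the pipeline-relevant fields out of the Vorbis and Theora identification headers without a full parser. The function names are hypothetical, and the byte offsets are written from my recollection of the Vorbis I and Theora specifications, so treat them as assumptions and check them against the specs. Note how the Theora granule shift has to be masked out of the middle of a 16-bit word, while everything else is a plain fixed-offset read.

```python
import struct


def parse_vorbis_id_header(packet):
    """Return (channels, sample_rate) from a Vorbis identification header."""
    if len(packet) < 16 or packet[:7] != b"\x01vorbis":
        raise ValueError("not a Vorbis identification header")
    # After the 7-byte magic: version (u32), channels (u8), sample rate (u32),
    # all little-endian (assumed layout).
    _version, channels, sample_rate = struct.unpack_from("<IBI", packet, 7)
    return channels, sample_rate


def parse_theora_id_header(packet):
    """Return (frame_rate_num, frame_rate_den, granule_shift) from a Theora header."""
    if len(packet) < 42 or packet[:7] != b"\x80theora":
        raise ValueError("not a Theora identification header")
    # Frame rate numerator and denominator: two big-endian u32 at offset 22
    # (assumed layout).
    frn, frd = struct.unpack_from(">II", packet, 22)
    # Last 16 bits, MSB first: quality (6), granule shift (5), pixel format (2),
    # reserved (3). The shift straddles a byte boundary, hence the mask.
    (tail,) = struct.unpack_from(">H", packet, 40)
    granule_shift = (tail >> 5) & 0x1F
    return frn, frd, granule_shift


def split_theora_granulepos(granulepos, granule_shift):
    """Split a granulepos into (keyframe part, frames since that keyframe)."""
    keyframe = granulepos >> granule_shift
    offset = granulepos & ((1 << granule_shift) - 1)
    return keyframe, offset
```

With the shift and frame rate in hand, a demuxer can turn a Theora granulepos into a frame count and then a time; the exact frame-numbering convention depends on the bitstream version, so I've left that step out.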

  -r