[theora] Indexing Ogg files for faster seeking

Chris Pearce chris at pearce.org.nz
Tue Jan 26 20:17:37 PST 2010

On 27/01/2010 4:10 p.m., Benjamin M. Schwartz wrote:
> 0. in Ogg Index 3.3, it's perfectly possible for an index to be "verified"
> and yet absolutely useless.  For example, an index that only contains the
> location of the first three keyframes is considered a valid index as long
> as those three locations are correct.
This is true. It's up to the indexer to ensure that an index is useful.

There are plenty of valid HTML web pages out there that aren't any use. 
It's up to the author to make them useful.

What's the alternative? Specify that the index must exactly index every 
keyframe? That denies authors the ability to tune their indexes.

> Also, any operation that
> substantially extends the stream will render the index of little use
> without technically invalidating it.
Extending the stream changes the file's length, so the fishead 
byte-length header field no longer matches the length of the file, and 
the index will be considered invalid.
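
As a rough sketch of that check, assuming the byte length has already 
been read out of the fishead header (the function and parameter names 
here are hypothetical, not from the spec):

```python
import os


def index_length_matches(path: str, indexed_length: int) -> bool:
    """Return True if the file's current size equals the recorded length.

    `indexed_length` stands for the byte length stored alongside the
    index; parsing it out of the header is not shown here.
    """
    return os.path.getsize(path) == indexed_length
```

A player would only trust the index when this returns True, and fall 
back to its usual seek otherwise.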

Any operation that changes the file in any way will render *any* index 
suspect enough to be useless.
> 1. Using timestamps instead of granpos seems to break a fundamental
> abstraction barrier in Ogg.  Ogg Dirac, for example, codes frames out of
> order, so the presentation times are not monotonically increasing.
The timestamp stored is the presentation time of the next complete 
keyframe that can be decoded from the keypoint's offset. That is, all 
the data for a keyframe *must* be on or after the page at the keypoint's 
offset, and the keypoint's page offset must be the offset of the first 
page which fulfills that requirement.
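
To illustrate how a player might use such keypoints, here is a minimal 
sketch. It assumes the keypoints are held in memory as (offset, time) 
pairs sorted by presentation time, with time in milliseconds; the names, 
units and in-memory layout are assumptions for illustration, not the 
index packet format:

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# (page byte offset, presentation time of the next complete keyframe, in ms)
Keypoint = Tuple[int, int]


def find_keypoint(keypoints: List[Keypoint], target_ms: int) -> Optional[Keypoint]:
    """Return the last keypoint whose keyframe time is <= target_ms.

    Assumes `keypoints` is sorted by time. Returns None when the target
    lies before the first indexed keyframe.
    """
    times = [time_ms for _, time_ms in keypoints]
    i = bisect_right(times, target_ms)
    return keypoints[i - 1] if i > 0 else None
```

The player would then start reading at the returned page offset and 
decode forward from that keyframe until it reaches the target time.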

I'm not an expert on Dirac. Can it be the case with Dirac that all the 
data for a later keyframe lies before all the data for an earlier keyframe?

> 2. If the reserved space is insufficient, the indexer must currently drop
> entire references.  A lossy scheme would degrade more gracefully, by
> decreasing the precision of all seek points without dropping any (or by
> dropping fewer).
Perhaps. Could you provide a spec for the index packet so we can see how 
this would work?

Chris P.
