[theora] Indexing Ogg files for faster seeking

Benjamin M. Schwartz bmschwar at fas.harvard.edu
Tue Jan 26 20:59:22 PST 2010


Chris Pearce wrote:
> What's the alternative? Specify that the index must exactly index every 
> keyframe? That denies authors the ability tune their indexes.

I see two alternatives:
0. Remove the concept of validity/verifiability.
1. Make "valid" represent something closer to "useful".  My best
suggestion is that the index header include b_max, the maximum number of
bytes you might have to skip to get from the nearest upstream index point
to your target seek point (i.e. the page start).  When b_max is zero, the
index is lossless and complete (every page containing a keyframe is
indexed exactly).  An index header is invalid if there is some potential
target page start that cannot be reached without skipping more than b_max
bytes.

>> Also, any operation that
>> substantially extends the stream will render the index of little use
>> without technically invalidating it.
>>    
> Extending the stream will mean its file's length changes, meaning the 
> fishead byte-length header field doesn't match the length of the file, 
> meaning it will be considered invalid.

OK.

> Any operation that changes the file in any way will render *any* index 
> suspect enough to be useless.

I'm not convinced.  I'd rather say that the index serves to initialize a
bisection search, and so, in the worst case, causes one extra bisection
step.  In the best case, it reduces the search to a single step.  If the
file has been changed in some small way, I'd rather hope that the index is
still approximately correct, use it to initialize my search, and risk
taking a single extra step.

> I'm not an expert on Dirac. Can it be the case with Dirac that all the 
> data for a later keyframe lies before all the data for an earlier keyframe?

My head hurts trying to work out the behavior in the case where time is
not monotonic with granpos.  Things should get even more interesting with
rolling intra.

>> 2. If the reserved space is insufficient, the indexer must currently drop
>> entire references.  A lossy scheme would degrade more gracefully, by
>> decreasing the precision of all seek points without dropping any (or by
>> dropping fewer).
>>
>>    
> Perhaps. Could you provide a spec for the index packet so we can see how 
> this would work?

I will be happy to draw up something clearer than this e-mail.

--Ben

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: OpenPGP digital signature
Url : http://lists.xiph.org/pipermail/theora/attachments/20100126/635fdc1a/attachment.pgp 


More information about the theora mailing list