[theora] Indexing Ogg files for faster seeking

Basil Mohamed Gohar abu_hurayrah at hidayahonline.org
Thu Jan 21 16:25:16 PST 2010


On 01/21/2010 06:46 PM, Chris Pearce wrote:
> I have been experimenting with compressing the keyframe indexes. If I 
> delta-encode the keypoint offset and timestamp fields, and then 
> variable-byte encode the resulting index, the keyframe indexes compress 
> to 44% of the uncompressed size.
>   
Sorry for my ignorance about this, but is that process all lossless? 
That's a pretty significant savings (albeit, perhaps on a small bit of
data).
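
For reference, here is how I understand the scheme, as a minimal Python
sketch. The byte layout (7 bits per byte, low group first, high bit set
meaning "more bytes follow") and the function names are my own
assumptions, not necessarily what the proposed index format uses. If
the deltas are stored exactly like this, decoding reverses both steps
and the round trip is lossless.

```python
def encode_varint(n):
    """Encode a non-negative int, 7 bits per byte, low group first.

    Assumed layout: the high bit is set on every byte except the last.
    """
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte of this value
            return bytes(out)


def encode_index(keypoints):
    """Delta-encode (offset, timestamp) pairs, then varint-encode them.

    Assumes both fields are non-decreasing, so every delta is >= 0.
    """
    out = bytearray()
    prev_off = prev_ts = 0
    for off, ts in keypoints:
        out += encode_varint(off - prev_off)
        out += encode_varint(ts - prev_ts)
        prev_off, prev_ts = off, ts
    return bytes(out)


def decode_index(data):
    """Reverse encode_index: read the varints, then undo the deltas."""
    values = []
    n = shift = 0
    for b in data:
        n |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7
        else:
            values.append(n)
            n = shift = 0
    keypoints = []
    off = ts = 0
    for i in range(0, len(values), 2):
        off += values[i]
        ts += values[i + 1]
        keypoints.append((off, ts))
    return keypoints
```

Small deltas fit in one or two bytes instead of a fixed-width field,
which is presumably where the savings come from.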
> I also tested using zlib 1.2.3 to deflate 
> delta-then-variable-byte-encoded indexes. The results are thrown off by 
> small indexes. For the 61 media in my previous sample, the average 
> zlib-deflated size of delta-then-variable-byte-encoded indexes was 
> 100.29% of the delta-then-variable-byte-encoded indexes' size, with 
> stdev of 17.44(%); i.e. it was bigger on average. If we filter that to 
> not deflate the small indexes we mitigate the cost of the zlib deflate 
> overhead:
>
> Compressing only delta-variable encoded indexes > 1,000 bytes, average 
> 93.82%, stdev 3.98%
> Compressing only delta-variable encoded indexes > 2,000 bytes, average 
> 93.33%, stdev 3.89%
> Compressing only delta-variable encoded indexes > 5,000 bytes, average 
> 91.38%, stdev 2.55%
> Compressing only delta-variable encoded indexes > 10,000 bytes, average 
> 90.55%, stdev 2.57%
> Compressing only delta-variable encoded indexes > 80,000 bytes, average 
> 89.07%, stdev 0.67%
>   
Have you considered using another codec, perhaps XZ/LZMA?  Granted, it's
much slower than deflate at compression, but on such a small chunk of
data, I don't think speed is much of a concern.  It's pretty fast on
decode, too.  However, I'm not sure about its memory requirements, which
I expect are higher than those of almost any implementation of deflate.
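
A quick way to check would be something like the sketch below, using
Python's standard zlib and lzma modules. The payload is synthetic
(random bytes picked from a handful of made-up values), not real index
data, and the helper names are hypothetical, so the numbers it prints
say nothing about actual keyframe indexes; the real encoded indexes
would need to be fed in instead.

```python
import lzma
import random
import zlib


def synthetic_payload(n_bytes, seed=0):
    """Made-up stand-in for a delta-then-variable-byte-encoded index."""
    rng = random.Random(seed)
    return bytes(rng.choice((0x81, 0x82, 0x03, 0x7F, 0x10))
                 for _ in range(n_bytes))


def compare(payload):
    """Return raw, deflate, and XZ sizes in bytes for one payload."""
    deflated = zlib.compress(payload, 9)
    xz = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=9)
    # Both codecs are lossless, so the round trips must match.
    assert zlib.decompress(deflated) == payload
    assert lzma.decompress(xz) == payload
    return {"raw": len(payload), "zlib": len(deflated), "xz": len(xz)}


if __name__ == "__main__":
    for size in (200, 2000, 20000):
        print(size, compare(synthetic_payload(size)))
```

The .xz container also has a fixed header and footer of its own, so the
small indexes would be the case to watch there as well.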

Great work!

-- 
      Basil Mohamed Gohar
abu_hurayrah at hidayahonline.org
http://www.basilgohar.com/blog
basilgohar on irc.freenode.net
GPG Key Fingerprint:  5AF4B362


