[theora] Indexing Ogg files for faster seeking
chris at pearce.org.nz
Sun Jan 24 16:58:23 PST 2010
On 22/01/2010 1:50 p.m., Bernard Jungen wrote:
> Have you tried delta+deflate, i.e. without variable encoding? And also
> with each kind of data grouped together?
I tried grouping the data in different orders, and it only seems to make
a difference once you get up to huge (>5 hour) media.
On 24/01/2010 4:16 p.m., Bernard Jungen wrote:
> On Sat, Jan 23, 2010 at 11:23:54AM +1300, Chris Pearce wrote:
>> I am happy to accept suggestions for ways to compress the index better.
> Easy. Reduce the checksum size, possibly to zero :-)
Ok, I've measured it, in my simple variable byte encoded index packets,
the checksums (which can't be variable-byte encoded) usually take up
about 55% of the encoded index packet. That's significant.
> Seriously, I'm starting to wonder whether the full 32-bit checksum adds much
> value to keypoints.
> First, the need to rebuild/reload the whole index after modifications plus
> the need for data stability while downloading speak against frequent data
Any time the file changes the index offsets are invalidated. Once the
index is invalidated, there's no safe way to use it. The index must be
rebuilt if the file changes.
> Second, the test to see if there's a valid page at the seeked-to offset is
> already a good rough verification IMO.
The average page length is about 4300 bytes, so there's about a 1 in
4300 chance that any randomly selected byte is a page start. That chance
is large enough that I wouldn't want to solely rely on this to detect
> What could then happen if we let the
> decoder restart at a wrong page?
The basic approach to seeking is to seek to the previous key frame and
then decode forwards to the seek target frame. So if we land way a long
way before the seek target frame, we could spend a lot of time decoding
forwards to the seek target. If we land after the seek target frame (or
after the keyframe we were looking for), we have to fall back to a
> Obviously, if the decoder is minimally
> robust, bad decoded data will result, the decoder will stop, or skip to the
> next keyframe, and in the first two cases the user will probably seek again
> until the test for page start eventually fails.
> Third, can we imagine a useful program that could modify data on-the-fly
> without moving indexed pages nor changing the presentation times? Yes, for
> instance a program that could overwrite specific segments with e.g. *cough*
> ads. But the checksum test would reject them when seeking, although the
> other related keypoint info are still valid.
I think overwriting frames seemlessly would be hard. Anything that
changes the file size would invalidate the index. However, given that,
if we assume that altering a file will likely change its length, then we
can use the length-in-bytes field (which we added to the fishead packet)
in order to detect if the file has likely changed. This is a bit
annoying if we have a chain and we're to doing this over a network. We
can also check that the keypoint's page offset listed in the index maps
exactly to a page boundary as Bernard suggests above. We can also check
that the first complete keyframe that you get when you start decoding
after a seek matches the timestamp listed in the index. That complicates
seeking logic just enough to be annoying, but isn't a deal breaker.
If we do that, we can drop the checksums from the index. On my set of
test media, this reduces the index overhead to about 6.2KB per track per
hour, about 25% of the space required by the original uncompressed index
format (which includes the checksums).
> When looking at compressed keypoints, checksums take a huge chunk of them
> and it would be very useful to give them more thought IMO.
> What do you all think?
I think we should remove the checksums, and simply delta and
variable-byte encode the offsets and timestamps. We can consider an
index invalid if:
1. The segment/link length doesn't match the length stored in the
fishead packet, or
2. After a seek to a keypoint's offset, we don't land on a page
3. The first keyframe we decode after seeking to a keypoint doesn't
have the same presentation time as stored in the index.
Does that sound reasonable?
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the theora