[theora] Liboggplay seeking artifacts
chris at pearce.org.nz
Fri Jul 24 16:56:42 PDT 2009
Yeah, that's the recommended approach for implementing seek in Ogg
files. There's a few things you need to be aware of.
You bisect based on the granulepos of the pages in the file. A page's
granulepos corresponds to the timestamp of the last packet that *ends*
on that page. The frame with that timestamp doesn't necessarily *start*
on that page. You need to start decoding on the page on which the
keyframe *starts*. So when you stop bisecting, you need to scan
backwards through pages until you're sure you're before the start of the
keyframe (while previous page *on that stream* has -1 gp).
You can have other streams' pages interleaved between a keyframes'
pages, with a page from the other stream with a gp > target_gp before
the page which terminates the keyframe. You need to not stop in this gap
in this case. A naive bisection would stop here (liboggz, and thus
liboggplay, is still currently vulnerable to this).
You need to do this backwards-scan for every stream to ensure you can
actually decode all data required to render the keyframe. But you don't
need to do it for streams which you're not playing, and you can't for
streams which have finished at that point in the file - but you can't
tell which streams have finished without reading the entire stream, oh dear!
All those edge cases make it a headache to implement - though a lot
easier now that I've told you some of the gotchas. ;)
I think my main beef against the canonical approach is that it requires
reading backwards, which is bad for apps that read over networks, like
Firefox, because they require new networks requests. Also doing two
bisections is twice as slow as one. On local files you don't notice, but
over the network you do, each bisection is a new HTTP request. That's
why we (Mozilla) are going to change to my proposed method.
The bisection method *is* faster for things stored on local disk, or
with large keyframe intervals, but it's harder to implement robustly
than you think.
All the best,
On 24/07/2009 9:13 p.m., ogg.k.ogg.k at googlemail.com wrote:
>> 1. Determine the maximum time between keyframes. This is the maximum of
>> all streams':
>> (max number of frames between keyframes) * (frame duration)
>> (number of frames between keyframes) = ((1<< granuleshift) - 1)
>> You only really need to calculate this for streams with have a
>> granuleshift (e.g. not vorbis streams).
>> 2. Seek to t - (maximum time between keyframes) (using oggplay_seek()).
>> 3. Decode forward (using oggplay_step_decode()) until the frame you get
>> back has time t.
> The canonical method is very similar:
> 1. seek to the frame just before time you want.
> 2. From the granpos of this frame, deduct the granpos of the preceding keyframe.
> 3. Seek to there.
> 4. Decode until you reach your original target.
> This would work for all streams using the granshift method, whether or not
> the max time between keyframes exists or is practical.
More information about the theora