[ogg-dev] Seeking to granules in discontinuous streams

Fri Feb 8 18:08:05 PST 2008

On 2/7/08, ogg.k.ogg.k at googlemail.com <ogg.k.ogg.k at googlemail.com> wrote:

> And this is the crux of the problem, as far as I understand how it works:
> you'll have to do a first bisection to find the page that maps to that
> granulepos,
> then, and only then, you get to know the low bits of the actual granulepos,
> and then you have to do another bisection to find it.

The original seeking algorithm was that: granulepos maps to a
non-decreasing timestamp (in a way defined by the codec) and pages are
strictly ordered by that timestamp. So you seek by bisection searching
for a time. It doesn't matter that you can't compute a granulepos from
a time; all you need is the other direction. What you read from the
Ogg stream is a granulepos from a page header out of some particular
substream, which you *can* convert to a time, and that tells you which
way to jump for your next bisection.

Then since the pages are in order, you just start decoding from a bit
before your seek target *in time* and off you go.

What makes designing Ogg embeddings complicated in working out all the
implications of this for a new style of codec.

> This is *not* handled by Ogg, since the granulepos encoding is wholly out
> of its concern, so there's no way it can optimize out hte second biseciton.
> If you want to be clever and you know clips are short and can't overlap,
> you can get away with manual backward streaming to avoid the bisection,
> but that's still a second seek.

What I was trying to address in my talk, and don't know if it came out
clearly, was that, yes there's no way to avoid the second bisection.
That's bad, but not as bad as you might think because *you already
have to do other bisections anyway*.

Video codecs have keyframes, so you usually have to start way before
your seek target. And because of the problem of continued packets,
it's not enough to find a single granulepos which maps to a time
nearest your seek target, you actually have to treat all the streams
independently. Monty wasn't aware of this when he described the seek
algorithm in 2002 (which works fine if there are no continued packets
or only one logical stream).

Now, this is actually an argument for not having 30 different subtitle
streams all muxed together. However, I think the best way to solve
this is to have a profile require some kind of lower bound on the
frequency of subtitle pages. And perhaps also their coincidence with
video frames, &c. so demuxers have some idea how hard to try to find
such streams.

> Compare this with storing the granulepos of the earliest still active packet,
> which is what I'm doing currently. This means a granulepos doesn't contain
> info on the previous packet's granulepos, but since you have to do two seeks
> anyway, you don't mind.

The advantage of storing this in the granulepos field itself, like
theora and CMML do, is that the seek code may already understand how
to handle the back pointer. Right now everything assumes the mapping
is from 'initialized decoder' + 'granulepos from page header' =>
timestamp, or in the case of theora and CMML => 'timestamp' + 'last
keyframe timestamp.' Having to look in the accompanying page data is
an additional complication.

BTW, I don't think the 'discontinuous codec thing in ogg-multiplex.htm
is a very useful concept. Ogg is conceptually time-continuous media,
and one may trivially make any discontinuous codec continuous by
declaring that no packets mean nothing is to be played during the
interval in question. The real issue is the relative frequency of
pages, and the burden a great asymmetry there places on muxers and
decoders to find all the relevant data. Put another way, start-time
labelling for substreams with low page frequency helps buffering after
they're muxed, but it only goes so far if there's not also a lower
bound on that frequency.

Make any sense?

 -r