[ogg-dev] Seeking to granules in discontinuous streams

Mon Feb 18 02:21:28 PST 2008

Hi,

I've now got another way of encoding granules (oh, not *again*, I hear
you cry). I believe it's an improvement over the existing "generic"
method, so I'll outline it here for your consideration. Again, it's
different from what Skeleton can handle, but it's a simple superset and
would be easy to add to Skeleton (and liboggz). It is also compatible
with other existing codecs (eg, theora). It is implemented in libkate,
so it works and the changes needed are minimal. I'm happy to make the
necessary changes in oggmerge/liboggz if needed.

The two advantages this change brings are increased usable time (for the
same timestamping precision) and the ability to map different granules
to the same time. Both are adjustable by client code.

That said, I see the interest of having a shared method of mapping time
and granules, and I'll use the Theora/CMML method if this proposed
method doesn't convince you.

See the end of the mail for the trivial code difference between the two
methods.

The proposed mapping is based on the existing granule shift based
system, and adds two more quantities:
- a second granule rate fraction
- a second granule shift

If the second granule rate fraction is equal to the main granule rate
fraction, and the second granule shift is zero, the system is equivalent
to the existing one, ensuring backwards compatibility (eg, for those
older streams, one can set the second fraction to the first one, and the
second granule shift to zero).
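
To make that equivalence concrete, here is a minimal sketch of both
conversions side by side. The struct and function names are made up for
illustration (they are not the Skeleton or libkate API); only the
arithmetic follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter set for the proposed mapping. */
typedef struct {
    int shift;                    /* main granule shift */
    int shift2;                   /* second granule shift (reserved low bits) */
    int64_t base_num, base_den;   /* granule rate fraction for the base */
    int64_t off_num, off_den;     /* granule rate fraction for the offset */
} granule_map;

/* Existing method: a single rate applied to base + offset. */
static double time_existing(int64_t gp, int shift, int64_t num, int64_t den)
{
    assert(shift >= 0 && shift < 63 && num > 0);
    int64_t base = gp >> shift;
    int64_t offset = gp - (base << shift);
    return (double)(base + offset) * den / num;
}

/* Proposed method: separate rates, and shift2 low bits ignored. */
static double time_proposed(int64_t gp, const granule_map *m)
{
    assert(m->shift >= 0 && m->shift < 63);
    assert(m->shift2 >= 0 && m->shift2 <= m->shift);
    assert(m->base_num > 0 && m->off_num > 0);
    int64_t base = gp >> m->shift;
    int64_t offset = (gp - (base << m->shift)) >> m->shift2;
    return (double)base * m->base_den / m->base_num
         + (double)offset * m->off_den / m->off_num;
}
```

With a shift of 32, both rates set to 1000/1 and shift2 set to zero, a
granulepos of (5000<<32)|250 comes out as 5.25 seconds from both
functions, up to floating point rounding.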

The current system uses the same granule rate for both the base (the
high bits) and the offset (the low bits) of the granulepos. Here, a
separate granule rate (typically coarser) is applied to the base,
allowing the base to span a larger timespace. Since the offset added to
that base uses the finer grained granule rate, the precision of the
resulting timestamping remains (as long as the time that an offset can
span is less than or equal to the granularity of the base, but that does
leave quite a bit of leeway). In effect, the existing method maps a lot
of different granules to the same time (though not contiguous granules).
In other words, a particular timestamp can be represented by a lot of
different granules. This is where the new method gets bits back without
sacrificing timestamping precision.

Additionally, a second granule shift can be used to reserve a few low
bits so that they will not be used in the timestamping computation. This
allows a codec to use those bits as a counter which will not advance the
time (eg, to allow multiple events at the same time, which is not
possible with the current system). Setting this shift to zero reverts to
the existing system.
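
As a sketch of how a codec might use those reserved bits, the helper
below packs a base, an offset and a small event counter into one
granulepos. The function names and the range checks are illustrative
assumptions, not libkate code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack base, offset and an event counter.
   shift is the main granule shift, shift2 the number of reserved low bits. */
static int64_t pack_granule(int64_t base, int64_t offset, int64_t counter,
                            int shift, int shift2)
{
    assert(shift2 >= 0 && shift2 < shift && shift < 63);
    assert(counter >= 0 && counter < ((int64_t)1 << shift2));
    assert(offset >= 0 && offset < ((int64_t)1 << (shift - shift2)));
    assert(base >= 0 && base < ((int64_t)1 << (63 - shift)));
    return (base << shift) | (offset << shift2) | counter;
}

/* Offset in offset-rate units, with the reserved low bits discarded. */
static int64_t granule_offset(int64_t gp, int shift, int shift2)
{
    int64_t base = gp >> shift;
    return (gp - (base << shift)) >> shift2;
}
```

Two events packed with the same base and offset but counters 0 and 3
get distinct, increasing granulepos values, yet granule_offset() returns
the same offset for both, so they map to the same time.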

Attached is a simple program that shows the two different methods and
the granules generated by base/offset pairs, along with the number of
bits each requires.

For a timestamping precision of 1ms, as you can see by running the
program, the existing method can't handle a base+offset of more than
about 50 days. If you move the granuleshift around, since the two are
added together, you just move the granulespace from base to offset or
the other way round, without getting more actual space.
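
A rough check of the 50 day figure, as a sketch: assuming the 64 bits
are split evenly (a 32 bit shift, which is an assumption here, the
figure above comes from the attached program), each half can count 2^32
granules. The helper name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Time span, in days, covered by a field of `bits` bits counting
   granules at rate_num/rate_den granules per second. */
static double span_days(int bits, int64_t rate_num, int64_t rate_den)
{
    assert(bits >= 0 && bits < 63 && rate_num > 0 && rate_den > 0);
    double granules = (double)((int64_t)1 << bits);
    return granules * rate_den / rate_num / 86400.0;
}
```

At 1000 granules per second, span_days(32, 1000, 1) is a little under
50 days. Giving more bits to one half takes them from the other, so the
larger of the two spans grows only as the smaller one shrinks.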

With the method I've described, one can choose to lengthen the usable
time space by losing precision on the base. As shown by running the
program, the default setup (25 bits for the base, 2 kill bits) can
encode both a 1157 day base and a 1157 day offset at the same time while
keeping the base precision to 10 seconds (the timestamping precision
still being 1 ms, so all times between 0 and 2300 days or so can be
represented at millisecond precision), and a seek to any of those times
will land at most 10 seconds before the intended time. Of course, for
smaller time ranges (such as a two hour movie), the base precision can
be kept to the same millisecond precision.
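
A quick sanity check of that bit budget, as a sketch. It assumes that
"25 bits for the base" means a granule shift of 39 with the sign bit
left unused, which is one reading of the setup rather than something
taken from the program. The helper name is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to store a non-negative value. */
static int bits_needed(int64_t value)
{
    assert(value >= 0);
    int bits = 0;
    while (value > 0) {
        bits++;
        value >>= 1;
    }
    return bits;
}
```

1157 days is 99964800 seconds. Counted in 10 second units, the base
needs 24 bits; counted in milliseconds, the offset needs 37 bits. Adding
the 2 kill bits gives 24 + 37 + 2 = 63 bits, which fits in a
non-negative 64 bit granulepos.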

I believe that the replacement of the current "generic" description of
granule mapping, as described in skeleton, by this system should be
considered for the following reasons:

- For the same timestamping precision, the new mapping greatly extends
  the range of times that can be represented

- The new mapping allows the low bits of the granulepos to map to the
  same time, allowing multiple events at the same time

- User code can extend the representable range of times almost
  arbitrarily, at the expense of base precision (though the base+offset
  precision remains the same)

- Ogg does not mandate any particular mapping, so the current system
  overly constrains the range of mappings (the proposed one still
  constrains to a linear mapping, but constrains less).

- Skeleton isn't widely used yet (AFAIK), and CMML is being changed, so
  now is a good time for such a change (well, that's not a reason to
  change, but rather a lack of reason not to change).

- Finally, the proposed mapping is a superset of the currently used one,
  and resolves to it if the kill bits granule shift is zero and the
  granule rate fraction is the same for both base and offset, so
  backwards compatibility is kept for streams encoded with the existing
  method.

For reference, the code difference between the two methods is:

Current method (from libtheora's toplevel.c):

    ogg_int64_t iframe=granulepos>>pbi->keyframe_granule_shift;
    ogg_int64_t pframe=granulepos-(iframe<<pbi->keyframe_granule_shift);

    return (iframe+pframe)*
      ((double)pbi->info.fps_denominator/pbi->info.fps_numerator);

Proposed method (from libkate's kate_granule.c):

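    /* high bits: base, counted at the base granule rate */
    base=granulepos>>ki->granule_shift;
    /* low bits: offset, with the granule_shift2 reserved bits dropped */
    offset=(granulepos-(base<<ki->granule_shift))>>ki->granule_shift2;
    return base*(kate_float)ki->gps_base_denominator/ki->gps_base_numerator
        +offset*(kate_float)ki->gps_offset_denominator/ki->gps_offset_numerator;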