[theora] Ogg index and Skeleton 4.0

Thu Apr 29 03:21:48 PDT 2010

> I think that discussion mostly happened on IRC around November last year.
>
> Initially I had a single index which merged all streams' keypoints. But in
> cases where you have sparse streams, like Kate streams, this can result in
> seeking back much more than is desirable; a web client may prefer to open a
> second connection to request that data instead. I switched to multiple
> separate indexes so that a client could potentially decide its own behaviour
> in such a situation.

Yes, that is my feeling too.

>> I assume that the streams with and without 'keyframes' are clearly
>> distinguished, correct?
>>
>
> A client which mapped serialnos to codecs could make this distinction. Why
> would you need to?

Just to have it in Skeleton somewhere so that you wouldn't necessarily
need a list of codec mappings.  There's the tension of the original
design (always ask the codec programmatically) and the Skeleton
(replicate the data when encoded so there's no need to ask the codec),
so it seemed sensible to have this declared in the skeleton as well.

>> It might be worth mentioning or explicitly addressing PTS and DTS
>>
>
> Sorry, I'm not sure what you mean by this.

What I mean is: is there an unstated assumption that PTS==DTS for keyframes?

>> Is the rational timebase required to have any relation to the timebase
>> of the granulepos itself?
>>
>
> Not explicitly. I imagine an indexer would do so in order to preserve
> accuracy though.

OK.  I'm not actually a fan of there being several ways to do things
and just leaving it up to the implementor to figure out.  Is there a
reason not to just use the stream timebase?

> For exact seeking you need to decode forwards from the lowest offset, not
> the earliest timestamp. If you don't decode from the lowest offset, you
> can't guarantee that you've started decoding before the start of all the
> data required to render the seek target on all streams.

I am thoroughly misunderstanding what 'the lowest offset' is then.  I
read that and interpreted it to mean 'look at all the syncpoints and
choose the latest one', i.e., the one closest to the desired time.  What
did you actually mean?
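
To make the two readings concrete, here is a minimal sketch in Python.
Everything in it is an assumption made for illustration: each stream's
index is modelled as a sorted list of (time_ms, byte_offset) pairs keyed
by serialno, which is not the Skeleton wire format, and the function
names and sample numbers are hypothetical.  Both readings first pick,
per stream, the last keypoint at or before the target; they differ only
in which of those picks they start decoding from.

```python
from bisect import bisect_right

def last_keypoint_at_or_before(keypoints, target_ms):
    """Return the last (time_ms, offset) with time_ms <= target_ms, or None."""
    times = [t for t, _ in keypoints]
    i = bisect_right(times, target_ms)
    return keypoints[i - 1] if i else None

def per_stream_picks(indexes, target_ms):
    """Map serialno -> chosen keypoint for every stream that has one."""
    picks = {}
    for serialno, keypoints in indexes.items():
        kp = last_keypoint_at_or_before(keypoints, target_ms)
        if kp is not None:
            picks[serialno] = kp
    return picks

def lowest_offset(indexes, target_ms):
    """Reading A: start at the smallest byte offset among the picks.

    Falling back to 0 (start of segment) when no stream has a keypoint
    is an assumption of this sketch.
    """
    picks = per_stream_picks(indexes, target_ms)
    return min((off for _, off in picks.values()), default=0)

def latest_keypoint_offset(indexes, target_ms):
    """Reading B: start at the pick whose timestamp is closest to the target."""
    picks = per_stream_picks(indexes, target_ms)
    if not picks:
        return 0
    return max(picks.values(), key=lambda kp: kp[0])[1]

# Hypothetical data: a video stream and a sparse text stream.
EXAMPLE = {
    1001: [(0, 1000), (2000, 50000), (4000, 100000)],  # video
    1002: [(0, 900), (3500, 90000)],                   # sparse text
}
```

With the sample data and a target of 3900 ms, reading A starts at byte
50000 (the video keypoint at 2000 ms) while reading B starts at byte
90000 (the text keypoint at 3500 ms) and would skip the video keyframe
needed to render the target.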

>>> The byte offsets stored in keypoints are relative to the start of the Ogg
>>> bitstream segment.
>>>
>>
>> Clarify: the beginning of the bitstream (e.g., 0 for the first segment)?
>>
>
> Beginning of bitstream, so 0 for the first segment, and from the start of
> each fisbone packet in each subsequent segment/link in a chained ogg.

This means you need to know how long the index will be before encoding
any offsets.  But the offsets are variable-length-encoded... How are
you predicting index size?
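
The circularity can be shown with a small sketch in Python.  It is only
an illustration under stated assumptions: a generic 7-bits-per-byte
variable-length integer, absolute (not delta-coded) offsets, and an
index that sits in front of all the data it points at.  None of this is
taken from the draft or from any existing indexer, and the function
names are hypothetical.  The idea is to guess the index size, shift the
offsets by that guess, re-measure the encoded size, and repeat until the
size stops changing.

```python
def varint_len(n):
    """Bytes needed to store non-negative n at 7 payload bits per byte."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length

def encoded_size(offsets, shift):
    """Total encoded size of the offsets once each is moved by `shift` bytes."""
    return sum(varint_len(offset + shift) for offset in offsets)

def solve_index_size(offsets_without_index):
    """Iterate to a size that is consistent with the offsets it encodes.

    The encoded size never shrinks as the shift grows, so starting from
    zero the guesses only increase, and they settle because the size
    grows far more slowly than the shift does.
    """
    size = 0
    while True:
        new_size = encoded_size(offsets_without_index, size)
        if new_size == size:
            return size
        size = new_size
```

For example, with data offsets [100, 126, 20000] measured as if there
were no index, the first pass gives 5 bytes; shifting by 5 pushes 126
past the one-byte limit, giving 6 bytes, and a further pass confirms 6.
An alternative that avoids iterating is to reserve a fixed-width or
padded index, at the cost of some wasted bytes.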

Monty