[theora-dev] further debugging of my ogg/theora decoder

Tue Nov 18 18:48:38 PST 2008

On Tue, Nov 18, 2008 at 1:06 PM, salsaman <salsaman at gmail.com> wrote:

>
>
> On Tue, Nov 18, 2008 at 12:47 PM, salsaman <salsaman at gmail.com> wrote:
>
>>
>>
>> On Tue, Nov 18, 2008 at 7:28 AM, Ralph Giles <giles at xiph.org> wrote:
>>
>>> On Tue, Nov 18, 2008 at 2:21 AM, salsaman <salsaman at gmail.com> wrote:
>>>
>>> > Right, that is what I would expect. I think I will write an ogg file
>>> > checker that checks all granulepos and keyframes to make sure the
>>> > problem is not in my test files. Some of the files were created with
>>> > older versions of the encoder. For example I have one file where I
>>> > need to subtract 2 from all keyframe values in order for it to decode
>>> > properly.
>>>
>>> Hmm. Some of the alpha encoder releases recorded the frame index, not
>>> the frame count. Such streams declare their version to be 3.2.0 instead
>>> of 3.2.1 as in the current theora spec.
>>>
>>>  -r
>>>
>>
>> Yes, I think the file
>> http://upload.wikimedia.org/wikipedia/commons/0/0c/Experience_ubuntu.ogg
>>
>> is like that (0 based keyframes).
>>
>> I am seeing other oddities in the file (I think) maybe somebody can
>> confirm or deny this:
>>
>> there is a granulepos 68516 in the file, which is (1070,1106) zero based.
>> If I begin decoding from that page, I get (2 pages) [68516, 68518] 1107 (?),
>> (2 pages) [68518, 70976] 1108 (?)  (keyframe) (2 pages) 1109 (?).
>>
>> However, the keyframe should be 1109 (zero based) according to later gpos.
>>
>> I think what is messing me up is that there are 2 granulepos in a row with
>> the same value. If I start from the second page with gpos of x, then I miss
>> a page (because I should be starting on the first occurrence of x).
>>
>> Surely the second page should have granulepos of -1 ? Is this another
>> known issue ?
>>
>> Is this something that ogg_page_packets() could solve ?
>>
>>
>>
>> Regards,
>> Gabriel.
>> http://lives.sourceforge.net
>>
>>
>>
>>
>
>
> Yes, in fact I can now answer that myself - checking for ogg_page_packets()
> > 0 does indeed provide a workaround for this particular problem.
>
> Now I need to know how to get the encoder version number from the stream.
> Anybody ?
>
>
>
> Gabriel.
> http://lives.sourceforge.net
>
>
>
>
>

OK, everything is working now! I decided not to bother getting the version
number. Instead, when the file is opened, I check what the first granulepos
is after a frame is returned. From this I can calculate the number of the
first keyframe (either 0 or 1).
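
For reference, here is a minimal sketch of that check. It assumes the usual
Theora granulepos layout: the keyframe number in the upper bits and the
number of frames since that keyframe in the lower "shift" bits, where shift
is the keyframe granule shift from the stream header. The function and type
names are made up for illustration and are not taken from the plugin.

```c
#include <stdint.h>

/* The two fields packed into a Theora granulepos. */
typedef struct {
    int64_t keyframe; /* number of the last keyframe (0- or 1-based) */
    int64_t offset;   /* frames decoded since that keyframe */
} gpos_parts;

/* Split a granulepos using the keyframe granule shift from the header. */
static gpos_parts split_granulepos(int64_t granulepos, int shift)
{
    gpos_parts p;
    p.keyframe = granulepos >> shift;
    p.offset = granulepos - (p.keyframe << shift);
    return p;
}

/* The first granulepos seen after a frame is returned carries the number
 * of the first keyframe: 0 for old (index based) streams, 1 for current
 * (count based) ones. */
static int64_t first_keyframe_number(int64_t first_granulepos, int shift)
{
    return split_granulepos(first_granulepos, shift).keyframe;
}

/* Zero-based frame number of a granulepos, given the keyframe_offset
 * found above. */
static int64_t frame_number(int64_t granulepos, int shift,
                            int64_t keyframe_offset)
{
    gpos_parts p = split_granulepos(granulepos, shift);
    return p.keyframe + p.offset - keyframe_offset;
}
```

With a shift of 6, granulepos 68516 from the earlier mail splits into
keyframe 1070 and offset 36, which gives frame 1106 in that file.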

My decoder looks to be working nicely, with slight adjustments to the method
I described.

Here is the procedure in full for future reference.

1) Parse stream headers to get data

2) Begin reading pages until we get at least one packet. Then check the
granulepos to find the first keyframe number (0 or 1). Store this as
keyframe_offset.

3) To get a target frame, first search through the file to find the largest
granulepos<=target, taking into account the keyframe_offset. From this we
get the keyframe of the target.

4) Subtract 1 from the keyframe, then repeat step 3).

5) Begin reading from the frame discovered in step 4. Drop any packets which
are output on the first page. Count down until we reach the keyframe,
dropping packets until then.

6) Continue counting down until we reach the target frame; from the keyframe
onwards we decode each frame/packet. At the target frame, produce the YUV
output.

7) If the next frame is > last frame, and <= keyframe_shift/2, continue
decoding until we reach the new target; otherwise go back to step 3, unless
we have the same keyframe (for example playing backwards) as previous, in
which case we go to step 5.
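
Steps 3 and 4 can be sketched roughly as below. This is only an
illustration: a sorted in-memory array of pages (the hypothetical page_info)
stands in for the real search through the file, and all names are invented
for the example. It assumes the same granulepos layout as the Theora spec
(keyframe number in the upper bits, frames since the keyframe in the lower
"shift" bits).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a page located while searching the file. */
typedef struct {
    int64_t granulepos; /* granulepos of a page completing >= 1 packet */
    int64_t pagepos;    /* byte offset of that page */
} page_info;

typedef struct {
    int64_t keyframe;     /* zero-based keyframe to decode from */
    ptrdiff_t start_page; /* page to start reading at, -1 = start of data */
} seek_plan;

/* Zero-based number of the last frame completed on a page. */
static int64_t gp_frame(int64_t gp, int shift, int64_t kf_offset)
{
    int64_t kf = gp >> shift;
    return kf + (gp - (kf << shift)) - kf_offset;
}

/* Index of the last page whose frame number is <= target, or -1. */
static ptrdiff_t last_page_at_or_before(const page_info *pages, size_t n,
                                        int64_t target, int shift,
                                        int64_t kf_offset)
{
    size_t lo = 0, hi = n; /* binary search over [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (gp_frame(pages[mid].granulepos, shift, kf_offset) <= target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (ptrdiff_t)lo - 1;
}

/* Step 3: find the keyframe for the target. Step 4: repeat the search for
 * keyframe - 1 to get the page from which reading should begin. */
static seek_plan plan_seek(const page_info *pages, size_t n, int64_t target,
                           int shift, int64_t kf_offset)
{
    seek_plan plan = { 0, -1 };
    ptrdiff_t i = last_page_at_or_before(pages, n, target, shift, kf_offset);
    if (i < 0)
        return plan; /* target lies before the first known page */
    plan.keyframe = (pages[i].granulepos >> shift) - kf_offset;
    plan.start_page = last_page_at_or_before(pages, n, plan.keyframe - 1,
                                             shift, kf_offset);
    return plan;
}
```

From start_page onwards, packets would be dropped until the keyframe is
reached and decoded from there up to the target, as in steps 5 and 6.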

As an additional optimisation, the plugin creates a linked list of
index_entries on the fly, updating it each time a new granulepos with at
least one packet is discovered. However, we only store one index_entry per
keyframe, that is, the highest granulepos we have found for a particular
keyframe.

The format of the index_entry is:

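typedef struct _index_entry index_entry;

struct _index_entry {
  index_entry *next;   /* next entry in the list */
  index_entry *prev;   /* previous entry in the list */
  int64_t granulepos;  /* highest granulepos found for this keyframe */
  int64_t pagepos;     /* position in the file of the page carrying it */
};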

The index entry list is used to find the keyframe for a frame (if it is in
our list), otherwise it provides lower and upper bounds for seeking in the
file.
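
A rough sketch of how such a list could be maintained and queried follows.
It is not the plugin's actual code: index_add, index_bounds and
entry_covers are hypothetical names, the list is assumed to be kept sorted
by keyframe, and frame numbers are in the stream's own granulepos numbering
(before keyframe_offset is subtracted).

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct _index_entry index_entry;
struct _index_entry {
    index_entry *next;
    index_entry *prev;
    int64_t granulepos;
    int64_t pagepos;
};

/* Add a granulepos to the list, keeping one entry per keyframe (the one
 * with the highest granulepos). Returns the possibly new list head. */
static index_entry *index_add(index_entry *head, int64_t granulepos,
                              int64_t pagepos, int shift)
{
    int64_t kf = granulepos >> shift;
    index_entry *prev = NULL, *cur = head;

    while (cur != NULL && (cur->granulepos >> shift) < kf) {
        prev = cur;
        cur = cur->next;
    }
    if (cur != NULL && (cur->granulepos >> shift) == kf) {
        if (granulepos > cur->granulepos) { /* keep only the highest */
            cur->granulepos = granulepos;
            cur->pagepos = pagepos;
        }
        return head;
    }
    index_entry *e = malloc(sizeof *e);
    if (e == NULL)
        return head; /* the index is optional, so just skip this entry */
    e->granulepos = granulepos;
    e->pagepos = pagepos;
    e->prev = prev;
    e->next = cur;
    if (cur != NULL)
        cur->prev = e;
    if (prev != NULL)
        prev->next = e;
    else
        head = e;
    return head;
}

/* Nonzero if the frame lies between the entry's keyframe and the last
 * frame recorded for that keyframe. */
static int entry_covers(const index_entry *e, int64_t frame, int shift)
{
    int64_t kf = e->granulepos >> shift;
    int64_t last = kf + (e->granulepos - (kf << shift));
    return frame >= kf && frame <= last;
}

/* Find the entries on either side of a frame: *lower is the last entry
 * whose keyframe is <= frame, *upper the first whose keyframe is > frame.
 * Either may be NULL. */
static void index_bounds(const index_entry *head, int64_t frame, int shift,
                         const index_entry **lower,
                         const index_entry **upper)
{
    *lower = NULL;
    *upper = NULL;
    for (const index_entry *e = head; e != NULL; e = e->next) {
        if ((e->granulepos >> shift) <= frame) {
            *lower = e;
        } else {
            *upper = e;
            break;
        }
    }
}
```

If entry_covers() is true for the lower entry, its keyframe is the keyframe
of the frame; otherwise the pagepos values of the lower and upper entries
limit the part of the file that has to be searched.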

On a 64 bit system, each index_entry takes 256 bits (32 bytes) per keyframe
(roughly 4 bits per frame, assuming a keyframe every 64 frames); on a 32 bit
system it takes 192 bits (24 bytes) per keyframe (roughly 3 bits per frame).
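
The entry size can be checked on a given platform with sizeof. The sketch
below is only an illustration with made-up helper names; on common ABIs it
reports 32 bytes (256 bits) with 64 bit pointers and 24 bytes (192 bits)
with 32 bit pointers. The keyframe spacing passed to
index_bits_per_frame() is an assumption supplied by the caller.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

typedef struct _index_entry index_entry;
struct _index_entry {
    index_entry *next;
    index_entry *prev;
    int64_t granulepos;
    int64_t pagepos;
};

/* Size of one index entry in bits on the platform being compiled for. */
static size_t index_entry_bits(void)
{
    return sizeof(index_entry) * CHAR_BIT;
}

/* Average index cost per frame, in bits, for a given keyframe spacing. */
static double index_bits_per_frame(unsigned frames_per_keyframe)
{
    return (double)index_entry_bits() / frames_per_keyframe;
}
```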

As noted previously, the plugin has a very simple API:
- a function to get the plugin version string
- a function to return data for the URI (or NULL if the URI has no theora
bitstream)
- a function to get a given frame from a given URI

Some of this code might be useful for liboggz or other uses. As ever, the
source code is available at:

http://lives.cvs.sourceforge.net/viewvc/lives/lives-plugins/plugins/decoders/ogg_theora_decoder.c

Regards,
Gabriel.
http://lives.sourceforge.net