[theora-dev] Various typo in spec

Jonathan Fabrizio jonathan.fabrizio at lrde.epita.fr
Fri Mar 25 03:00:14 PDT 2011


On 24/03/2011 22:14, Timothy B. Terriberry wrote:
>> I have a few questions about the specification.
>>
>> * Just to understand:
>> - What is the purpose of super blocks? Is it to save space when
>> recording coded block flags (7.3)?
> That is basically their only purpose, though they're also the level at
> which the Hilbert traversal that makes up "coded order" is defined.
>
>> - What is the advantage of using coded order? Raster order is often
>> easier (especially for finding neighbors in 7.8.1, for
>> example). Is it to simplify the correspondence between blocks, macro
>> blocks and super blocks?
> The theory was supposed to be something like: every block is an
> immediate neighbor of the preceding block in "coded order", except when
> you reach the end of a super-block row (i.e., the traversal has a high
> fractal dimension). This was actually an idea I'd had before I ever even
> heard of VP3, but I rejected it for being too complex to be worth it
> (although I was thinking of a traversal of the entire image, rather than
> restricting it to four rows at a time... that turns out to be a lot more
> complicated, though, especially once you start trying to handle
> non-square sizes where the number of blocks is not a power of two).
>
> Theory aside, coded order mostly adds a lot of bookkeeping overhead to
> the code, and makes some steps of encoder optimization _really_ _hard_.
> One of my biggest pet peeves with the format is that DC prediction is
> not done in coded order. I doubt it really has a big influence on
> compression performance: even in raster order, everything except the
> last block in a row is an immediate neighbor of its predecessor, though
> it's always a horizontal neighbor. It might be appreciably worse if
> there's strong vertical but no horizontal correlation... though it will
> perform better if there's strong horizontal but no vertical correlation.
> On average I expect it's pretty close to a wash.
>
> On2 dropped the Hilbert curve idea in later VPx formats. We could have
> done so ourselves when moving from VP3 to Theora, but back in the days
> when those decisions were made, trivial lossless VP3->Theora transcoding
> (basically, just fixing up a few header bits) was seen as a desirable
> feature, both for easy access to content and for IPR safety reasons.
Thank you for all these explanations.
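
To illustrate the neighbor property described above, here is a minimal
Python sketch. It uses the textbook Hilbert-curve index-to-coordinate
construction on a 4x4 grid of blocks. That is an assumption: the
orientation may not match the exact block order drawn in the spec, and
all function names are invented for this example. It counts how often
two consecutive blocks are not immediate neighbors, first for the Hilbert
order and then for raster order.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. This is the generic textbook construction,
    not necessarily the orientation used by the Theora spec.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_order(n):
    """Block coordinates of an n x n super block in Hilbert order."""
    return [hilbert_d2xy(n, d) for d in range(n * n)]


def raster_order(w, h):
    """Block coordinates of a w x h region in raster order."""
    return [(x, y) for y in range(h) for x in range(w)]


def count_breaks(order):
    """Count consecutive pairs that are NOT immediate (4-connected) neighbors."""
    return sum(
        1
        for (x0, y0), (x1, y1) in zip(order, order[1:])
        if abs(x1 - x0) + abs(y1 - y0) != 1
    )


print(count_breaks(hilbert_order(4)))    # Hilbert order inside one super block
print(count_breaks(raster_order(4, 4)))  # raster order over the same blocks
```

Within a single 4x4 super block the Hilbert order has no breaks, while
raster order breaks once at the end of each row except the last.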

>> “Each component can take on integer values from −31 . . . 31, inclusive, at
>> half-pixel resolution, i.e. −15.5 . . . 15.5 pixels in the luma plane.
>> For each sub-sampled axis in the chroma planes, the corresponding motion
>> vector component is interpreted as being at quarter-pixel resolution,
>> i.e. −7.75 . . . 7.75 pixels. The
> I agree the wording here is not very clear.
>
I think this explanation is clear enough; I understood it immediately.
The text could still be improved, though. I suggest:

"Each component can take on integer values from −31 . . . 31, inclusive. 
Each motion vector component is interpreted as being at half-pixel
resolution, i.e. −15.5 . . . 15.5 pixels, except along each sub-sampled
axis in the chroma planes, where it is interpreted as being at
quarter-pixel resolution, i.e. −7.75 . . . 7.75 pixels."
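
As a small worked example of the ranges quoted above, the following
Python sketch converts a decoded integer component into a displacement in
pixels. The function name and the boolean flag are hypothetical and only
serve the illustration.

```python
def component_to_pixels(component, subsampled):
    """Interpret a decoded motion vector component as a pixel displacement.

    component:  decoded integer in -31..31
    subsampled: True if the axis is sub-sampled in this plane
                (quarter-pixel units), False otherwise (half-pixel units)
    """
    if not -31 <= component <= 31:
        raise ValueError("component must be in -31..31")
    return component / (4.0 if subsampled else 2.0)


# Extremes of the range on a non-sub-sampled axis (e.g. luma).
print(component_to_pixels(-31, False), component_to_pixels(31, False))
# Extremes of the range on a sub-sampled chroma axis.
print(component_to_pixels(-31, True), component_to_pixels(31, True))
```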

I think the real issue is in the algorithm: it does not do what the text
says, because the decoded integer value is used directly.
Maybe 7.9.4 is missing something like this:
2.(d).(vi).
    B.
     if the x axis of the plane that contains bi is sub-sampled then dx = 4.0
     else dx = 2.0
     if the y axis of the plane that contains bi is sub-sampled then dy = 4.0
     else dy = 2.0

     C. assign MVX the value
     floor(abs(MVECTS[bi]x / dx)) * sign(MVECTS[bi]x)

     D. assign MVY the value
     floor(abs(MVECTS[bi]y / dy)) * sign(MVECTS[bi]y)

     E. assign MVX2 the value
     ceil(abs(MVECTS[bi]x / dx)) * sign(MVECTS[bi]x)


     F. assign MVY2 the value
     ceil(abs(MVECTS[bi]y / dy)) * sign(MVECTS[bi]y)
(...)
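
The proposed steps B to F can be sketched in Python as follows. This only
mirrors the suggestion made in this message and is not the wording of the
specification. The function names and the `subsampled` flag are
hypothetical.

```python
import math


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def split_component(mv, subsampled):
    """Split one motion vector component into two whole-pixel offsets.

    mv:         decoded integer component (MVECTS[bi]x or MVECTS[bi]y)
    subsampled: True if this axis is sub-sampled in the block's plane

    Returns (toward_zero, away_from_zero), i.e. the values the proposal
    assigns to MVX/MVY and MVX2/MVY2 respectively.
    """
    d = 4.0 if subsampled else 2.0  # step B: divisor dx or dy
    magnitude = abs(mv / d)
    s = sign(mv)
    first = math.floor(magnitude) * s   # steps C and D
    second = math.ceil(magnitude) * s   # steps E and F
    return first, second


# Half-pixel axis: -31 means -15.5 pixels, between offsets -15 and -16.
print(split_component(-31, False))
# Quarter-pixel axis: 31 means 7.75 pixels, between offsets 7 and 8.
print(split_component(31, True))
```

When the component is a whole number of pixels, both offsets coincide;
for example `split_component(4, False)` gives the same value twice.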

The confusion (for me) comes from the difference between the text and
the algorithm. The text alone is understandable (even if it is not perfect).


Thank you for all the explanations.
J.F.

