[daala] Maximizing compression efficiency through block rotation

Fri Jan 2 05:08:31 PST 2015

David ledin wrote:
> It consists of rotating each block, before applying the DCT, by an
> angle that aligns the edges, and rotating back the block in the
> decompression stage. We show how to compute the rotation angle and

Sadly, these approaches aren't that helpful for Daala because they do 
not play well with lapped transforms. [1] contains some notes from Monty 
on early-stage research on a method that does.

Some notes on the paper:
1) You might want to consider something like Vorbis's square-polar 
mapping as a method of rotation that does not create any expansion, does 
not require filling in "missing" values, and is perfectly reversible (it 
can easily be generalized to arbitrary angles). Monty played around with 
this a bit for Daala before being convinced it couldn't work with lapping.

2) Your experiments are over-optimistic because, although you have 
compacted more energy into fewer coefficients, those 
coefficients will be larger and thus more expensive to code. At least 
some simple zero-order entropy results would be needed to make them more 
believable. Actual gains will be smaller than you report.

3) Although you eschew using a full compression chain like JPEG, I think 
leaving it out makes it harder, not easier, to draw conclusions about 
the method. Approaches like the DA-PBT have shown that considering 
things like coefficient scan order has an impact as large as the 
transform itself [2]. Leaving out that analysis makes it unclear whether 
or not your approach is capable of achieving similar gains in those stages.

4) At typical video bitrates, the cost of coding the angles becomes 
quite significant. You allow up to 90 possibilities, which probably 
means north of 6 bits/block even with good entropy coding. We found that 
a number of the intra prediction schemes we tried became completely 
ineffective once we accounted for the cost of coding an intra mode, 
even when that cost was in the range of 2...3 bits/block. You simply 
can't ignore this cost.

5) W.r.t. [2], Chang also showed in his PhD thesis that a very large 
portion of the gain for directional transforms can be achieved by the 
plain DCT just by having an adaptive block size (which reduces the 
number of significant coefficients around an edge by simply having 
smaller basis functions near the edge). Such approaches are, of course, 
conceptually and computationally much simpler. Given that, any 
comparison of a directional approach to a non-directional one with a 
_fixed_ block size seems almost disingenuous. The two tools are not 
independent, and you can't consider them separately.
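
As a rough illustration of the accounting that points 2) and 4) ask for, 
here is a minimal Python sketch. It is not the paper's method or anything 
from Daala: the function names are hypothetical, the coefficient values 
are made up, and the angle is assumed to be uniformly distributed over 
the 90 possibilities, so its cost is simply log2(90) bits/block.

```python
import math
from collections import Counter


def zero_order_entropy_bits(symbols):
    """Total bits an ideal zero-order (memoryless) coder spends on `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum(c * math.log2(total / c) for c in counts.values())


def angle_side_bits(num_angles):
    """Bits per block to signal one of `num_angles` angles, assumed equiprobable."""
    return math.log2(num_angles)


# Hypothetical quantized coefficients for one 4x4 block, made up for
# illustration; they are not taken from the paper.
plain = [9, 3, -2, 1, 2, -1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
rotated = [12, 5, 0, 0, -3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

coeff_saving = zero_order_entropy_bits(plain) - zero_order_entropy_bits(rotated)
# Up to 90 angle possibilities, per point 4); uniform distribution assumed.
net_saving = coeff_saving - angle_side_bits(90)

print(f"nonzero coefficients: {sum(v != 0 for v in plain)} -> "
      f"{sum(v != 0 for v in rotated)}")
print(f"coefficient bits saved: {coeff_saving:.2f}")
print(f"net bits saved after coding the angle: {net_saving:.2f}")
```

A real measurement would build the symbol histogram over many blocks of 
many images rather than a single block, since a per-block histogram 
ignores the cost of conveying the probability model. The point of the 
sketch is only that the figure to report is bits, net of the angle side 
information, rather than the count of significant coefficients.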

[1] https://people.xiph.org/~xiphmont/demo/daala/aside/edge-filter.html
[2] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4711712