[daala] Maximizing compression efficiency through block rotation

Timothy B. Terriberry tterribe at xiph.org
Fri Jan 2 05:08:31 PST 2015

David ledin wrote:
> It consists of rotating each block, before applying the DCT, by an
> angle that aligns the edges, and rotating back the block in the
> decompression stage. We show how to compute the rotation angle and

not play well with lapped transforms. [1] contains some notes from Monty
on early-stage research on a method that does.

Some notes on the paper:
1) You might want to consider something like Vorbis's square-polar
mapping as a method of rotation that does not create any expansion, does
not require filling in "missing" values, and is perfectly reversible (it
can easily be generalized to arbitrary angles). Monty played around with
this a bit for Daala before being convinced it couldn't work with lapping.

2) Your experiments are over-optimistic because, despite the fact that
you have compacted more energy into fewer coefficients, those
coefficients will be larger and thus more expensive to code. At least
some simple zero-order entropy results would be needed to make them more
believable. Actual gains will be smaller than you report.

3) Although you eschew using a full compression chain like JPEG, I think
leaving it out makes it harder, not easier, to draw conclusions about
the method. Approaches like the DA-PBT have shown that things like
coefficient scan order have an impact as large as the transform
itself [2]. Leaving out that analysis makes it unclear whether or not
your approach is capable of achieving similar gains in those stages.

4) At typical video bitrates, the cost of coding the angles becomes
quite significant. You allow up to 90 possibilities, which probably
means north of 6 bits/block even with good entropy coding. We found that
a number of the intra prediction schemes we tried became completely
ineffective once we accounted for the cost of coding an intra mode,
even when that cost was in the range of 2 to 3 bits/block. You simply
can't ignore this cost.

5) W.r.t. [2], Chang also showed in his PhD thesis that a very large
portion of the gain for directional transforms can be achieved by the
plain DCT just by having an adaptive block size (which reduces the
number of significant coefficients around an edge by simply having
smaller basis functions near the edge). Such approaches are, of course,
conceptually and computationally much simpler. Given that, any
comparison of a directional approach to a non-directional one with a
_fixed_ block size seems almost disingenuous. The two tools are not
independent, and you can't consider them separately.
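
To make points 2 and 4 a bit more concrete, here is a rough sketch of the
kind of sanity check I have in mind: pool the quantized coefficients, take
their empirical zero-order entropy, and add the per-block cost of signaling
the angle. Everything in it is an assumption for illustration: the function
names, the uniform quantizer, the toy blocks, and especially the uniform
log2(num_angles) cost for the angle (a real entropy coder with good
modeling could do somewhat better). It is not what the paper or Daala does.

```python
import math
from collections import Counter


def zero_order_entropy(symbols):
    """Empirical zero-order entropy of a symbol list, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def quantize(coeffs, step):
    """Uniform scalar quantizer (hypothetical; Python's round() sends ties to even)."""
    return [int(round(c / step)) for c in coeffs]


def total_bits(blocks, step, num_angles=1):
    """Estimate of coefficient bits plus angle side information.

    Coefficient bits: zero-order entropy of all quantized coefficients,
    pooled over every block, times the number of coefficients.
    Angle bits: log2(num_angles) per block, i.e. assuming every angle is
    equally likely (an assumption; num_angles=1 means no angle is coded).
    """
    symbols = [q for block in blocks for q in quantize(block, step)]
    coeff_bits = zero_order_entropy(symbols) * len(symbols)
    angle_bits = len(blocks) * math.log2(num_angles)
    return coeff_bits + angle_bits


# Two toy 8-coefficient blocks with the same energy (64): one spread over
# four coefficients, one compacted into a single, larger coefficient.
spread = [4, 4, 4, 4, 0, 0, 0, 0]
compacted = [8, 0, 0, 0, 0, 0, 0, 0]

bits_plain = total_bits([spread], step=1.0)
bits_rotated_free = total_bits([compacted], step=1.0)
bits_rotated = total_bits([compacted], step=1.0, num_angles=90)
```

With these made-up numbers the compacted block looks cheaper until the
angle is charged for, at which point it costs more than the plain one. A
single toy block proves nothing about real images, of course, but this is
the comparison the results would need to survive.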

[1] https://people.xiph.org/~xiphmont/demo/daala/aside/edge-filter.html
[2] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4711712