[daala] Curious about progress of codec

BGB cr88192 at gmail.com
Sun Apr 24 19:52:00 UTC 2016


(jumping in; I am not directly involved with this project, but I have 
my own video codecs).


On 4/24/2016 12:21 PM, Jean-Marc Valin wrote:
> On 04/24/2016 10:43 AM, Jarek Duda wrote:
>> 1) Start with a probability distribution characteristic for a given ID,
>> for example a fixed parametric, eventually modified somewhere in the
>> file (better behaves in the beginning and allows for more subtle
>> adaptation),
> Actually, all attempts at doing this so far have failed to produce
> improvement over the current "horrible" flat initialization. So I
> actually measured how much we lose to that flat initialization in one of
> the worst cases you can have: a binary symbol where one value has
> probability 1 and the other has probability 0. In that case, despite
> initializing with a flat CDF, the total cost of coding N symbols is only
> 1 + 1/4*log2(N). So it costs (e.g.) 3 bits to code 256 symbols.
> Considering that it avoids having to code an initial probability, that
> overhead is pretty reasonable.

my general experience agrees.

though, generally, I am using much tackier technology, but it works.
similarly, my best general results came from picking some "sane 
defaults" for the initial state and going from there.


one general design for an adaptive entropy coder I have had some success 
with (rough sketches of the stages follow below):
   symbols are fed through a cheap approximation of MTF (SMTF), mapping 
symbols to indices;
   these indices are encoded using an Adaptive Rice coder (potentially 
with length-limiting);
   the bits from the AdRice step can optionally be fed through a bitwise 
range coder. this gains some compression, for a bit of a speed hit, so I 
rarely use the range coder.
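
roughly, the AdRice stage works something like this (a simplified 
sketch, not my actual code; bit output is left abstract, and the 
length-limiting / escape handling is omitted):

    typedef struct { int k; } AdRice;  /* k = current Rice parameter */

    static void adrice_encode(AdRice *ar, int idx, void (*put_bit)(int))
    {
        int i, q = idx >> ar->k;
        for (i = 0; i < q; i++)           /* quotient, unary */
            put_bit(1);
        put_bit(0);                       /* terminator */
        for (i = ar->k - 1; i >= 0; i--)  /* remainder, k bits */
            put_bit((idx >> i) & 1);
        if (q == 0) { if (ar->k > 0) ar->k--; } /* short code: shrink k */
        else if (q > 1) ar->k++;                /* long code: grow k */
    }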

while it may not sound super impressive, it can get results (in terms of 
both speed and compression) reasonably similar to using static Huffman, 
or static Huffman + a bitwise range coder (though, admittedly, the speed 
here is more on par with a crappier table-driven static Huffman, not 
with Zhuff or similar).

as with other adaptive codings, there is no need to transmit a Huffman 
table or similar.
for some payloads, SMTF+AdRice actually compresses better than Huffman 
as well, though Huffman still tends to win on average.


note that the AdRice coder is (in the raw bitstream case) implemented 
very similarly to a table-driven Huffman coder.
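
for example, the decoder can use an 8-bit lookup window much like 
table-driven Huffman. a sketch (entries pack value and code length, 
with 0 meaning "fall back to a slow path"):

    /* build a decode table for Rice parameter k; indexed by the next
       8 input bits (MSB first), entry = (value << 4) | code_length */
    static void adrice_build_table(int k, unsigned short tab[256])
    {
        int b, q, len, val;
        for (b = 0; b < 256; b++) {
            for (q = 0; q < 8 && (b & (0x80 >> q)); q++); /* leading 1s */
            len = q + 1 + k;           /* unary + stop bit + remainder */
            if (q >= 8 || len > 8) { tab[b] = 0; continue; }
            val = (q << k) | ((b >> (8 - len)) & ((1 << k) - 1));
            tab[b] = (unsigned short)((val << 4) | len);
        }
    }

the decoder peeks 8 bits, looks up the entry, and consumes code_length 
bits; when k adapts, it just switches to the table for the new k.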

the MTF approximation mostly boils down to swapping elements in a symbol 
table and occasionally rotating the table (via masking and an offset). 
this part is basically shimmed onto the Rice encode/decode step. unlike 
a true MTF, there is no need to use a loop to shift table contents around.
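
something like the following (again a simplified sketch, assuming 256 
symbols; the decoder maps index back to symbol via sym[(base+i)&255] 
and applies the same update):

    typedef struct {
        unsigned char sym[256];  /* slot -> symbol (ring buffer) */
        unsigned char slot[256]; /* symbol -> slot (inverse map) */
        unsigned char base;      /* slot holding the logical front */
    } SMTF;

    static void smtf_init(SMTF *m)
    {
        int i;
        for (i = 0; i < 256; i++)
            m->sym[i] = m->slot[i] = (unsigned char)i;
        m->base = 0;
    }

    /* map symbol to its current index, then update the table */
    static int smtf_encode(SMTF *m, int s)
    {
        int i = (m->slot[s] - m->base) & 255; /* logical index of s */
        int a, b, t;
        if (i == 0) return 0;
        if (i < 16) {
            a = (m->base + i - 1) & 255;   /* swap one step forward */
        } else {
            m->base = (m->base - 1) & 255; /* rotate front back a slot */
            a = m->base;                   /* s lands at the new front */
        }
        b = m->slot[s]; t = m->sym[a];
        m->sym[a] = (unsigned char)s; m->slot[s] = (unsigned char)a;
        m->sym[b] = (unsigned char)t; m->slot[t] = (unsigned char)b;
        return i;
    }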

likewise, the entropy-coding step can be made mostly branch-free (or 
fully branch free with big enough tables).
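
e.g., the fast path of a decode step using the table sketched earlier 
can be written without branches (the len==0 slow-path check is the part 
that still wants a branch, or bigger tables):

    /* 'win' holds upcoming input bits left-justified in 32 bits */
    static int adrice_decode_step(unsigned int *win, int *fill,
                                  const unsigned short *tab)
    {
        unsigned short e = tab[*win >> 24]; /* peek next 8 bits */
        int len = e & 15;                   /* len==0 -> slow path */
        *win <<= len;                       /* consume the code */
        *fill -= len;
        return e >> 4;                      /* decoded index */
    }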


well, never mind if the range coder is used, which basically craps all 
over everything (then one is back to feeding individual bits through the 
range coder, though at least Rice coding maps onto a bitwise range coder 
a little more cleanly than Huffman does). but, at least, the front-end 
entropy coder reduces the cost of the range coder (by feeding fewer bits 
through it).
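
for reference, by "bitwise range coder" I mean the usual sort of thing. 
a sketch (this one borrows LZMA-style carry handling; the 12-bit 
adaptive probabilities are illustrative, not necessarily what I use):

    #include <stdint.h>
    #include <stdio.h>

    typedef struct {
        uint64_t low;
        uint32_t range;
        uint8_t  cache;     /* last byte held back for carries */
        int64_t  pending;   /* run of 0xFF bytes awaiting a carry */
        FILE    *out;
    } RC;

    static void rc_init(RC *rc, FILE *out)
    {
        rc->low = 0; rc->range = 0xFFFFFFFFu;
        rc->cache = 0; rc->pending = 1; rc->out = out;
    }

    /* emit the top byte of 'low', propagating any carry */
    static void rc_shift_low(RC *rc)
    {
        if ((uint32_t)rc->low < 0xFF000000u || (rc->low >> 32) != 0) {
            uint8_t carry = (uint8_t)(rc->low >> 32);
            do {
                fputc((uint8_t)(rc->cache + carry), rc->out);
                rc->cache = 0xFF;
            } while (--rc->pending != 0);
            rc->cache = (uint8_t)(rc->low >> 24);
        }
        rc->pending++;
        rc->low = (uint32_t)rc->low << 8;
    }

    /* encode one bit; *p is P(bit==0) in 12-bit fixed point */
    static void rc_encode_bit(RC *rc, uint16_t *p, int bit)
    {
        uint32_t bound = (rc->range >> 12) * (*p);
        if (!bit) {
            rc->range = bound;
            *p += (uint16_t)((4096 - *p) >> 5); /* adapt toward 0 */
        } else {
            rc->low += bound; rc->range -= bound;
            *p -= (uint16_t)(*p >> 5);          /* adapt toward 1 */
        }
        while (rc->range < (1u << 24)) { /* renormalize */
            rc->range <<= 8;
            rc_shift_low(rc);
        }
    }

    static void rc_flush(RC *rc)
    {
        int i;
        for (i = 0; i < 5; i++) rc_shift_low(rc);
    }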


though, yes, potentially this is all still pretty tacky.

I have doubts it would be directly applicable to Daala, as my codecs 
have different design goals (typically focusing more on speed than on 
Q/bpp, like trying to encode 4K in real-time on the CPU and similar...).



