[daala] Curious about progress of codec
BGB
cr88192 at gmail.com
Sun Apr 24 19:52:00 UTC 2016
(Jumping in; I'm not directly involved with this project, but I have my
own video codecs.)
On 4/24/2016 12:21 PM, Jean-Marc Valin wrote:
> On 04/24/2016 10:43 AM, Jarek Duda wrote:
>> 1) Start with a probability distribution characteristic for a given ID,
>> for example a fixed parametric, eventually modified somewhere in the
>> file (better behaves in the beginning and allows for more subtle
>> adaptation),
> Actually, all attempts at doing this so far have failed to produce
> improvement over the current "horrible" flat initialization. So I
> actually measured how much we lose to that flat initialization in one of
> the worst cases you can have: a binary symbol where one value has
> probability 1 and the other has probability 0. In that case, despite
> initializing with a flat CDF, the total cost of coding N symbols is only
> 1 + 1/4*log2(N). So it costs (e.g.) 3 bits to code 256 symbols.
> Considering that it avoids having to code an initial probability, that
> overhead is pretty reasonable.
My general experience agrees.
Granted, I generally use much tackier technology, but it works.
Similarly, my best results came from picking some "sane defaults" for
the initial state and going from there.
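As a quick sanity check on the quoted numbers, a toy adaptive binary
model with a flat start shows the same logarithmic growth of the
initialization overhead. Note this Laplace (add-one count) estimator is
a stand-in, not Daala's actual adaptation rule, so the constant differs
(log2(N+1) here vs. the quoted 1 + 1/4*log2(N)):

```python
import math

def flat_init_cost(symbols):
    """Total cost (in bits) of coding a binary stream with an adaptive
    model that starts from flat counts (1, 1) and bumps the coded
    symbol's count after each step (Laplace / add-one estimator).
    NOTE: a toy stand-in, not Daala's actual adaptation rule."""
    counts = [1, 1]
    bits = 0.0
    for s in symbols:
        p = counts[s] / (counts[0] + counts[1])
        bits += -math.log2(p)
        counts[s] += 1
    return bits

# Worst case from the quoted post: one value has probability 1,
# so every bit spent is pure initialization overhead.
cost = flat_init_cost([1] * 256)
# for this estimator the sum telescopes to exactly log2(N + 1)
print(cost)  # ~8.006 bits for 256 symbols
```

The telescoping is easy to see: at step k the coded symbol has
probability k/(k+1), so the total is sum of log2((k+1)/k) = log2(N+1).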
One general design for an adaptive entropy coder I have had some success
with:
- symbols are fed through a cheap approximation of MTF ("SMTF"), mapping
  symbols to indices;
- these indices are encoded using an adaptive Rice (AdRice) coder,
  potentially with length-limiting;
- the bits from the AdRice step can optionally be fed through a bitwise
  range coder.
The range coder gains some compression at a bit of a speed hit, so I
rarely use it.
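For concreteness, a rough sketch of that front end follows. The
specific swap rule and k-adaptation below are illustrative guesses at
this kind of scheme, not the actual code:

```python
# Sketch of an SMTF + adaptive-Rice front end. The swap-halfway rule
# and the k-adaptation heuristic are illustrative, not the real thing.

def rice_len(v, k):
    """Bit length of the Rice code for v with parameter k:
    unary quotient + stop bit + k remainder bits (no length limit)."""
    return (v >> k) + 1 + k

class SmtfRice:
    def __init__(self, nsyms):
        self.table = list(range(nsyms))      # rank -> symbol
        self.pos = {s: s for s in self.table}  # symbol -> rank
        self.k = 2                           # current Rice parameter

    def encode_len(self, sym):
        """Return the coded length of sym, updating table and k."""
        idx = self.pos[sym]
        bits = rice_len(idx, self.k)
        # cheap MTF approximation: swap with the entry halfway to front
        j = idx >> 1
        other = self.table[j]
        self.table[j], self.table[idx] = sym, other
        self.pos[sym], self.pos[other] = j, idx
        # crude k adaptation: nudge k toward the magnitude of idx
        if (idx >> self.k) > 1 and self.k < 15:
            self.k += 1
        elif idx < (1 << self.k) // 2 and self.k > 0:
            self.k -= 1
        return bits
```

A skewed symbol stream drives frequent symbols toward low indices, so
the Rice codes stay short without any transmitted table.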
While it doesn't sound all that impressive, it can get results (in
terms of both speed and compression) reasonably similar to static
Huffman, or to static Huffman plus a bitwise range coder (though,
admittedly, its speed is more on par with a crappier table-driven
static Huffman than with something like Zhuff).
As with other adaptive coders, there is no need to transmit a Huffman
table or similar.
For some payloads, SMTF+AdRice actually compresses better than Huffman
as well, though Huffman still tends to win on average.
Note that the AdRice coder is (in the raw-bitstream case) implemented
very similarly to a table-driven Huffman coder.
The MTF approximation mostly boils down to swapping elements in a symbol
table and occasionally rotating the table (via masking and an offset);
this part is basically shimmed onto the Rice encode/decode step. Unlike
a true MTF, there is no need for a loop shifting table contents around.
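One way to get that shift-free behavior is to keep the table in a ring
buffer and treat "front" as a moving offset; moving a symbol to the
front then rotates the head back one slot and swaps, instead of
shifting every entry. The sketch below is a guess at that kind of
scheme, not the actual code:

```python
# Hedged sketch of a loop-free MTF approximation: the symbol table is a
# power-of-two ring buffer, "front" is an offset, and move-to-front is
# one rotate plus one swap. Illustrative, not the real implementation.

class RingMtf:
    def __init__(self, bits=8):
        self.mask = (1 << bits) - 1
        self.table = list(range(1 << bits))     # slot -> symbol
        self.slot = {s: s for s in self.table}  # symbol -> slot
        self.head = 0                           # slot holding rank 0

    def rank(self, sym):
        """Current MTF index of sym (masked offset from the head)."""
        return (self.slot[sym] - self.head) & self.mask

    def touch(self, sym):
        """Move sym to rank 0 by rotating the head and swapping; the
        displaced symbol lands in sym's old slot (so this is only an
        approximation of true MTF)."""
        if self.rank(sym) == 0:
            return
        old = self.slot[sym]
        self.head = (self.head - 1) & self.mask  # rotate the table
        displaced = self.table[self.head]
        self.table[self.head], self.table[old] = sym, displaced
        self.slot[sym], self.slot[displaced] = self.head, old
```

The cost per update is O(1) regardless of where the symbol sat, which
is what lets this be shimmed onto the Rice encode/decode step cheaply.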
Likewise, the entropy-coding step can be made mostly branch-free (or
fully branch-free with big enough tables).
Well, never mind all that if the range coder is used, which basically
craps all over everything (one is back to feeding individual bits
through the range coder, though at least Rice coding maps onto a
bitwise range coder a little more cleanly than Huffman does). But at
least the front-end entropy coder reduces the cost of the range coder
by feeding fewer bits through it.
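By "bitwise range coder" I mean the usual shift-based kind; a toy
version with a single adaptive context is sketched below. The names,
32-bit range width, and adaptation rate are my illustrative choices
(the big-int `low` is a Python shortcut that makes carry propagation
free), not the coder from any particular codec:

```python
# Toy bitwise range coder with one adaptive binary context.
# Illustrative sketch; not the actual coder discussed above.

TOP = 1 << 24         # renormalization threshold
PROB_BITS = 12
ONE = 1 << PROB_BITS  # probability scale; ctx[0] = P(bit == 0) * ONE

class RangeEncoder:
    def __init__(self):
        self.low = 0             # big int: carries propagate for free
        self.range = 0xFFFFFFFF
        self.shifts = 0

    def encode(self, bit, ctx):
        bound = (self.range >> PROB_BITS) * ctx[0]
        if bit == 0:
            self.range = bound
            ctx[0] += (ONE - ctx[0]) >> 5   # adapt toward 0
        else:
            self.low += bound
            self.range -= bound
            ctx[0] -= ctx[0] >> 5           # adapt toward 1
        while self.range < TOP:             # renormalize
            self.range <<= 8
            self.low <<= 8
            self.shifts += 1

    def finish(self):
        # emit a value inside [low, low + range) as big-endian bytes
        return (self.low + self.range // 2).to_bytes(self.shifts + 4, "big")

class RangeDecoder:
    def __init__(self, data):
        self.data, self.pos = data, 4
        self.code = int.from_bytes(data[:4], "big")
        self.range = 0xFFFFFFFF

    def decode(self, ctx):
        bound = (self.range >> PROB_BITS) * ctx[0]
        if self.code < bound:
            bit, self.range = 0, bound
            ctx[0] += (ONE - ctx[0]) >> 5
        else:
            bit = 1
            self.code -= bound
            self.range -= bound
            ctx[0] -= ctx[0] >> 5
        while self.range < TOP:             # mirror the encoder renorm
            self.range <<= 8
            nxt = self.data[self.pos] if self.pos < len(self.data) else 0
            self.code = (self.code << 8) | nxt
            self.pos += 1
        return bit
```

The per-bit cost here (a multiply, compare, and renorm loop) is exactly
the overhead that makes me skip this stage most of the time.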
Though, yes, potentially this is all still pretty tacky.
I doubt it would be directly applicable to Daala, as my codecs have
different design goals (typically focusing more on speed than on Q/bpp,
e.g. trying to encode 4K in real time on the CPU and similar...).