[daala] Curious about progress of codec
dudajar at gmail.com
Sun Apr 24 16:56:39 UTC 2016
Indeed, the statistical modeling + entropy coding part of a compressor is
crucial to its success.
CABAC went in a good direction, but it still carries the burden of being
designed around the earlier CAVLC binarization, at a time when inexpensive,
accurate multi-symbol entropy coders were not yet available.
Now we can finally do this part in a practically optimal way, especially
since rANS has significantly reduced the cost (and it is patent-free).
Its decoding requires a single multiplication (uint32, though uint16 would
be sufficient) per large symbol (alphabet of up to size 16 here) and can
work directly on the CDF alone (n is the number of probability bits,
mask = (1 << n) - 1):

    s = symbol(x & mask);  // SIMD-able: find s such that CDF[s] <= (x & mask) < CDF[s+1]
    x = (CDF[s+1] - CDF[s]) * (x >> n) + (x & mask) - CDF[s];
    if (x < (1 << 16)) x = (x << 16) | read16bits();  // renormalize
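To make the decode step above concrete, here is a minimal, self-contained C sketch of a full rANS round trip under the same conventions (n = 12 probability bits, 16-bit renormalization words, state kept in [2^16, 2^32)); the alphabet size, CDF values, and function names are illustrative assumptions, not Daala/VP10 code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROB_BITS 12                       /* n: CDF total = 1 << n */
#define PROB_MASK ((1u << PROB_BITS) - 1)  /* mask = (1 << n) - 1 */
#define RANS_LOW  (1u << 16)               /* renormalization threshold */

/* Illustrative 4-symbol alphabet: CDF[s] <= (x & mask) < CDF[s+1] selects s */
static const uint32_t CDF[5] = {0, 2048, 3072, 3584, 4096};

/* Encode symbols back-to-front, emitting 16-bit words as the state grows. */
static size_t rans_encode(const int *syms, size_t count,
                          uint16_t *out, uint32_t *final_state)
{
    uint32_t x = RANS_LOW;
    size_t n_out = 0;
    for (size_t i = count; i-- > 0; ) {
        uint32_t f = CDF[syms[i] + 1] - CDF[syms[i]];
        /* keep x small enough that the symbol step stays invertible */
        while ((uint64_t)x >= ((uint64_t)f << 20)) {
            out[n_out++] = (uint16_t)x;
            x >>= 16;
        }
        x = (x / f << PROB_BITS) + (x % f) + CDF[syms[i]];
    }
    *final_state = x;
    return n_out;
}

/* The decode step from the email: one multiplication per symbol. */
static int rans_decode_symbol(uint32_t *x, const uint16_t *words, size_t *pos)
{
    uint32_t low = *x & PROB_MASK;
    int s = 0;
    while (CDF[s + 1] <= low) s++;   /* linear scan; SIMD-able */
    *x = (CDF[s + 1] - CDF[s]) * (*x >> PROB_BITS) + low - CDF[s];
    if (*x < RANS_LOW && *pos > 0)   /* renormalize from the stream */
        *x = (*x << 16) | words[--*pos];
    return s;
}
```

Note that rANS is LIFO: the encoder walks the symbols backwards, and the decoder consumes the emitted 16-bit words in reverse order (pos starts at the word count returned by rans_encode).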
Symbol-wise adaptation is also SIMD-able (alphabet of up to size 16) - for
a precomputed mixCDF[s][i]:

    for (int i = 1; i < m; i++) CDF[i] -= (CDF[i] - mixCDF[s][i]) >> rate;
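A minimal sketch of this exponential-forgetting update in C, assuming a 16-symbol alphabet, a 12-bit probability total, and an illustrative target playing the role of mixCDF (almost all mass on the coded symbol, with a floor of 1 so no frequency ever reaches zero); the constants and target shape are my assumptions, not Daala's:

```c
#include <assert.h>
#include <stdint.h>

#define M 16             /* alphabet size */
#define TOTAL (1 << 12)  /* CDF[M] = 2^12 */
#define RATE 5           /* adaptation rate: larger = slower forgetting */

/* One adaptation step after coding symbol s: move CDF a fraction 2^-RATE
 * of the way toward a target CDF concentrated on s, keeping a minimal
 * nonzero frequency for every symbol. CDF[0] = 0 and CDF[M] = TOTAL are
 * fixed and never touched. */
static void adapt_cdf(uint16_t CDF[M + 1], int s)
{
    for (int i = 1; i < M; i++) {
        /* target: minimal mass below/at s, almost everything on s */
        int target = (i <= s) ? i : TOTAL - (M - i);
        /* relies on arithmetic right shift of negative ints, as
         * provided by all mainstream compilers */
        CDF[i] += (target - CDF[i]) >> RATE;
    }
}
```

Each call weights the new observation by 2^-RATE, so recent symbols dominate and the influence of old symbols decays geometrically, while unused symbols' frequencies decay toward the floor rather than to zero.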
VP10 initially planned to use ANS only for DCT coefficients, but now it is
used everywhere.
On 16/04/24 18:27, Aaron Boxer wrote:
> Hi Jarek,
> Excellent! I hope these efficiency improvements that you are proposing
> will get adopted, or at least investigated.
> I don't know much about arithmetic coding, but according to Wikipedia,
> CABAC is considered one of the primary reasons
> for H.264 and H.265 coding efficiency being superior to predecessors.
> So, using rANS and the other techniques you mention
> should give Daala a distinct advantage.
> The only reasons I can think of for them *not* being adopted would be:
> 1) increased codec complexity (the issue that JPEG 2000 foundered on)
> 2) patents (another issue with JPEG 2000)
> 3) Not Invented Here Syndrome (TM)
> Kind Regards,
> On Sun, Apr 24, 2016 at 10:43 AM, Jarek Duda <dudajar at gmail.com> wrote:
> Hi Aaron,
> I had some discussion and analysis regarding their coding part,
> and there are a few percent (5-10%) that can easily be squeezed out there.
> In contrast to the nasty CABAC scheme - binarize first, then try to
> model those bits - Daala has a very nice general scheme:
> send to the coding part a sequence of (ID, symbol) pairs,
> where each symbol is from an alphabet of up to size 16,
> and each ID is statistically modeled as an independent random variable -
> hence the ID itself contains both the data type and context (like
> However, they have a few inefficiencies:
> 1) they start each frame with a flat uniform probability distribution
> (this is the worst choice!),
> 2) they use an approximated multi-symbol range coder:
> the cost of this approximation is a 1-3% ratio loss for a binary
> alphabet: https://dl.dropboxusercontent.com/u/12405967/Moffat.nb
> 3) they use costly per-symbol adaptation (mainly to handle the flat
> initial distribution problem), but with
> freq[s] = count[s] / total
> a poor type of adaptation: a symbol far in the past has the same
> influence on the probability as the most recent symbols,
> 4) they use a fixed adaptation rate.
> These can be improved by:
> 1) starting with a probability distribution characteristic for a
> given ID, for example a fixed parametric one, possibly modified
> somewhere in the file (this behaves better at the beginning and
> allows for more subtle adaptation),
> 2) using an accurate entropy coder, like rANS in VP10,
> 3) using adaptation with exponential forgetting to make recent
> symbols more important, like
> for (int i = 1; i < m; i++) CDF[i] -= (CDF[i] - mixCDF[i]) >> rate;
> where mixCDF represents the new part; it can be tabled for symbol-wise
> adaptation such that the frequencies of unused symbols drop to the
> minimal nonzero frequency,
> 4) allowing a varying adaptation rate - for example, some IDs use a
> more static probability distribution, and for some it is beneficial to
> allow the encoder to choose one of a few possible adaptation rates.
> Kind regards,
> Jarek Duda
> On 16/04/24 05:36, Aaron Boxer wrote:
>> Dear Daala-istas,
>> I took a look at the PSNR and PSNR-HVS charts for daala vs H.264
>> and H.265.
>> May I ask at what PSNR value you would consider Daala to be
>> competitive with H.265 ? From the graph, it looks like you are
>> asymptotically matching H.264 quality, but there is still a
>> significant difference with HEVC, and progress is flattening out.
>> I don't mean to be that guy, but when do you think you will be
>> able to meet your project goal of meeting or beating HEVC
>> quality? I am asking because I think this is a great project, and
>> want to see it beating out the $$-driven competition.
>> Kind Regards,
>> Aaron Boxer
>> daala mailing list
>> daala at xiph.org
> dr Jarosław Duda
> Institute of Computer Science and Computer Mathematics,
> Jagiellonian University, Cracow, Poland
> http://th.if.uj.edu.pl/~dudaj/
dr Jarosław Duda
Institute of Computer Science and Computer Mathematics,
Jagiellonian University, Cracow, Poland