[daala] Why not upgrading Daala range coding with much faster rANS or tANS?
Jarek Duda
dudajar at gmail.com
Sat Aug 15 01:36:33 PDT 2015
> Sure: <https://tools.ietf.org/html/draft-terriberry-netvc-codingtools-00#section-2>.
> The interesting bits for what you are after probably start in Section 2.3.
Thank you, I have looked at the materials but I didn't see the details,
so I had to get to the code.
For a better focus, here is example of the entire rANS decoding step
with renormalization:
s = symbol[x & mask];
x = freq[s] * (x >> n) + (x & mask) - cumul[s];
if(x < 2^16) x = x<<16 | read16bits(); // renormalization
I have looked at entdec.c, the od_ec_decode_cdf function (?):
int od_ec_decode_cdf_(od_ec_dec *dec, const uint16_t *cdf, int nsyms OD_ACC_STR) {
261 od_ec_window dif;
262 unsigned r;
263 unsigned c;
264 unsigned d;
265 #if OD_EC_REDUCED_OVERHEAD
266 unsigned e;
267 #endif
268 int s;
269 unsigned u;
270 unsigned v;
271 unsigned q;
272 unsigned fl;
273 unsigned fh;
274 unsigned ft;
275 int ret;
276 dif = dec->dif;
277 r = dec->rng;
278 OD_ASSERT(dif >> (OD_EC_WINDOW_SIZE - 16) < r);
279 OD_ASSERT(nsyms > 0);
280 ft = cdf[nsyms - 1];
281 OD_ASSERT(16384 <= ft);
282 OD_ASSERT(ft <= 32768U);
283 OD_ASSERT(ft <= r);
284 s = r - ft >= ft;
285 ft <<= s;
286 d = r - ft;
287 OD_ASSERT(d < ft);
288 c = (unsigned)(dif >> (OD_EC_WINDOW_SIZE - 16));
289 q = OD_MAXI((int)(c >> 1), (int)(c - d));
290 #if OD_EC_REDUCED_OVERHEAD
291 e = OD_SUBSATU(2*d, ft);
292 /*The correctness of this inverse partition function is not obvious, but it
293 was checked exhaustively for all possible values of r, ft, and c.
294 TODO: It should be possible to optimize this better than the compiler,
295 given that we do not care about the accuracy of negative results (as we
296 will not use them).
297 It would also be nice to get rid of the 32-bit dividend, as it requires a
298 32x32->64 bit multiply to invert.*/
299 q = OD_MAXI((int)q, (int)((2*(int32_t)c + 1 - (int32_t)e)/3));
300 #endif
301 q >>= s;
302 OD_ASSERT(q < ft >> s);
303 fl = 0;
304 ret = 0;
305 for (fh = cdf[ret]; fh <= q; fh = cdf[++ret]) fl = fh;
306 OD_ASSERT(fh <= ft >> s);
307 fl <<= s;
308 fh <<= s;
309 #if OD_EC_REDUCED_OVERHEAD
310 u = fl + OD_MINI(fl, e) + OD_MINI(OD_SUBSATU(fl, e) >> 1, d);
311 v = fh + OD_MINI(fh, e) + OD_MINI(OD_SUBSATU(fh, e) >> 1, d);
312 #else
313 u = fl + OD_MINI(fl, d);
314 v = fh + OD_MINI(fh, d);
315 #endif
316 r = v - u;
317 dif -= (od_ec_window)u << (OD_EC_WINDOW_SIZE - 16);
318 return od_ec_dec_normalize(dec, dif, r, ret, acc_str);
319 }
Indeed I don't see multiplication by frequencies here.
If I properly understood,
OD_MAXI, OD_MINI is maximum, minimum of two variables
OD_SUBSATU(a,b) = a - min(a,b)
OD_ASSERT is only information for compiler
dif is the start of the range (?)
r is its length
ft gives the proportions to focus inside the range
I have to admit that I don't see how it finds the new range, lines 280 -
291 seem to should do it (?), what is below seem to be renormalization
related (?)
Could you give me some hints to understand it?
> That wouldn't really tell us much useful, because of course what's in
> that benchmark is not how an entropy coder is used in a video codec. The
> relevant test would be to put your alternative into Daala.
It seems you have a general purpose entropy coder, which could be also put in different applications, so maybe it's worth benchmarking alone?
> CABAC is heavily patented and thus not relevant
I only gave it as en example how ANS makes everything simpler/cheaper.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.xiph.org/pipermail/daala/attachments/20150815/50f5a29a/attachment.htm
More information about the daala
mailing list