[daala] Curious about progress of codec

Wed Apr 27 21:29:20 UTC 2016

Indeed, the inhomogeneity of your sequences suggests that they might be
worth splitting into separate IDs.
For example, in the ID=0 sequence a 0 is more likely to be followed by
another 0. You could use this binary condition (previous_symbol == 0)
to split this ID into two new ones.
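
A minimal sketch of how such a split could be evaluated, assuming the
symbols of one ID are available as a list of small integers. The
sequence below is made up for illustration (it is not the sample data),
and the helper names are hypothetical. It compares the empirical
zero-order entropy of the whole sequence with the summed entropies of
the two sub-sequences selected by (previous_symbol == 0). It ignores the
cost of learning two distributions instead of one, so it only gives an
upper bound on the gain.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Total empirical zero-order entropy of a sequence, in bits."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())


def split_by_prev_zero(symbols):
    """Split a sequence into two by whether the previous symbol was 0."""
    after_zero, after_nonzero = [], []
    prev = None  # assumption: the first symbol goes to the "not after 0" context
    for s in symbols:
        (after_zero if prev == 0 else after_nonzero).append(s)
        prev = s
    return after_zero, after_nonzero


# Made-up example sequence with runs of zeros (not real codec data).
seq = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 3, 1, 2, 0, 0, 0, 0, 1]

single = entropy_bits(seq)
after_zero, after_nonzero = split_by_prev_zero(seq)
split = entropy_bits(after_zero) + entropy_bits(after_nonzero)

print(f"one context:  {single:.2f} bits")
print(f"two contexts: {split:.2f} bits")
```

With ideal static models the split total can never exceed the
single-context total. Whether the gain survives adaptation has to be
checked on real sequences.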

Still, starting each frame with the probability distribution averaged
over 10 frames usually gave me a ratio a few percent or more better
than starting with a flat distribution on your sample data:

{ID, ID_label, your_bits, your_perfect_bits, my_starting_with_flat, 
my_starting_with_averaged}
  {0, "skip", 2189.13, 2255.69, 1946.32, 1927},
  {11, "skip", 1103.75, 1144.49, 1027.29, 928},
  {28, "skip", 713.875, 728.601, 786.98, 654},
  {39, "skip", 2539.5, 2546.98, 2544.23, 2430},
  {44, "skip", 942.25, 954.635, 1054.01, 876},
  {50, "skip", 3264.5, 3259.55, 3287.64, 3168},
  {54, "skip", 2376.88, 2391.42, 2544.36, 2280},
  {61, "skip", 1729.13, 1738.28, 1818.28, 1635},
  {77, "mv:low", 773.625, 850.445, 503.3, 381},
  {81, "mv:valid", 90.75, 100.012, 85.1375, 41},
  {96, "mv:ref", 1251.25, 1228.22, 1306.1, 1175}
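
The difference between the last two columns can be illustrated with a
toy model. This is only a sketch under stated assumptions: a simple
count-based adaptive model with a fixed increment, which is neither the
adaptation used in Daala nor the one used to produce the numbers above.
The frames are made up, and `adaptive_cost_bits`, `averaged_init` and
the `weight` parameter are hypothetical names chosen for this example.

```python
import math


def adaptive_cost_bits(symbols, init_counts, increment=1.0):
    """Ideal code length (bits) of a count-based adaptive model."""
    counts = list(init_counts)
    total = sum(counts)
    bits = 0.0
    for s in symbols:
        bits -= math.log2(counts[s] / total)  # cost under the current estimate
        counts[s] += increment                # then adapt toward the seen symbol
        total += increment
    return bits


def averaged_init(previous_frames, alphabet_size, weight=8.0):
    """Initial counts from symbol frequencies averaged over earlier frames.

    Every symbol gets a floor of 1 before scaling so it stays codable;
    `weight` is the total initial mass (an assumption of this sketch).
    """
    freq = [1.0] * alphabet_size
    for frame in previous_frames:
        for s in frame:
            freq[s] += 1.0
    scale = weight / sum(freq)
    return [f * scale for f in freq]


# Made-up frames dominated by zeros (not real codec data).
previous = [[0, 0, 0, 1, 0, 0, 2, 0]] * 10
current = [0, 0, 1, 0, 0, 0, 0, 2, 0, 0]

flat = adaptive_cost_bits(current, [1.0] * 4)
averaged = adaptive_cost_bits(current, averaged_init(previous, 4))

print(f"flat start:     {flat:.2f} bits")
print(f"averaged start: {averaged:.2f} bits")
```

For a short frame most of the cost is paid while the model is still
learning, which is where a better starting distribution helps.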

Cheers,
Jarek

On 16/04/27 22:12, Jean-Marc Valin wrote:
> On 04/27/2016 03:38 PM, Jarek Duda wrote:
>> Sure, a probability distribution can happen to be close to uniform, in
>> which case starting with a flat distribution is reasonable ... but
>> definitely not always, especially for your data, which for many IDs
>> consists mainly of zeroes. You don't have to pay in compression ratio
>> to learn that: you can start with a distribution characteristic of the
>> given ID instead (and also with an optimized adaptation rate).
> Note that in the example Tim gave, the actual distribution was very much
> non-flat. It's just that the variable-speed initial adaptation was
> giving us more gains than we lost from the initial flat probability.
>
> Seriously, I would recommend you try it for yourself in the context of a
> real codec like Daala. I can show you where the important symbols are
> coded and you can try coming up with a better scheme that you can then
> test on a variety of real sequences. It's quite possible you'll be able
> to come up with something better than what we have now, but it's also
> not as easy as it may seem from looking at the data out of context.
>
> Cheers,
>
> 	Jean-Marc
>

-- 
dr Jarosław Duda
Institute of Computer Science and Computer Mathematics,
Jagiellonian University, Cracow, Poland
http://th.if.uj.edu.pl/~dudaj/