<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<body bgcolor="#FFFFFF" text="#000000">
Regarding using the probability distribution from the last I-frame: I
meant remembering the probability distribution at the end of that
I-frame (not updated further) - an additional buffer, updated at every
I-frame, which P-frames use as their initial probability.
But sure, this is a memory cost (a few kilobytes) as a trade-off for a
better starting distribution.<br>
Regarding the memory trade-off for rANS: the encoder needs a buffer
just to reverse the order - you can use the encoder's general memory
for this. The size of this buffer is given by the chosen size of the
data block (usually fixed); the cost is 2-3 bits of loss per block -
completely negligible for a few-kilobyte buffer.<br>
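To make this buffering concrete, here is a minimal sketch (the block size is an arbitrary placeholder, and the entropy coder is replaced by a stub that only records the order in which it receives symbols, so the reversal is visible):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 4096          /* chosen block size, e.g. a few kB */

/* Stand-in for a real rANS symbol encoder: records arrival order. */
static uint8_t seen[BLOCK_SIZE];
static size_t nseen = 0;

static void rans_encode_symbol_stub(uint8_t s) { seen[nseen++] = s; }

/* rANS encodes symbols in reverse order (the decoder then reads them
   forward), so the encoder buffers one block and walks it backwards. */
static void encode_block(const uint8_t *in, size_t n)
{
    uint8_t buf[BLOCK_SIZE];     /* can live in general encoder scratch */
    memcpy(buf, in, n);          /* 1. collect the block forwards  */
    for (size_t i = n; i-- > 0; )
        rans_encode_symbol_stub(buf[i]);  /* 2. encode it backwards */
}
```

In a real encoder the stub would be the actual rANS symbol encoder, and `buf` can simply alias whatever scratch memory the encoder already has.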
On the other hand, range coding is a few times more costly for
software decoding. From a hardware perspective, multiplications also
require energy and increase the number of cycles per symbol - which
forces a higher clock frequency (more energy).<br>
Here is the entire rANS decoding step for a 32-bit state with 16-bit
renormalization (the multiplication can be reduced to e.g. a 6-bit x
10-bit -> 16-bit multiplication):<br>
<pre>s = symbol(x & mask); // SIMD to find s such that CDF[s] <= (x & mask) < CDF[s+1]
x = (CDF[s+1] - CDF[s]) * (x >> n) + (x & mask) - CDF[s];
if (x < (1 << 16)) x = (x << 16) + read16bits(); // renormalization</pre>
The only branch can be made branchless (with a 16-bit stream pointer):
<pre>b = (x < (1 << 16));
x = (x << (b << 4)) | ((*stream) >> ((1 ^ b) << 4));
stream += b;</pre>
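To check that this step actually inverts the encoder, here is a self-contained round-trip sketch built around exactly this decoding step. The alphabet, the made-up frequencies, and the 16-bit word stack are toy choices of mine, not anything from the codec under discussion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROB_BITS 10                     /* n: CDF total = 1 << n */
#define PROB_MASK ((1u << PROB_BITS) - 1)

enum { NSYM = 3 };
/* toy frequencies (each < 1 << PROB_BITS, summing to 1 << PROB_BITS) */
static const uint32_t freq[NSYM] = { 600, 300, 124 };
static const uint32_t CDF[NSYM + 1] = { 0, 600, 900, 1024 };

/* 16-bit words passed from encoder to decoder; a stack automatically
   gives the decoder the reversed order it needs */
static uint16_t words[1024];
static size_t top = 0;

static void encode(const uint8_t *syms, size_t n)
{
    uint32_t x = 1u << 16;               /* state kept in [2^16, 2^32) */
    for (size_t i = n; i-- > 0; ) {      /* rANS encodes in reverse */
        uint32_t f = freq[syms[i]];
        if (x >= (f << (32 - PROB_BITS))) {  /* renormalize: emit 16 bits */
            words[top++] = (uint16_t)x;
            x >>= 16;
        }
        x = ((x / f) << PROB_BITS) + (x % f) + CDF[syms[i]];
    }
    words[top++] = (uint16_t)x;          /* flush final state, low half */
    words[top++] = (uint16_t)(x >> 16);  /* high half is popped first  */
}

static void decode(uint8_t *out, size_t n)
{
    uint32_t x = (uint32_t)words[--top] << 16;
    x |= words[--top];
    for (size_t i = 0; i < n; i++) {
        uint32_t low = x & PROB_MASK;
        uint8_t s = 0;
        while (CDF[s + 1] <= low) s++;   /* linear search; SIMD in practice */
        out[i] = s;
        /* the decoding step from the text */
        x = freq[s] * (x >> PROB_BITS) + low - CDF[s];
        if (x < (1u << 16))              /* renormalization */
            x = (x << 16) | words[--top];
    }
}
```

Pushing the renormalization words and the final state onto a stack is what makes the encoder's backward pass line up with the decoder's forward pass - the same buffering discussed above.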
Good to hear that you plan to eventually fix optimized initial
probability distributions and rates for the separate IDs - sure, this
should be done after everything else is decided.<br>
On 16/04/27 18:19, Jean-Marc Valin wrote:<br>
<span style="white-space: pre;">> On 04/27/2016 09:14 AM, Jarek Duda wrote:
>> Regarding starting every frame with flat distribution, I believe
>> you still use non-I-frames, which require some I-frame for
>> decoding. So maybe at least start with the probability distribution
>> from this recent required I-frame?
> OK, so here's what happens in a practical videoconferencing
> situation. You're sending I-frames at (e.g.) 15-second intervals, but
> 5sec after an I-frame, you lose a packet for a P-frame. It means you
> have to "conceal" (guess) what was in that packet and make up
> something plausible on the screen (obviously not perfect), then you
> decode the next P-frames and any "error" you did in the concealment
> gets spread around as the motion vectors carry them away. It's not
> pretty, but you can still figure out what's going on, and if a big
> change happens you can still see what's happening.
> Now, with what you're suggesting, everything that happens after the
> loss is completely undecodable and you have no idea what's going on.
> In that case, the decoder has no choice but to completely freeze the
> image for 10 seconds until the next I-frame. That's why people don't
> rely (or at least want to have the option of not relying) on anything
> from previous frames for entropy decoding.
>> Another option is varying rate - start with e.g. rate=2,3 for some
>> number of symbols to quickly get out of the terrible flat
>> distribution, then raise it to e.g. 5 for more subtle adaptation to
>> local situation.
> This has *always* been what we've done. The adaptation on the first
> symbols is very fast because we accumulate probabilities starting
> from very small flat values. For example, we would initialize a PDF
> with a value of 32 for each symbol and every time we encode a symbol,
> we boost the corresponding bin by 128. The flat distribution really
> doesn't stay around for long. In the case where the probability
> distribution has to always sum to a power of two, then I have a way
> to adjust the adaptation rate to have the same behaviour as the
> accumulation I described.
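Read literally, the accumulation scheme described above can be sketched as follows (the alphabet size of 8 is my arbitrary choice; 32 and 128 are the example values quoted):

```c
#include <assert.h>
#include <stdint.h>

enum { NBINS = 8 };   /* arbitrary alphabet size for this sketch */

typedef struct { uint32_t count[NBINS]; uint32_t total; } AdaptPdf;

/* every bin starts at the small flat value 32 */
static void pdf_init(AdaptPdf *p)
{
    p->total = 0;
    for (int i = 0; i < NBINS; i++) { p->count[i] = 32; p->total += 32; }
}

/* after coding symbol s, boost its bin by 128; the flat start is
   swamped after only a handful of symbols */
static void pdf_update(AdaptPdf *p, int s)
{
    p->count[s] += 128;
    p->total += 128;
}
```

After only two occurrences of a symbol its bin holds 288 out of a total of 512, i.e. the model already assigns it more than half the probability mass.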
>> However, the approach that still looks best here is to choose
>> separate optimal initial distribution for each ID - just average
>> over sample videos and fix these probabilities in the codec
> As I said, we might do that someday... but it's a *lot* of work
> (that you have to redo frequently) for a small gain.
>> Also, it seems beneficial to separately optimize 'rate' for each ID
>> and again fix it in the standard.
> Also, on the TODO list.
>> A separate suggestion is to take a look at sequences from your data
>> - here is an example from a single frame of your sample for ID=0:
<span style="white-space: pre;">>> It clearly suggests we have two very different behaviors here - it
>> would be beneficial to split it into at least 2 separate IDs.
> What you saw was basically the symbol that says whether a superblock
> of the image was well predicted or if we need to code something. It's
> going to be different for every clip and since it's one of the first
> symbols being coded, there isn't much "context" to use for it. In
> this particular case, adapting faster helps. In other clips it
> may not.
>> Good to hear that you have not only switched to an accurate entropy
>> coder, but also to exponential forgetting, which is much better at
>> adapting to the local situation. Also, you no longer need the costly
>> non-power-of-two denominator.
> Well, we're still investigating all options. The non-power-of-two
> code always had exponential forgetting (but it was in steps, which is
> less accurate, though it also suffers from fewer rounding issues, so
> it's about as good). We now also have a way of making the overhead very
> small even for non-power-of-two without using divisions. So the
> jury's still out -- both options are viable, including dozens of
> variants and combinations.
>> For power-of-two denominator, standard range coder needs two
>> multiplications per symbol, rANS only one, has simpler multi-bit
>> renormalization and allows for further optimizations - leading to
>> ~3x faster/cheaper (software) decoding (also reducing the required
>> frequency of a hardware decoder) - here are optimized implementations:
>> <a class="moz-txt-link-freetext" href="https://github.com/jkbonfield/rans_static">https://github.com/jkbonfield/rans_static</a> The inconvenience is
>> indeed that the encoder needs a buffer for backward encoding within
>> a data block (e.g. a frame or part of one) - an additional
>> few-kilobyte buffer in a costly video encoder seems negligible (?).
> Well, these days multiplier hardware is actually cheaper than memory
> (you'll need a lot more than a few kB to cover the worst case).
dr Jarosław Duda<br>
Institute of Computer Science and Computer Mathematics,<br>
Jagiellonian University, Cracow, Poland<br>
<a class="moz-txt-link-freetext" href="http://th.if.uj.edu.pl/~dudaj/">http://th.if.uj.edu.pl/~dudaj/</a><br>