<div dir="ltr"><div>Hi all,</div><div><br></div><div>I'm having trouble reconciling the coarse energy predictor's z-transform in the paper[0]/RFC with the corresponding code in libopus 1.3.1[1]. I'm pretty new to DSP theory and z-transforms, but I'm interested in learning them (as well as compression), so I thought I'd study this filter. However, I can't get it to match my understanding of the code; it's likely I've made a few mistakes, and any help/guidance would be greatly appreciated!<br></div><div><br></div><div>Note that this is a bit difficult to describe without proper typesetting, so I've written up some notes and attached them to this email as a PDF render along with the LyX source. In case the attachments don't reach you, they're also available on my Dropbox:</div><div>pdf: <a href="https://www.dropbox.com/s/d3erbl9oc4r4wu7/predictor-confusion-2.pdf?dl=0">https://www.dropbox.com/s/d3erbl9oc4r4wu7/predictor-confusion-2.pdf?dl=0</a></div><div>lyx: <a href="https://www.dropbox.com/s/9lxjliqfexe9vz0/predictor-confusion-2.lyx?dl=0">https://www.dropbox.com/s/9lxjliqfexe9vz0/predictor-confusion-2.lyx?dl=0</a></div><div><br></div><div>Finally, if THAT doesn't work, a plaintext-with-TeX-mixed-in version follows.</div><div><br></div><div>Thanks for your time/help,</div><div>Jake<br></div><div><br></div><div>---</div><div><br></div><div>I'm having trouble reconciling the coarse energy predictor <br>implementation in the libopus source code with the 2D z-transform <br>description in the paper[0].<br><br>I've simplified the source code (in unquant_coarse_energy in <br>quant_bands.c in libopus 1.3.1[1]) to the following C-like pseudocode:<br><br>void unquant_coarse_energy(float *e, int bands) {<br> float alpha = /* ... */;<br> float beta = /* ... 
*/;<br> float p = 0.0f; /* band-direction predictor state */<br> for (int b = 0; b < bands; b++) {<br> float q = /* read from bitstream */;<br> e[b] = alpha * e[b] + p + q; /* right-hand e[b] is the previously stored value */<br> p = p + q - beta * q;<br> }<br>}<br><br>According to the paper, the 2D z-transform should be:<br><br>A(z_{\ell},z_{b})=(1-\alpha z_{\ell}^{-1})\cdot\frac{1-z_{b}^{-1}}{1-\beta z_{b}^{-1}}<br><br>First off, to state what I think is obvious: the domain of this <br>filter should be a 2D “energy plane” with the \ell-dimension <br>representing frames and the b-dimension representing bands, and <br>the range should be the prediction (actual band energy - q[\ell,b]<br>, the residual). As a predictor, the filter must be causal. <br>Finally, according to the code above, the energy is always 0 for b<0<br> (\ell<0, b\geq bands, and \ell\geq frames are neither specified nor <br>useful).<br><br>Assuming this filter is separable, we first have the \ell<br>-dimension predictor:<br><br>A(z_{\ell})=1-\alpha z_{\ell}^{-1}<br><br>At first, I thought this was clearly embodied by alpha * e[b] <br>above. 
However, the z-transform implies that it should actually <br>be (1 - alpha) * e[b], so already we seem to be missing another e[b]<br> term somewhere (not to mention alpha having the wrong sign).<br><br>The b-dimension predictor seems even more problematic:<br><br>A(z_{b})=\frac{1-z_{b}^{-1}}{1-\beta z_{b}^{-1}}<br><br>This matches what's listed in the CELT blog post[2], and is equivalent to:<br><br>Y(z_{b})=\frac{1-z_{b}^{-1}}{1-\beta z_{b}^{-1}}X(z_{b})<br><br>The equivalent difference equation is:<br><br>y[b]=x[b]-x[b-1]+\beta y[b-1]<br><br>And substituting names from the C code (writing prev for p), we should get something <br>like:<br><br>prev[b]=q[b]-q[b-1]+\beta prev[b-1]<br><br>Now, it should be mentioned that I actually asked about this <br>recently on the DSP Stack Exchange[3] (after first emailing Jean-Marc Valin directly, but I seem to <br>have scared him off with another wall of text similar to this <br>one), and a helpful user there was able to clarify many things. <br>We actually arrived at the same difference equation in the end, <br>even though we got there by a somewhat different route (one which <br>actually included both dimensions from the original 2D z<br>-transform), which suggests that my analysis above is correct.<br><br>However, we still didn't figure out the last bit: reconciling it <br>with the C code, which appears to differ. If I forget about the <br>above and just read the C code, we should get:<br><br>prev[b]=prev[b-1]+q[b]-\beta q[b]<br><br>The equivalent z-transform for this difference equation would be:<br><br>A(z_{b})=\frac{1-\beta}{1-z_{b}^{-1}}<br><br>This suggests that the actual predictor description might instead <br>be:<br><br>A(z_{\ell},z_{b})=(1-\alpha z_{\ell}^{-1})\cdot\frac{1-\beta}{1-z_{b}^{-1}}<br><br>However, that still ignores the apparently-missing e[b] term from <br>the \ell-dimension.<br><br>So, what am I missing? 
One thing that I glossed over above is that <br>the first predictor dimension (\ell) appears to be applied to the <br>band energy directly (as expected), whereas the second predictor <br>dimension (b) appears to be applied to the residual q. Since q <br>can be expressed in terms of the energy and the predictor, I <br>tried several different interpretations and substitutions in <br>various domains in order to describe a predictor with the 2D “<br>energy plane” as the domain and the prediction as the range, and <br>got some crazy z-transforms that don't look correct; here are a few <br>just for the curious:<br><br>A(z_{b},z_{\ell})=\frac{1-\beta+\alpha z_{\ell}^{-1}(1-z_{b}^{-1})}{\beta-z_{b}^{-1}}<br><br>A(z_{b},z_{\ell})=\frac{1+\beta z_{b}^{-1}-\alpha z_{\ell}^{-1}(1-z_{b}^{-1})}{(1+\beta)z_{b}^{-1}}<br><br>So, at this point I'm kind of running in circles, and I think I <br>may have done something wrong; at least I'd like to think that's <br>a lot more likely than the paper/RFC/libopus code being out of <br>sync somehow!</div><div><br></div><div>[0]: <a href="https://arxiv.org/abs/1602.04845">https://arxiv.org/abs/1602.04845</a></div><div>[1]:
<a href="https://opus-codec.org/release/stable/2019/04/12/libopus-1_3_1.html">https://opus-codec.org/release/stable/2019/04/12/libopus-1_3_1.html</a></div><div>[2]:
<a href="https://jmvalin.dreamwidth.org/12000.html">https://jmvalin.dreamwidth.org/12000.html</a></div><div>[3]:
<a href="https://dsp.stackexchange.com/questions/75972/having-trouble-interpreting-z-transform-description-of-a-predictor-from-a-codec">https://dsp.stackexchange.com/questions/75972/having-trouble-interpreting-z-transform-description-of-a-predictor-from-a-codec</a>
</div></div>
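<div><br></div><div>For anyone who wants to check my reading of the C code numerically, here is a minimal sketch of just the band-dimension recursion, prev[b]=prev[b-1]+q[b]-\beta q[b], next to the scaled running sum that the transfer function \frac{1-\beta}{1-z_{b}^{-1}} would imply. The function names band_predictor_state and band_predictor_closed_form and their output arrays are hypothetical (they are not part of libopus), and the sketch only exercises the simplified pseudocode, not the real decoder.</div>
<pre>
```c
/* Hypothetical helper (not part of libopus): runs only the band-direction
 * update from the simplified pseudocode and records the state after each
 * band, i.e. prev[b] = prev[b-1] + q[b] - beta * q[b] with prev[-1] = 0. */
void band_predictor_state(const float *q, int bands, float beta, float *prev) {
    float p = 0.0f;                       /* state starts at zero */
    for (int b = 0; b < bands; b++) {
        p = p + q[b] - beta * q[b];       /* same update as the pseudocode */
        prev[b] = p;
    }
}

/* Hypothetical helper: what (1 - beta) / (1 - z^-1) applied to q would give,
 * namely (1 - beta) times the running sum of q up to and including band b. */
void band_predictor_closed_form(const float *q, int bands, float beta, float *prev) {
    float sum = 0.0f;                     /* running sum of the residuals */
    for (int b = 0; b < bands; b++) {
        sum += q[b];
        prev[b] = (1.0f - beta) * sum;
    }
}
```
</pre>
<div>If the two output arrays agree for a unit impulse in q and for a few arbitrary residual sequences, that at least supports the difference equation I read off the code; it says nothing about how that equation relates to the paper's A(z_{\ell},z_{b}).</div>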