[CELT-dev] On guessing theta
Benjamin M. Schwartz
bmschwar at fas.harvard.edu
Thu Mar 19 22:01:25 PDT 2009
So you're the encoder. You get two vectors in (for some band), L and R.
One thing you could do with this is compute M = L+R and S = L-R. (Yes, I
know, this is not how the encoder actually works. Bear with me.) Then let
m = normalize(M) and s = normalize(S).
You transmit m, s, |L|, and |R|.
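A minimal sketch of that hypothetical encoder step, in Python. The helper names (norm, normalize, encode_band) are made up for illustration and are not CELT functions, and nothing here is quantized.

```python
import math

def norm(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Scale v to unit length; a zero vector has no direction."""
    n = norm(v)
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / n for x in v]

def encode_band(L, R):
    """Return (m, s, |L|, |R|) for one band, with M = L+R and S = L-R."""
    M = [l + r for l, r in zip(L, R)]
    S = [l - r for l, r in zip(L, R)]
    return normalize(M), normalize(S), norm(L), norm(R)

# Toy band with two coefficients per channel.
m, s, nL, nR = encode_band([2.0, 0.0], [1.0, 1.0])
```

Note that S is the zero vector when L == R (and M when L == -R), so a real encoder would need to treat those bands specially.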
The decoder needs to find unknown positive constants a and b to compute
L = a*m + b*s
R = a*m - b*s
To find a and b, we use two constraints
|L|^2 = |a*m + b*s|^2
|R|^2 = |a*m - b*s|^2
Expanding each side, and using |m| = |s| = 1:
|a*m + b*s|^2 = a^2*|m|^2 + b^2*|s|^2 + 2*a*b*dot(m,s)
= a^2 + b^2 + 2*a*b*dot(m,s) = |L|^2
|a*m - b*s|^2 = a^2*|m|^2 + b^2*|s|^2 - 2*a*b*dot(m,s)
= a^2 + b^2 - 2*a*b*dot(m,s) = |R|^2
We now compute the sum and difference:
sum:
2*(a^2 + b^2) = |L|^2 + |R|^2
a^2 + b^2 = (|L|^2 + |R|^2)/2
difference:
4*a*b*dot(m,s) = |L|^2 - |R|^2
a*b = (|L|^2 - |R|^2)/(4*dot(m,s))
Combining these equations again in two ways:
a^2 + b^2 + 2*a*b = (|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s))
a + b = sqrt((|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s)))
a^2 + b^2 - 2*a*b = (|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s))
|a - b| = sqrt((|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s)))
(The two constraints are symmetric in a and b, so they fix only |a - b|, not its sign.)
The remainder of the solution is left as an exercise.
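For anyone who would rather not do the exercise, here is a rough Python sketch of the decoder side. The function name solve_ab is made up, and the code assumes a >= b (mid at least as strong as side). That is an extra assumption: the two norm constraints are unchanged if a and b are swapped, so they cannot settle it.

```python
import math

def solve_ab(nL, nR, d):
    """Recover (a, b) from |L|, |R| and d = dot(m, s).

    Assumes d != 0 and a >= b; raises ValueError from sqrt if the
    inputs are inconsistent (negative value under the square root).
    """
    half_sum = (nL * nL + nR * nR) / 2.0      # a^2 + b^2
    two_ab = (nL * nL - nR * nR) / (2.0 * d)  # 2*a*b
    a_plus_b = math.sqrt(half_sum + two_ab)
    a_minus_b = math.sqrt(half_sum - two_ab)  # this is where a >= b is assumed
    return (a_plus_b + a_minus_b) / 2.0, (a_plus_b - a_minus_b) / 2.0

# L = (2, 0), R = (1, 1) gives M = (3, 1), S = (1, -1),
# so the true values are a = |M|/2 and b = |S|/2.
d = (3.0 * 1.0 + 1.0 * (-1.0)) / (math.sqrt(10.0) * math.sqrt(2.0))
a, b = solve_ab(2.0, math.sqrt(2.0), d)
```

With unquantized m and s this reproduces a = |M|/2 and b = |S|/2, which is a quick sanity check on the algebra.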
Anyway, the point is: in principle, _if_ M = L+R, then you don't need to
transmit theta. Taking tan(theta) = b/a, the solution is ultimately
equivalent to
theta = (1/2)arcsin(((|L|^2-|R|^2)/(|L|^2+|R|^2))*(1/dot(m,s)))
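As a numerical check, assume theta is defined by tan(theta) = b/a (the message does not define it, so this is an assumption). Dividing the difference equation by the sum equation then gives sin(2*theta) = (|L|^2 - |R|^2) / ((|L|^2 + |R|^2) * dot(m,s)), with squared norms. The sketch below compares that against atan2(b, a) on a toy case; theta_from_norms is a made-up name.

```python
import math

def theta_from_norms(nL, nR, d):
    """theta from |L|, |R| and d = dot(m, s), assuming tan(theta) = b/a."""
    ratio = (nL * nL - nR * nR) / ((nL * nL + nR * nR) * d)
    # asin returns the principal branch, so theta <= pi/4, i.e. a >= b.
    return 0.5 * math.asin(ratio)

# L = (2, 0), R = (1, 0): M = (3, 0), S = (1, 0), so a = 1.5, b = 0.5, d = 1.
theta = theta_from_norms(2.0, 1.0, 1.0)
expected = math.atan2(0.5, 1.5)
```

Here theta and expected agree to rounding error. The arcsin branch choice is the same a-versus-b ambiguity that the norm constraints leave open.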
Apart from the interesting debate over whether to use M = L+R or M =
normalize(L) + normalize(R), there's one other obvious issue. This
calculation relies on computing dot(m,s). Since m and s are coded with
error, the calculation of theta will also have error.
Some quantization error in theta doesn't seem intrinsically unreasonable,
but if m and s have enough error, the above procedure can derive a
contradiction. For PVQ with a small number of pulses, it seems likely
that dot(m,s) could be zero, even though |L| != |R|. The decoder then
finds itself in a very awkward situation. I believe this can be remedied
with a bit of edge-case handling, though.
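One possible shape for that edge-case handling, as a hedged Python sketch. The guard policy here (give up when dot(m,s) is near zero, clamp otherwise) is only an illustration, not anything CELT does, and guarded_theta and eps are made-up names. It assumes tan(theta) = b/a and uses the squared-norm form of the sin(2*theta) relation that follows from the sum and difference equations.

```python
import math

def guarded_theta(nL, nR, d, eps=1e-9):
    """Return theta, or None when it cannot be inferred from the norms.

    Hypothetical policy: if dot(m, s) is (near) zero the constraints are
    either contradictory (|L| != |R|) or leave theta free (|L| == |R|),
    so the caller needs some other source for theta.
    """
    den = (nL * nL + nR * nR) * d
    if abs(den) < eps:
        return None
    ratio = (nL * nL - nR * nR) / den
    # Coding error in m and s can push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, ratio))
    return 0.5 * math.asin(ratio)

degenerate = guarded_theta(2.0, 1.0, 0.0)  # dot(m, s) == 0: returns None
clamped = guarded_theta(2.0, 1.0, 0.5)     # raw ratio is 1.2, clamped to 1
```

Clamping silently turns an inconsistent input into theta = pi/4, which may or may not be the right fallback; signalling the decoder to use an explicitly transmitted theta is another option.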
--Ben