[CELT-dev] On guessing theta

Benjamin M. Schwartz bmschwar at fas.harvard.edu
Thu Mar 19 22:01:25 PDT 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

So you're the encoder. You get two vectors in (for some band), L and R.

One thing you could do with this is compute M = L+R and S = L-R.  (Yes, I
know, this is not how the encoder actually works. Bear with me.) Then let
m = normalize(M) and s = normalize(S).

You transmit m, s, |L|, and |R|.
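
As a rough sketch of that encoder side in Python (not how the real encoder
works, as noted above): the names encode_band, normalize and norm are made
up for illustration, bands are plain lists of floats, and the degenerate
cases L = R and L = -R (where S or M is zero and cannot be normalized) are
not handled.

```python
import math

def norm(v):
    """Euclidean length of v."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Scale v to unit length; v must be nonzero."""
    n = norm(v)
    return [x / n for x in v]

def encode_band(L, R):
    """Return (m, s, |L|, |R|) for one band (hypothetical helper)."""
    M = [l + r for l, r in zip(L, R)]   # M = L + R
    S = [l - r for l, r in zip(L, R)]   # S = L - R
    return normalize(M), normalize(S), norm(L), norm(R)
```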

The decoder needs to find unknown positive constants a and b to compute

L = a*m + b*s
R = a*m - b*s

To find a and b, we use two constraints:
|L|^2 = |a*m + b*s|^2
|R|^2 = |a*m - b*s|^2

That proceeds as follows:
|a*m + b*s|^2 = a^2*|m|^2 + b^2*|s|^2 + 2*a*b*dot(m,s)
            = a^2 + b^2 + 2*a*b*dot(m,s) = |L|^2

|a*m - b*s|^2 = a^2*|m|^2 + b^2*|s|^2 - 2*a*b*dot(m,s)
           = a^2 + b^2 - 2*a*b*dot(m,s) = |R|^2

We now compute the sum and difference:
sum:
2*(a^2 + b^2) = |L|^2 + |R|^2
a^2 + b^2 = (|L|^2 + |R|^2)/2

difference:
4*a*b*dot(m,s) = |L|^2 - |R|^2
a*b = (|L|^2 - |R|^2)/(4*dot(m,s))

Combining these equations again in two ways:
a^2 + b^2 + 2*a*b = (|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s))
a + b = sqrt((|L|^2 + |R|^2)/2 + (|L|^2 - |R|^2)/(2*dot(m,s)))

a^2 + b^2 - 2*a*b = (|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s))
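a - b = sqrt((|L|^2 + |R|^2)/2 - (|L|^2 - |R|^2)/(2*dot(m,s)))
(taking the positive root here assumes a >= b; the two constraints are
symmetric in a and b, so they do not fix the sign of a - b by themselves)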

The remainder of the solution is left as an exercise.
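
For anyone who would rather not do the exercise by hand, here is one way it
could be finished, as a sketch only. solve_gains and reconstruct are
hypothetical names. It takes the positive root for a - b, which amounts to
assuming a >= b, and it makes no attempt to cope with dot(m,s) = 0 or with a
negative value under either square root.

```python
import math

def solve_gains(norm_l, norm_r, d):
    """Solve for (a, b) from |L|, |R| and d = dot(m, s), assuming a >= b."""
    half_sum = (norm_l ** 2 + norm_r ** 2) / 2.0     # a^2 + b^2
    two_ab = (norm_l ** 2 - norm_r ** 2) / (2.0 * d)  # 2*a*b
    a_plus_b = math.sqrt(half_sum + two_ab)
    a_minus_b = math.sqrt(half_sum - two_ab)          # positive root: a >= b
    return (a_plus_b + a_minus_b) / 2.0, (a_plus_b - a_minus_b) / 2.0

def reconstruct(m, s, norm_l, norm_r):
    """Rebuild L = a*m + b*s and R = a*m - b*s from the transmitted values."""
    d = sum(x * y for x, y in zip(m, s))
    a, b = solve_gains(norm_l, norm_r, d)
    L = [a * x + b * y for x, y in zip(m, s)]
    R = [a * x - b * y for x, y in zip(m, s)]
    return L, R
```

With L = (3, 1) and R = (1, 2), M = (4, 3) and S = (2, -1), so the exact
answer is a = |M|/2 = 2.5 and b = |S|/2 = sqrt(5)/2, and feeding the unit
vectors m and s back in recovers L and R up to rounding.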

Anyway, the point is: in principle, _if_ M = L+R, then you don't need to
transmit theta.  The solution, ultimately, is equivalent to

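theta = (1/2)arcsin(((|L|^2-|R|^2)/(|L|^2+|R|^2))*(1/dot(m,s)))

where tan(theta) = b/a, so that sin(2*theta) = 2*a*b/(a^2 + b^2); both
norms enter squared, as in the sum and difference equations.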

Apart from the interesting debate over whether to use M = L+R or M =
normalize(L) + normalize(R), there's one other obvious issue.  This
calculation relies on computing dot(m,s).  Since m and s are coded with
error, the calculation of theta will also have error.

Some quantization error in theta doesn't seem intrinsically unreasonable,
but if m and s have enough error, the above procedure can derive a
contradiction.  For PVQ with a small number of pulses, it seems likely
that dot(m,s) could be zero, even though |L| != |R|.  The decoder then
finds itself in a very awkward situation.  I believe this can be remedied
with a bit of edge-case handling, though.
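
To make the edge cases concrete, here is one possible shape for that
handling, purely as a sketch. It takes tan(theta) = b/a, so the sum and
difference equations give
sin(2*theta) = (|L|^2 - |R|^2)/((|L|^2 + |R|^2)*dot(m,s)). The function name
guess_theta, the eps threshold and the fallback values are all my own
assumptions, not anything the codec does: out-of-range arguments are clamped
to [-1, 1], and a near-zero dot(m,s) saturates theta toward +/- pi/4.

```python
import math

def guess_theta(norm_l, norm_r, d, eps=1e-9):
    """Estimate theta from |L|, |R| and d = dot(m, s), with clamping."""
    sum_sq = norm_l ** 2 + norm_r ** 2
    if sum_sq == 0.0:
        return 0.0                      # silent band: theta is arbitrary
    ratio = (norm_l ** 2 - norm_r ** 2) / sum_sq
    if abs(d) < eps:
        if abs(ratio) < eps:
            return 0.0                  # 0/0: undetermined, arbitrary choice
        # Contradiction (d == 0 but |L| != |R|): saturate, matching the
        # limit of the clamped formula as d -> 0.
        return math.copysign(math.pi / 4.0, ratio)
    x = max(-1.0, min(1.0, ratio / d))  # keep asin's argument in range
    return 0.5 * math.asin(x)
```

A negative result means the coded dot(m,s) and |L| - |R| disagree in sign,
which cannot happen with uncoded m and s; a decoder might prefer to treat
that as theta = 0 instead.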

- --Ben
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)

iEYEARECAAYFAknDIyUACgkQUJT6e6HFtqRPowCghKBuJcPotFdd1lPKoK0ngfSw
rHUAn3NcTW/qP3ckJzwd5qoPG//gdBD6
=xyUt
-----END PGP SIGNATURE-----


