[tremor] Re: [PATCH] new mdct code, redundant tables eliminated
Nicolas Pitre
nico at cam.org
Fri Sep 13 12:23:10 PDT 2002
On Fri, 13 Sep 2002, Monty wrote:
> Patch is on mainline.
>
> It is not bit-identical decode to the original, but upon review, the
> negation shuffling would cause that. This appears to be simple
> integer rounding behavior.
Right, usually the difference is only on the least significant bit in the
final sample.
However the table reduction is a compromise since it won't give perfect
results for all cases. The problematic cases are when n==4096 and n==8192
in the first loop of the "rotate + window" section of the code.
Specifically, at that point, we end up with:
n step step>>1
---- ---- -------
... ... ...
2048 2 1
4096 1 0
8192 0 0
With n up to 2048 everything is perfect. However with n == 4096 or 8192 we
quit the realm of integers.
Affected code on line 408,409:
T =sin_lookup-(step>>1);
V =sin_lookup+2048+(step>>1);
and lines 414 and below, repeated 4 times:
T += step;
V -= step;
When n = 4096 the initial offset in the lookup table becomes wrong since it
would actually need to be 0.5. The current code therefore uses the
equivalent of cos((M_PI/(2*n))*(2*i+2) instead of cos((M_PI/(2*n))*(2*i+1),
with i incrementing from 0 to 511. This will certainly affect the output but
I can't tell how significantly.
Now with n == 8192 things are even worse. Not only the initial table offset
can't be achieved, but also the step value becomes 0 which explains the
little hack on line 424 and 436. We therefore have the equivalent of
cos((M_PI/(2*n))*(4*(i/2)+4) for i == 0 to 1023. This gives us a quite good
approximation nevertheless, but I'm not able to say if that would affect the
output significantly.
I don't know if those cases are that important since I never saw them in
testing and I don't know how to create streams with them. And I don't have
enough knowledge of the fundamental underlying algorithm to determine how
the approximation will affect the math.
The question is: do we care?
Of course the simple solution would involve quadrupling the lookup table,
but a table too sparse would also have negative impact on small cache
systems due to bad locality for the cases where n is small. The text size
reduction achievement would also be compromised.
My preferred alternative would involve linear interpolation within the
current table. This should give pretty good accuracy and not require many
more cycles (only some additions and shifts). However to do it efficiently
separate loops for n=4096 and n=8192 might be best.
But again, do we care?
<p>Nicolas
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'tremor-request at xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
More information about the Tremor
mailing list