[Speex-dev] Echo Canceller Memory Usage, Frame Size

Thu May 11 07:17:23 PDT 2006

(from thread Re: [Speex-dev] Speex echo canceller on TI C55 DSP, but this is 
a more general topic)

>> With these changes, the link was successful, using testecho.c with some
>> modifications for the C55 environment.  The code and data memory
>> requirements were a lot more than I had hoped (>20kbytes of dynamic data
>> memory for block size=128, tail length = 1024), and I will probably not
>> be able to fit it in the production build without some trimming.
>
> Yes, there may be a bit of memory reduction possible here. Of course,
> decreasing the tail length is also a rather easy way.
>
>> 2.  The testecho example uses a frame length and tail size that are
>> powers of two (128, 1024).  Are there any implications to using sizes
>> which are not powers of two?  It would be most convenient to use the
>> encoder frame size (160), and some multiple of that for the tail size.
>> How does the frame size affect performance (I understand that the tail
>> length determines what echo signals are cancelable)?
>
> Non powers of two will be a bit slower because of the FFT, but that's
> all. I made sure the echo canceller works with 160, precisely because
> it's the frame size used by Speex. Note that I don't recommend using
> frames more than 20 ms long (at any sampling rate).
>
>> 3.  Do you have any suggestions for code/data memory reduction for the
>> canceller, other than to make the tail length no longer than necessary
>> (this is a line echo canceller for a local phone, so I should be able to
>> keep it to 40ms)?  I was surprised by the size of the FFT code, but I
>> guess that it is doing much more than the radix2 version in the TI
>> library.
>
> The FFT code has more than just the radix two, so you can save there. It
> wasn't meant to be an optimized FFT, so if TI supplies you with one,
> it's probably a good idea to use it (that's what fft_wrap is for). Also,
> given that the memory use is almost directly proportional to the tail
> length, reducing that one to 40 ms will make a huge difference.

The overall allocated memory usage of the echo canceller is, in bytes:
4*frame_size*(27 + 5*ceiling(filter_length/frame_size)) + C

Where C = 420 on a TI C55 DSP (16 bit machine), and C = 760 on a TI C64 DSP 
(32 bit machine).

Where the tail length is an integer multiple of the frame size, this reduces 
to:
108*frame_size + 20*filter_length + C
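
As a rough sketch, the general formula above (the one with the ceiling term) 
can be written as a small C helper.  The function and parameter names are 
illustrative only and are not part of the Speex API; overhead_c stands for 
the constant C measured on each target.

```c
#include <stddef.h>

/* Ceiling of a / b for positive integers. */
size_t ceil_div(size_t a, size_t b)
{
    return (a + b - 1) / b;
}

/*
 * Estimated allocated memory of the echo canceller, in bytes, using
 *   4*frame_size*(27 + 5*ceiling(filter_length/frame_size)) + C
 * overhead_c is the constant C (420 on the C55, 760 on the C64,
 * as measured in this post).  Hypothetical helper, not a Speex function.
 */
size_t echo_mem_bytes(size_t frame_size, size_t filter_length,
                      size_t overhead_c)
{
    size_t blocks = ceil_div(filter_length, frame_size);
    return 4 * frame_size * (27 + 5 * blocks) + overhead_c;
}
```

With C = 420, echo_mem_bytes(128, 1024, 420) gives 34724 bytes and 
echo_mem_bytes(80, 320, 420) gives 15460 bytes.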

So, per sample, the memory usage is a much stronger function of the frame 
length (108 bytes per sample) than of the tail length (20 bytes per sample), 
although the tail length is normally several times the frame length, so both 
terms are pretty large.  I recall seeing in an earlier thread that 
performance is degraded when the filter length is too long because of added 
noise (but I cannot find that thread at the moment).  Of course, if the 
filter length is too short, then it will not be able to cancel all of the 
echo.  For the frame length, however, the tradeoffs are not so obvious.  You 
said the following last week in the thread "Re: speex echo cancellation 
limitations":

"I recommend using frame sizes of about 5-20 ms (samples depend on
sampling rate) and tail lengths of 100-200 ms for acoustic echo. Of
course, line echo would require less than that, but I've focused mainly
on acoustic echo, which is a harder problem (but line echo should work
as well)."
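
For reference, the millisecond figures convert to sample counts with a 
one-line calculation.  The helpers below are a minimal sketch with made-up 
names (they are not Speex functions), using integer arithmetic that rounds 
down.

```c
/* Number of samples in `ms` milliseconds at `rate_hz` (rounded down). */
unsigned long ms_to_samples(unsigned long ms, unsigned long rate_hz)
{
    return (ms * rate_hz) / 1000UL;
}

/* Duration in milliseconds of `samples` samples at `rate_hz` (rounded down). */
unsigned long samples_to_ms(unsigned long samples, unsigned long rate_hz)
{
    return (samples * 1000UL) / rate_hz;
}
```

At 8000 Hz, the 5-20 ms frame range corresponds to 40-160 samples, and a 
100-200 ms tail corresponds to 800-1600 samples.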

Could you elaborate a bit on the effect of changing the frame size, other 
than memory usage?  In the small test case I have been using, there is a 
20 ms delay between the speaker and microphone signals.  testecho.c has 
defaults of frame_len=128, filter_len=1024 (16 and 128 ms at 8000 Hz).  I 
ran this case, and also frame_len=80, filter_len=320 (10 and 40 ms at 
8000 Hz).  The second case attenuated the echo better, probably because the 
filter in the first case is much longer than the echo path delay.  How short 
a frame length would you recommend for an 8000 Hz sample rate?

- Jim