[Speex-dev] Re: aec

Fri Nov 11 10:00:31 PST 2005

> To everyone on the list: do *NOT* attempt to do echo cancellation with
> signals sampled using different clocks. This will *NOT* work. Just a
> 0.1% difference between the two sampling rates (it's sometimes worse
> than that) means that the impulse response drifts by 8 samples every
> second. There's just no way to efficiently track this.  Or at least no
> way that doesn't involve something 100x more complex than what I have
> now. So if you want to use two different soundcards, you should either:
> 1) Have professional cards connected with a clock sync
> 2) Have an atomic clock in both cards
> 3) Forget about echo cancellation
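
For scale, the figure of 8 samples per second quoted above matches a 0.1%
mismatch at an 8 kHz sampling rate. The 8 kHz rate is an assumption here,
since the quote does not state it. A minimal sketch of the arithmetic, with
a hypothetical helper name:

```python
def drift_samples_per_second(rate_hz: int, mismatch_ppm: int) -> float:
    """Samples by which two nominally equal clocks diverge each second.

    mismatch_ppm is the relative clock difference in parts per million
    (0.1% is 1000 ppm).
    """
    return rate_hz * mismatch_ppm / 1_000_000


if __name__ == "__main__":
    # Assumed example rates; only the 0.1% mismatch comes from the quote.
    for rate in (8000, 16000, 48000):
        print(rate, "Hz:", drift_samples_per_second(rate, 1000), "samples/s")
```

The drift scales linearly with the sampling rate, so the same 0.1% mismatch
moves the impulse response faster at higher rates.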

On Win32, the proper solution is to use ASIO. Unfortunately, this
requires a high-end card.

However, I've had some limited success using two identical Sound Blaster
Live! cards: card A (mic) and card B (an output card which also samples
the speakers). On each event (the capture position of card A passing a
marker in the buffer):

- Fetch the hardware, sample-accurate sampling positions of cards A and B
   as PosA and PosB.
- Copy 320 samples from buffer of card A at offset WantPosA into "mic
   input".
- Calculate "negative offset" SampOfs = PosA-WantPosA.
- Copy 320 samples from card B at offset PosB-SampOfs into "speaker
   input".
- WantPosA += 320
- Wait for next event and repeat.

This means that even if the cards drift a little (and they do, even with 
identical cards), the drift is compensated for. Kind of.
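
The steps above can be sketched roughly as follows. This is only an
illustration, not the actual capture code: it assumes PosA, PosB and
WantPosA are absolute sample counters, that both capture buffers are
circular and indexed modulo their length, and that the frame starting at
WantPosA has already been fully captured. The helper names are hypothetical.

```python
FRAME = 320  # samples per AEC frame, as in the steps above


def read_ring(buf, pos, n):
    """Return n samples from circular buffer buf starting at position pos."""
    size = len(buf)
    # Python's % is non-negative for a positive modulus, so a negative
    # pos wraps around correctly.
    return [buf[(pos + i) % size] for i in range(n)]


def fetch_frame_pair(buf_a, buf_b, pos_a, pos_b, want_pos_a, frame=FRAME):
    """Fetch one mic frame and the matching speaker frame.

    pos_a, pos_b: current hardware sampling positions of cards A and B.
    want_pos_a:   start of the next mic frame to hand to the AEC.
    Returns (mic_input, speaker_input, next_want_pos_a).
    """
    # How far card A's hardware position is ahead of the frame we want.
    samp_ofs = pos_a - want_pos_a
    mic_input = read_ring(buf_a, want_pos_a, frame)
    # Read card B the same distance behind its own hardware position, so
    # any drift between the two clocks is absorbed on every event.
    speaker_input = read_ring(buf_b, pos_b - samp_ofs, frame)
    return mic_input, speaker_input, want_pos_a + frame
```

Because the offset is recomputed from both hardware positions on every
event, the error does not accumulate. It is only as good as the two
position reads are simultaneous, though.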

The downside of this "solution" is that it requires the ability to read 
the *hardware* sampling position at sample-accurate resolution, something 
few cards support (you can basically forget any kind of USB mic and most 
onboard soundcards).

It works, but every 20 minutes or so the AEC gets confused and
has to be manually reset. And while I don't have any hard data to back
it up, my impression is that users with ASIO (everything sampled on one
card with a synchronous clock) get much better echo cancellation.

PS: In case it wasn't made clear earlier, you don't need to sample the
speakers unless you want to also cancel out sounds made by other programs. 
If all you want is to cancel out the sound of the "remote end", just feed 
the AEC the last frame you decoded and sent to the speakers.
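
A minimal sketch of that wiring, under stated assumptions: `play` and
`cancel_echo` are hypothetical callables standing in for the audio output
and for whatever echo canceller is in use. They are not real Speex
functions.

```python
def handle_remote_frame(decoded, mic, play, cancel_echo):
    """Play a decoded remote frame and reuse it as the AEC reference.

    decoded:     frame just decoded from the remote end
    mic:         frame captured from the microphone
    play:        hypothetical callable that sends a frame to the speakers
    cancel_echo: hypothetical callable (mic, reference) -> cleaned frame
    """
    play(decoded)
    # The same frame that went to the speakers is the echo reference,
    # so nothing has to be sampled back from the output card.
    return cancel_echo(mic, decoded)


if __name__ == "__main__":
    played = []

    # Toy stand-in, not a real echo canceller: subtracts the reference.
    def toy_cancel(mic, ref):
        return [m - r for m, r in zip(mic, ref)]

    out = handle_remote_frame([1, 2, 3], [11, 12, 13], played.append, toy_cancel)
    print(out, played)
```

This only removes the remote party's audio. Sound from other programs never
passes through the decoder, so it is not in the reference.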