[Speex-dev] Acoustic echo cancellation

Tue Apr 19 23:25:37 PDT 2011

2011/4/18 Li Maoquan <limaoquan2000 at 126.com>

> >>>> Hi,
> >>>
> >>> I have a scenario in a mobile VoIP app that requires echo cancellation
> >>> but is somewhat different from what's described in the docs.
> >>>
> >>> Audio is received from and sent to the network at 8000Hz. Each packet
> >>> contains 160 samples worth a playback of 20ms.
> >>>
> >>> But the hardware requires aggregation for both playback and capture. So
> >>> for playback, I coalesce 4 packets in a buffer and queue them as a
> >>> larger buffer for playback.
> >>> On the send side, I read a large buffer (worth 4 packets) and send them
> >>> out over time 20ms apart.
> >>>
> >>> I tried using speex_echo_playback just when a 160-sample packet arrives
> >>> from the network, before coalescing and speex_echo_capture just before
> >>> a packet is sent out to the network but that doesn't seem to work
> >>> properly (doesn't cancel any echo).
> >>
> >> The most likely reason is that you didn't align the far-end and
> >> near-end samples.
> >> So the filter can not converge.
> >
> >Thanks for your response. Can you please explain what you mean by
> >align samples from near-end and far-end? And how is that usually
> >accomplished?
>
> You need to know the total delay introduced by the DAC buffer before the
> speaker, the ADC buffer after the microphone, and the acoustic path between
> speaker and microphone. Put simply, if you play an impulse and its first
> echo appears N samples later, then N is the delay between y (the echo in
> the near-end signal) and x (the far-end signal). You can then buffer the
> far-end signal for N-M samples before sending it to the AEC. M is a small
> number (such as 100) that keeps the filter from failing when the echo
> path drifts.
>
>
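One way to measure N along the lines described above is to record the far-end signal x and the microphone signal y while playing a test signal, then pick the lag with the strongest cross-correlation. The following is only a sketch under that assumption: estimate_delay and far_end_delay are hypothetical helpers, not Speex functions, and the search is a plain O(len * max_lag) loop meant for offline use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Speex): returns the lag, in samples,
 * at which the far-end signal x lines up best with the mic signal y.
 * Both buffers hold `len` samples; lags 0..max_lag are searched. */
size_t estimate_delay(const int16_t *x, const int16_t *y,
                      size_t len, size_t max_lag)
{
    assert(max_lag < len);
    size_t best_lag = 0;
    int64_t best_mag = -1;
    for (size_t lag = 0; lag <= max_lag; lag++) {
        int64_t corr = 0;
        /* y[i + lag] is compared with x[i]: the echo arrives `lag` samples late */
        for (size_t i = 0; i + lag < len; i++)
            corr += (int64_t)x[i] * (int64_t)y[i + lag];
        /* use the magnitude so a phase-inverted echo is still detected */
        int64_t mag = corr < 0 ? -corr : corr;
        if (mag > best_mag) {
            best_mag = mag;
            best_lag = lag;
        }
    }
    return best_lag;
}

/* Samples to hold the far-end signal back before the AEC: N - M,
 * clamped at zero when the margin M exceeds the measured delay N. */
size_t far_end_delay(size_t n, size_t m)
{
    return n > m ? n - m : 0;
}
```

With a measured N of 1000 samples and the suggested M of 100, far_end_delay gives 900 samples of buffering; the remaining 100 samples are left for the adaptive filter to absorb.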
Thanks again. I am trying to model the delay between the near-end and far-end
signals using a circular queue of N frames. Every time a frame is received
and queued for playback, it is also pushed onto this queue. Once the queue is
full, each frame read from the mic is echo-cancelled (speex_echo_cancellation)
against the oldest frame in the queue, so the recorded frame is cancelled
using a playback frame that is N frames old.
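For reference, the queue described above could look roughly like this. It is a minimal sketch, not code from the app: FrameDelay, frame_delay_init and frame_delay_push are hypothetical names, and MAX_FRAMES is an assumed upper bound on N.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 160   /* 20 ms at 8000 Hz, as in the thread */
#define MAX_FRAMES 64    /* assumed upper bound on the delay N, in frames */

/* Hypothetical fixed-delay queue of playback frames (not a Speex type). */
typedef struct {
    int16_t frames[MAX_FRAMES][FRAME_SIZE];
    int delay;   /* N: how many frames old the returned frame is */
    int count;   /* frames currently stored */
    int head;    /* index of the oldest stored frame */
} FrameDelay;

void frame_delay_init(FrameDelay *q, int delay_frames)
{
    assert(delay_frames >= 0 && delay_frames < MAX_FRAMES);
    memset(q, 0, sizeof *q);
    q->delay = delay_frames;
}

/* Push the frame just queued for playback and write into `out` the frame
 * that was pushed `delay` calls earlier. Until the queue has filled,
 * `out` is silence. */
void frame_delay_push(FrameDelay *q, const int16_t *in, int16_t *out)
{
    int tail = (q->head + q->count) % MAX_FRAMES;
    memcpy(q->frames[tail], in, sizeof q->frames[tail]);
    q->count++;
    if (q->count > q->delay) {
        memcpy(out, q->frames[q->head], sizeof q->frames[q->head]);
        q->head = (q->head + 1) % MAX_FRAMES;
        q->count--;
    } else {
        memset(out, 0, FRAME_SIZE * sizeof *out);
    }
}
```

The frame written to `out` is what would be passed as the `play` argument of speex_echo_cancellation(st, rec, play, out). Note that this only delays in whole frames, so the alignment it can reach is quantized to 160 samples.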

I have played with different values of N from 2 to 50 (320 to 8000 samples,
i.e. 40 ms to 1 s at 8000 Hz), attempting to align the input and output, but
the cancellation doesn't seem to work. The echo is as steady as ever.

Is this model correct, and is it expected to converge with the right value of
N? Or do I need some other adaptation to account for drift? Right now, it's
a black box for me. I am not sure how to get feedback from this system
to tune the AEC (and the delay parameters) correctly.
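One possible source of feedback, sketched here as an assumption rather than something Speex provides through this code: compare the energy of the mic frame going into the canceller with the energy of the frame coming out, while only the far end is talking. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of squared samples of one frame. */
int64_t frame_energy(const int16_t *frame, size_t len)
{
    int64_t e = 0;
    for (size_t i = 0; i < len; i++)
        e += (int64_t)frame[i] * frame[i];
    return e;
}

/* Hypothetical metric helper: ratio of mic energy to residual energy for
 * one frame. About 1.0 means nothing was removed; larger is better.
 * Taking 10*log10() of the result gives the reduction in dB. */
double echo_reduction_ratio(const int16_t *mic, const int16_t *cleaned,
                            size_t len)
{
    assert(len > 0);
    int64_t e_in = frame_energy(mic, len);
    int64_t e_out = frame_energy(cleaned, len);
    if (e_out == 0)
        e_out = 1;  /* avoid dividing by zero on a silent output */
    return (double)e_in / (double)e_out;
}
```

Logging this ratio per frame for each candidate N would show whether any setting makes the filter converge. The number is only meaningful when nobody is speaking at the near end, since near-end speech also ends up in the output energy.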

Also, I did not follow the use of "M" in your description above or how it
helps with drift. My queue stores whole frames (160 samples each), so a value
of 100 samples is less than one frame and seems too small.
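If the delay has to be expressed in samples rather than frames, as N-M suggests, the queue could hold individual samples instead. This is a sketch under that assumption: SampleDelay and its functions are hypothetical names, and DELAY_CAPACITY is an assumed maximum delay.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DELAY_CAPACITY 8192  /* assumed maximum delay, in samples */

/* Hypothetical sample-granular delay line (not a Speex type). */
typedef struct {
    int16_t buf[DELAY_CAPACITY];
    size_t delay;  /* N - M, in samples */
    size_t pos;    /* next write index */
} SampleDelay;

void sample_delay_init(SampleDelay *d, size_t delay_samples)
{
    assert(delay_samples < DELAY_CAPACITY);
    memset(d, 0, sizeof *d);
    d->delay = delay_samples;
}

/* Write `len` new far-end samples and read back the same number of
 * samples delayed by `delay`. Output is zero until enough history exists. */
void sample_delay_process(SampleDelay *d, const int16_t *in,
                          int16_t *out, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        d->buf[d->pos] = in[i];
        size_t rd = (d->pos + DELAY_CAPACITY - d->delay) % DELAY_CAPACITY;
        out[i] = d->buf[rd];
        d->pos = (d->pos + 1) % DELAY_CAPACITY;
    }
}
```

As an illustration with made-up numbers, a measured N of 350 samples and M of 100 would give a delay of 250 samples, which is one frame plus 90 samples and cannot be represented by a queue of whole frames.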

Btw, I am assuming that the Speex AEC API can be used even though I am not
using the Speex encoder/decoder.

> >>
> >>> So, in this scenario above, please recommend a good place to insert
> >>> speex_echo_playback and speex_echo_capture. Should it be just before
> >>> the read and write to hardware? In that case, should I use a larger
> >>> "frame size" of 160 samples x 4?
> >>
> >> Of course you can set frame size to 160*4. Otherwise you can feed
> >> samples 4 times to the AEC if you don't want to modify the frame size.
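The second option quoted above (feeding the AEC 4 times per hardware buffer) can be sketched as a simple loop. The cancel_fn callback type and subtract_canceller are hypothetical stand-ins so that the example compiles without Speex; in the real app the callback would wrap speex_echo_cancellation(st, rec, play, out) with the echo state created for a 160-sample frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SIZE 160
#define FRAMES_PER_HW_BUFFER 4

/* Hypothetical stand-in for the per-frame canceller call. */
typedef void (*cancel_fn)(void *ctx, const int16_t *rec,
                          const int16_t *play, int16_t *out);

/* Process one aggregated capture buffer (4 x 160 samples) in FRAME_SIZE
 * pieces. `play` must hold the matching, already delayed far-end samples.
 * Returns the number of frames processed. */
int cancel_hw_buffer(cancel_fn cancel, void *ctx, const int16_t *rec,
                     const int16_t *play, int16_t *out)
{
    assert(cancel != NULL);
    for (int f = 0; f < FRAMES_PER_HW_BUFFER; f++) {
        size_t off = (size_t)f * FRAME_SIZE;
        cancel(ctx, rec + off, play + off, out + off);
    }
    return FRAMES_PER_HW_BUFFER;
}

/* Dummy canceller for trying the loop without Speex: out = rec - play.
 * `ctx`, if not NULL, points to an int that counts the calls. */
void subtract_canceller(void *ctx, const int16_t *rec,
                        const int16_t *play, int16_t *out)
{
    int *calls = ctx;
    if (calls)
        (*calls)++;
    for (int i = 0; i < FRAME_SIZE; i++)
        out[i] = (int16_t)(rec[i] - play[i]);
}
```

Keeping the frame size at 160 means the same echo state works whether the data arrives as single packets or as aggregated hardware buffers; only the loop count changes.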
> >>
> >>>
> >> Thanks in advance,
> >> Daniel.