Put simply: in a quiet room, you can play an impulse signal and then find its impulse response in the<br>microphone signal. For example, if the delay between the impulse and its response ranges from 500 to<br>3000 sample cycles, you can buffer the far-end signal by 0-300 cycles and set the filter length to 4000. This is what is meant by<br>aligning the far-end and near-end signals.<br><br>BTW: Speex AEC is sensitive to a mismatch between the capture and render sample rates, and most low-cost<br>computer soundcards have this problem.<div></div><br><pre><br>At&nbsp;2011-04-21&nbsp;03:00:01,speex-dev-request@xiph.org&nbsp;wrote:
&gt;&gt;&nbsp;&gt;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;I&nbsp;have&nbsp;a&nbsp;scenario&nbsp;in&nbsp;a&nbsp;mobile&nbsp;VoIP&nbsp;app&nbsp;that&nbsp;requires&nbsp;echo&nbsp;cancellation
&gt;&gt;&nbsp;but
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;is&nbsp;somewhat&nbsp;different&nbsp;from&nbsp;what's&nbsp;described&nbsp;in&nbsp;the&nbsp;docs.
&gt;&gt;&nbsp;&gt;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;Audio&nbsp;is&nbsp;received&nbsp;from&nbsp;and&nbsp;sent&nbsp;to&nbsp;the&nbsp;network&nbsp;at&nbsp;8000Hz.&nbsp;Each&nbsp;packet
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;contains&nbsp;160&nbsp;samples&nbsp;worth&nbsp;a&nbsp;playback&nbsp;of&nbsp;20ms.
&gt;&gt;&nbsp;&gt;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;But&nbsp;the&nbsp;hardware&nbsp;requires&nbsp;aggregation&nbsp;for&nbsp;both&nbsp;playback&nbsp;and&nbsp;capture.&nbsp;So
&gt;&gt;&nbsp;for
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;playback,&nbsp;I&nbsp;coalesce&nbsp;4&nbsp;packets&nbsp;in&nbsp;a&nbsp;buffer&nbsp;and&nbsp;queue&nbsp;them&nbsp;as&nbsp;a&nbsp;larger
&gt;&gt;&nbsp;buffer
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;for&nbsp;playback.
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;On&nbsp;the&nbsp;send&nbsp;side,&nbsp;I&nbsp;read&nbsp;a&nbsp;large&nbsp;buffer&nbsp;(worth&nbsp;4&nbsp;packets)&nbsp;and&nbsp;send&nbsp;them
&gt;&gt;&nbsp;out
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;over&nbsp;time&nbsp;20ms&nbsp;apart.
&gt;&gt;&nbsp;&gt;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;I&nbsp;tried&nbsp;using&nbsp;speex_echo_playback&nbsp;just&nbsp;when&nbsp;a&nbsp;160-sample&nbsp;packet&nbsp;arrives
&gt;&gt;&nbsp;from
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;the&nbsp;network,&nbsp;before&nbsp;coalescing&nbsp;and&nbsp;speex_echo_capture&nbsp;just&nbsp;before&nbsp;a
&gt;&gt;&nbsp;packet
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;is&nbsp;sent&nbsp;out&nbsp;to&nbsp;the&nbsp;network&nbsp;but&nbsp;that&nbsp;doesn't&nbsp;seem&nbsp;to&nbsp;work&nbsp;properly
&gt;&gt;&nbsp;(doesn't
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;cancel&nbsp;any&nbsp;echo).
&gt;&gt;&nbsp;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&nbsp;The&nbsp;most&nbsp;likely&nbsp;reason&nbsp;is&nbsp;that&nbsp;you&nbsp;didn't&nbsp;align&nbsp;the&nbsp;far-end&nbsp;and&nbsp;near-end
&gt;&gt;&nbsp;samples.
&gt;&gt;&nbsp;&gt;&gt;&nbsp;So&nbsp;the&nbsp;filter&nbsp;can&nbsp;not&nbsp;converge.
&gt;&gt;&nbsp;&gt;
&gt;&gt;&nbsp;&gt;Thanks&nbsp;for&nbsp;your&nbsp;response.&nbsp;Can&nbsp;you&nbsp;please&nbsp;explain&nbsp;what&nbsp;you&nbsp;mean&nbsp;by
&gt;&gt;&nbsp;&gt;align&nbsp;samples&nbsp;from&nbsp;near-end&nbsp;and&nbsp;far-end?&nbsp;And&nbsp;how&nbsp;is&nbsp;that&nbsp;usually
&gt;&gt;&nbsp;&gt;accomplished?
&gt;&gt;
&gt;&gt;&nbsp;You&nbsp;need&nbsp;to&nbsp;know&nbsp;the&nbsp;total&nbsp;delay&nbsp;caused&nbsp;by&nbsp;DAC&nbsp;buffer&nbsp;before&nbsp;speaker,&nbsp;ADC
&gt;&gt;&nbsp;buffer
&gt;&gt;&nbsp;after&nbsp;microphone&nbsp;and&nbsp;acoustic&nbsp;path&nbsp;between&nbsp;speaker&nbsp;and&nbsp;microphone.&nbsp;Simply
&gt;&gt;&nbsp;to&nbsp;say,
&gt;&gt;&nbsp;if&nbsp;you&nbsp;play&nbsp;an&nbsp;impulse&nbsp;signal&nbsp;and&nbsp;its&nbsp;first&nbsp;echo&nbsp;appears&nbsp;after&nbsp;N&nbsp;sample
&gt;&gt;&nbsp;cycles,
&gt;&gt;&nbsp;you&nbsp;can&nbsp;call&nbsp;N&nbsp;as&nbsp;the&nbsp;delay&nbsp;between&nbsp;y&nbsp;(echo&nbsp;in&nbsp;near-end&nbsp;signal)&nbsp;and&nbsp;x
&gt;&gt;&nbsp;(far-end
&gt;&gt;&nbsp;signal).&nbsp;Then&nbsp;you&nbsp;can&nbsp;buffer&nbsp;far-end&nbsp;signal&nbsp;for&nbsp;N-M&nbsp;cycles&nbsp;before&nbsp;sending
&gt;&gt;&nbsp;to&nbsp;AEC.
&gt;&gt;&nbsp;M&nbsp;is&nbsp;a&nbsp;little&nbsp;number&nbsp;(such&nbsp;as&nbsp;100)&nbsp;in&nbsp;order&nbsp;to&nbsp;avoid&nbsp;filter&nbsp;failure&nbsp;when
&gt;&gt;&nbsp;echo
&gt;&gt;&nbsp;path&nbsp;drifts.
&gt;&gt;
&gt;&gt;
&gt;Thanks&nbsp;again.&nbsp;I&nbsp;am&nbsp;trying&nbsp;to&nbsp;model&nbsp;the&nbsp;delay&nbsp;between&nbsp;the&nbsp;near&nbsp;and&nbsp;far&nbsp;end
&gt;signals&nbsp;using&nbsp;a&nbsp;circular&nbsp;queue&nbsp;of&nbsp;length&nbsp;n.&nbsp;Every&nbsp;time&nbsp;a&nbsp;frame&nbsp;is&nbsp;received
&gt;and&nbsp;queued&nbsp;for&nbsp;playback,&nbsp;it&nbsp;is&nbsp;also&nbsp;entered&nbsp;into&nbsp;the&nbsp;queue.&nbsp;Each&nbsp;frame&nbsp;being
&gt;read&nbsp;from&nbsp;the&nbsp;mic&nbsp;is&nbsp;echo-cancelled&nbsp;(&nbsp;speex_echo_cancellation&nbsp;)&nbsp;using&nbsp;the
&gt;oldest&nbsp;frame&nbsp;in&nbsp;the&nbsp;queue&nbsp;if&nbsp;the&nbsp;queue&nbsp;is&nbsp;filled&nbsp;up,&nbsp;thus&nbsp;I&nbsp;am&nbsp;cancelling
&gt;the&nbsp;recorded&nbsp;frame&nbsp;using&nbsp;a&nbsp;playback&nbsp;frame&nbsp;that&nbsp;is&nbsp;N-frames&nbsp;old.
&gt;
&gt;I&nbsp;have&nbsp;played&nbsp;with&nbsp;different&nbsp;values&nbsp;of&nbsp;N&nbsp;from&nbsp;2&nbsp;to&nbsp;50&nbsp;(320&nbsp;samples&nbsp;to&nbsp;8000
&gt;samples),&nbsp;attempting&nbsp;to&nbsp;align&nbsp;the&nbsp;input&nbsp;and&nbsp;output&nbsp;but&nbsp;the&nbsp;cancellation
&gt;doesn't&nbsp;seem&nbsp;to&nbsp;work.&nbsp;The&nbsp;echo&nbsp;is&nbsp;steady&nbsp;as&nbsp;ever.
&gt;
&gt;Is&nbsp;this&nbsp;model&nbsp;correct&nbsp;and&nbsp;expected&nbsp;to&nbsp;converge&nbsp;with&nbsp;a&nbsp;right&nbsp;value&nbsp;of&nbsp;"N"?&nbsp;Or
&gt;do&nbsp;I&nbsp;need&nbsp;some&nbsp;other&nbsp;adaptation&nbsp;to&nbsp;account&nbsp;for&nbsp;drifts&nbsp;here.&nbsp;Right&nbsp;now,&nbsp;it's
&gt;a&nbsp;black&nbsp;box&nbsp;for&nbsp;me.&nbsp;I&nbsp;am&nbsp;not&nbsp;sure&nbsp;how&nbsp;to&nbsp;get&nbsp;some&nbsp;feedback&nbsp;from&nbsp;this&nbsp;system
&gt;to&nbsp;tune&nbsp;the&nbsp;AEC&nbsp;(and&nbsp;the&nbsp;delay&nbsp;parameters)&nbsp;correctly.
&gt;
&gt;Also,&nbsp;I&nbsp;did&nbsp;not&nbsp;follow&nbsp;the&nbsp;use&nbsp;of&nbsp;"M"&nbsp;in&nbsp;your&nbsp;description&nbsp;above&nbsp;and&nbsp;how&nbsp;it
&gt;helps&nbsp;with&nbsp;drifts.&nbsp;My&nbsp;queue&nbsp;stores&nbsp;frames&nbsp;(160&nbsp;samples&nbsp;each).&nbsp;So&nbsp;a&nbsp;number&nbsp;of
&gt;100&nbsp;samples&nbsp;seems&nbsp;too&nbsp;small.
&gt;
&gt;Btw,&nbsp;I&nbsp;am&nbsp;assuming&nbsp;that&nbsp;speex&nbsp;AEC&nbsp;API&nbsp;can&nbsp;be&nbsp;used&nbsp;even&nbsp;though&nbsp;I&nbsp;am&nbsp;not&nbsp;using
&gt;the&nbsp;speex&nbsp;encoder/decoder.
&gt;
&gt;
&gt;
&gt;&gt;&nbsp;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;&gt;&nbsp;So,&nbsp;in&nbsp;this&nbsp;scenario&nbsp;above,&nbsp;please&nbsp;recommend&nbsp;a&nbsp;good&nbsp;place&nbsp;to&nbsp;insert
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;speex_echo_playback&nbsp;and&nbsp;speex_echo_capture.&nbsp;Should&nbsp;I&nbsp;be&nbsp;just&nbsp;before&nbsp;the
&gt;&gt;&nbsp;read
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;and&nbsp;write&nbsp;to&nbsp;hardware?&nbsp;In&nbsp;that&nbsp;case,&nbsp;should&nbsp;I&nbsp;use&nbsp;a&nbsp;larger&nbsp;"frame&nbsp;size"
&gt;&gt;&nbsp;of
&gt;&gt;&nbsp;&gt;&gt;&gt;&nbsp;160&nbsp;samples&nbsp;x&nbsp;4?
&gt;&gt;&nbsp;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&nbsp;Of&nbsp;course&nbsp;you&nbsp;can&nbsp;set&nbsp;frame&nbsp;size&nbsp;to&nbsp;160*4.&nbsp;Otherwise&nbsp;you&nbsp;can&nbsp;feed
&gt;&gt;&nbsp;samples&nbsp;4&nbsp;times
&gt;&gt;&nbsp;&gt;&gt;&nbsp;to&nbsp;the&nbsp;AEC&nbsp;if&nbsp;you&nbsp;don't&nbsp;want&nbsp;to&nbsp;modify&nbsp;the&nbsp;frame&nbsp;size.
&gt;&gt;&nbsp;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&gt;
&gt;&gt;&nbsp;&gt;&gt;&nbsp;Thanks&nbsp;in&nbsp;advance,
&gt;&gt;&nbsp;&gt;&gt;&nbsp;Daniel.

</pre>
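To make the alignment advice at the top of this message a bit more concrete, here is a rough sketch in C. It is only an illustration under stated assumptions, not code from Speex: `find_echo_delay`, `far_end_delay`, and the `DelayLine` type are hypothetical helpers, the threshold-based onset detection assumes a quiet room and an impulse played at sample 0, and the delay line works per sample rather than per 160-sample frame so that a small margin M is meaningful. The delayed far-end frame would then be what gets passed as the `play` argument of `speex_echo_cancellation`, which the sketch itself does not call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: index of the first recorded sample whose magnitude
 * exceeds `threshold`. Assuming the impulse was played at sample 0, this is
 * the measured delay N in samples. Returns -1 if no echo is found. */
int find_echo_delay(const short *rec, int len, int threshold)
{
    for (int i = 0; i < len; i++) {
        int mag = rec[i] < 0 ? -(int)rec[i] : rec[i];
        if (mag > threshold)
            return i;
    }
    return -1;
}

/* Far-end buffering of N - M samples, clamped so it is never negative. */
int far_end_delay(int n, int m)
{
    return n > m ? n - m : 0;
}

/* Sample-granular delay line (ring buffer) for the far-end signal. */
typedef struct {
    short *buf; /* the last `len` far-end samples */
    int len;    /* delay in samples */
    int pos;    /* index of the oldest stored sample */
} DelayLine;

/* Returns 0 on success, -1 if allocation fails. */
int delay_init(DelayLine *d, int delay)
{
    assert(delay >= 0);
    d->len = delay;
    d->pos = 0;
    d->buf = delay > 0 ? calloc((size_t)delay, sizeof(short)) : NULL;
    return (delay == 0 || d->buf != NULL) ? 0 : -1;
}

/* Pushes `frame` samples from `in` and writes the samples from d->len
 * samples earlier to `out` (zeros until the line has filled up). */
void delay_process(DelayLine *d, const short *in, short *out, int frame)
{
    for (int i = 0; i < frame; i++) {
        if (d->len == 0) {
            out[i] = in[i];
            continue;
        }
        short oldest = d->buf[d->pos];
        d->buf[d->pos] = in[i];
        out[i] = oldest;
        d->pos = (d->pos + 1) % d->len;
    }
}

void delay_free(DelayLine *d)
{
    free(d->buf);
    d->buf = NULL;
}
```

For instance, with a measured N of 500 samples and a margin M of 200, `far_end_delay(500, 200)` gives 300 samples of buffering, which is within the 0-300 range mentioned above. The filter length passed to `speex_echo_state_init` (4000 in that example) then has to cover whatever delay and echo tail remain after the buffering.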