[Speex-dev] Speech detection in preprocessor with echo

Mon Jun 20 10:04:39 PDT 2005

Echo cancellation works like a charm, but it seems to confuse the 
preprocessor a bit.

If I listen to background music (properly fed through the echo 
canceller), the music is removed, but the result is still detected as 
speech even though almost nothing but silence remains in the signal.

Also, the AGC keeps adapting to the tiny residue left in the signal, so 
sooner or later it amplifies that residue enough to be clearly audible 
on the other side. If I cough or say a word, the AGC readjusts and all 
is fine.

Looking at the members of the speex_preprocess structure, I see the 
following during these long periods of "silence" (only the background 
music, or only the other end talking while I shut up):

- Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
   above 1.0 if I actually say something.
- loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so, at
   which point the residual echo is amplified enough that it's clearly
   audible at the other end. If I say something, it adjusts.
- speech_prob is at 0.999 or 1.000 as long as the other end talks.

This is all with an up-to-date SVN checkout of Speex, in a fairly noisy 
environment (it's hot, so I have the window open, and passing cars on the 
nearby road are quite audible, as is my air cleaner).

Is there something I can do to tune this away: a way to tell the AGC 
never to go that low, and a way to tell the speech detector that echo 
residue is not speech?
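
To make the request concrete, here is a minimal sketch of the behaviour I 
have in mind. It is not Speex code and does not claim to match how the 
preprocessor computes its gain: ToyAgc, toy_agc_update and all of its 
fields are hypothetical names, and the model (gain = target / loudness 
estimate) is only an assumption for illustration. It shows two things: the 
loudness estimate is only adapted when an SNR-like value (something like 
Zlast) is above a threshold, and the estimate is never allowed below a 
floor, which caps the gain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy AGC for illustration only; NOT the Speex implementation. */
typedef struct {
    float loudness;       /* running loudness estimate (arbitrary units) */
    float loudness_floor; /* the estimate is never allowed below this (> 0) */
    float target;         /* desired output level, same units as loudness */
    float snr_gate;       /* adapt only when the SNR-like value exceeds this */
    float rate;           /* smoothing factor in (0, 1] */
} ToyAgc;

/* Update the loudness estimate from one frame and return the gain to apply. */
float toy_agc_update(ToyAgc *agc, float frame_level, float snr_like)
{
    assert(agc != NULL);
    assert(agc->loudness_floor > 0.0f);

    /* Gate: frames that look like echo residue or noise do not move the
       estimate, so the gain cannot creep up while nobody is talking. */
    if (snr_like > agc->snr_gate) {
        agc->loudness += agc->rate * (frame_level - agc->loudness);
    }

    /* Floor: even when adaptation does run, the estimate stays at or above
       loudness_floor, so the gain never exceeds target / loudness_floor. */
    if (agc->loudness < agc->loudness_floor) {
        agc->loudness = agc->loudness_floor;
    }

    return agc->target / agc->loudness;
}
```

With a gate of 0.5, frames where the SNR-like value sits at 0.05-0.2 would 
leave the estimate untouched, while frames above 1.0 would still adapt it. 
The floor is a second safety net for frames that slip through the gate.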