[Speex-dev] Speech detection in preprocessor with echo
Thorvald Natvig
speex at natvig.com
Mon Jun 20 10:04:39 PDT 2005
Echo cancellation works like a charm, but it seems to confuse the
preprocessor a bit.
If I listen to background music (properly fed through the echo
canceller), the music is removed, but the result is still detected as
speech even though little more than silence remains in the signal.
Also, the AGC keeps adjusting to the minute residue in the signal, meaning
that sooner or later it will amplify that residue enough to be clearly
audible on the other side. If I cough or say a word, the AGC readjusts and
all is fine.
Looking at the members of the speex_preprocess structure, I see that
during these long periods of "silence" (only the background music or
only the other end talking while I shut up):
- Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
above 1.0 if I actually say something.
- loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so, at
which point the residual echo is amplified enough that it's clearly
audible at the other end. If I say something, it adjusts.
- speech_prob is at 0.999 or 1.000 as long as the other end talks.
This is all with an up-to-date SVN version of speex, and in a fairly noisy
environment (it's hot, so I have the window open, so passing cars on the
nearby road are quite audible, as is my air cleaner).
Is there something I can do to tune this away, a way to tell the AGC to
never go that low, and a way to tell the speech detector that echo remains
are not speech?
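
To make the question concrete, here is a minimal sketch in C of the kind
of guard I have in mind. It is not Speex code and does not use the Speex
API: the agc_guard type, its fields and agc_guard_update() are
hypothetical names, and the floor (3000) and gate (0.5) are assumed values
picked only because they sit between the numbers observed above. The idea
is to freeze the loudness estimate while the SNR-like value is low, and to
clamp it so it can never sink far below the "normal" level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guard around an AGC loudness estimate (not Speex code). */
typedef struct {
    float loudness;     /* running loudness estimate, cf. loudness2 */
    float min_loudness; /* floor: the estimate never drops below this */
    float snr_gate;     /* adapt only when the SNR-like value exceeds this */
    float rate;         /* smoothing factor in (0, 1] */
} agc_guard;

/* Update the estimate for one frame and return the guarded value.
 * frame_level is the measured level of the frame, snr an SNR-like
 * value such as the Zlast member described above. */
static float agc_guard_update(agc_guard *g, float frame_level, float snr)
{
    assert(g != NULL);

    /* Freeze adaptation on frames that look like residual echo or noise. */
    if (snr > g->snr_gate) {
        g->loudness = (1.0f - g->rate) * g->loudness
                    + g->rate * frame_level;
    }

    /* Never let the estimate fall below the configured floor, so the
     * resulting gain stays bounded. */
    if (g->loudness < g->min_loudness) {
        g->loudness = g->min_loudness;
    }
    return g->loudness;
}
```

With a guard initialised as {6000.0f, 3000.0f, 0.5f, 0.1f}, frames whose
SNR-like value stays around 0.1 leave the estimate untouched, and even
frames that pass the gate cannot pull it below 3000.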