[Speex-dev] Speech detection in preprocessor with echo
Jean-Marc Valin
Jean-Marc.Valin at USherbrooke.ca
Wed Jun 22 01:21:59 PDT 2005
Just curious, why are you freezing agc_gain instead of freezing
st->loudness2?
Jean-Marc
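
One way to picture the difference behind this question is a toy model, not the actual Speex code. Everything here is an assumption made for illustration: the `ToyAgc` struct, the `FreezeMode` enum, `toy_agc_step()`, and the fixed smoothing rate of 0.5. Only the idea of dividing a target level by a smoothed loudness estimate comes from the patch quoted below.

```c
#include <assert.h>

/* Hypothetical toy state; field roles loosely follow the quoted patch. */
typedef struct {
   float level;    /* target level (plays the role of agc_level) */
   float loudness; /* smoothed loudness estimate (plays the role of loudness2) */
   float gain;     /* gain currently applied (plays the role of agc_gain) */
} ToyAgc;

typedef enum { FREEZE_NONE, FREEZE_GAIN, FREEZE_LOUDNESS } FreezeMode;

/* One adaptation step. The smoothing rate is an assumption, not Speex's. */
static float toy_agc_step(ToyAgc *st, float frame_loudness, FreezeMode mode)
{
   const float rate = 0.5f;

   /* Freezing the loudness estimate stops adaptation at its source. */
   if (mode != FREEZE_LOUDNESS)
      st->loudness = (1.0f - rate)*st->loudness + rate*frame_loudness;

   assert(st->loudness > 0.0f);

   /* Freezing the gain holds the output, but the estimate above keeps moving. */
   if (mode != FREEZE_GAIN)
      st->gain = st->level/st->loudness;

   return st->gain;
}
```

In this toy model, freezing only the gain lets the loudness estimate keep drifting during silence, so the gain jumps as soon as adaptation resumes. Freezing the estimate holds both values, so adaptation resumes from where it stopped. Whether the real preprocessor behaves the same way depends on details this sketch leaves out.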
On Monday, 20 June 2005 at 14:40 -0400, Tom Grandgent wrote:
> I think you'll have to modify Speex to get the functionality you're
> looking for. I've made a few simple modifications to the AGC to 1) prevent
> it from exceeding a specified level of amplification and 2) enable
> and disable adaptation, so I can freeze it at a certain level while
> speech is not detected. It's mostly just a matter of doing this at the
> end of speex_compute_agc():
>
> if (!st->agc_frozen)
> {
>    agc_gain = st->agc_level/st->loudness2;
>    /*fprintf (stderr, "%f %f %f %f\n", active_bands, st->loudness, st->loudness2, agc_gain);*/
>    if (agc_gain>st->agc_max_gain) /* was 200 */
>       agc_gain = st->agc_max_gain; /* was 200 */
> }
> else
>    agc_gain = st->agc_gain;
> st->agc_gain = agc_gain;
>
> and adding a few items to speex_preprocess_ctl() and the state struct.
> (I control these things at the application level; you may wish to
> control them from within the preprocessor if you're using the
> preprocessor's VAD.)
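
The clamp-and-freeze logic quoted above can be tried outside Speex with a minimal standalone sketch. The `AgcState` struct and the `update_agc_gain()` helper are hypothetical stand-ins, not part of the Speex API. The field names are taken from the quoted snippet, and the positivity check on `loudness2` is an added assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for the AGC-related fields of the preprocessor state. */
typedef struct {
   float agc_level;    /* target output level */
   float loudness2;    /* smoothed loudness estimate */
   float agc_gain;     /* gain chosen on the previous call */
   float agc_max_gain; /* cap on amplification (hard-coded as 200 before the patch) */
   int   agc_frozen;   /* nonzero: hold the gain at its previous value */
} AgcState;

/* Hypothetical helper mirroring the logic at the end of the quoted snippet. */
static float update_agc_gain(AgcState *st)
{
   float agc_gain;

   if (!st->agc_frozen)
   {
      /* Added guard (assumption): avoid dividing by a non-positive estimate. */
      assert(st->loudness2 > 0.0f);
      agc_gain = st->agc_level/st->loudness2;
      if (agc_gain > st->agc_max_gain)
         agc_gain = st->agc_max_gain;
   }
   else
   {
      /* Frozen: reuse the last gain regardless of loudness2. */
      agc_gain = st->agc_gain;
   }
   st->agc_gain = agc_gain;
   return agc_gain;
}
```

With `agc_level` at 8000, `loudness2` at 4000 and `agc_max_gain` at 4, the helper returns 2. Setting `agc_frozen` and letting `loudness2` fall to 1000 still returns 2. Clearing the flag then gives 8000/1000 = 8, which the cap reduces to 4.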
>
> Anyway, if you can figure out what's going on with the variables you
> named, I'm sure you can make the necessary modifications to do what
> you've asked for. I think the preprocessor in general needs a little
> tweaking like this to work well in various real-world situations, but
> I'm not sure how much of this Jean-Marc wants to incorporate into
> Speex vs. leave to application developers.
>
> Tom
>
> Thorvald Natvig <speex at natvig.com> wrote:
> >
> >
> > Echo cancellation works like a charm, but it seems to confuse the
> > preprocessor a bit.
> >
> > When I listen to background music (properly fed through the echo
> > canceller), the music is removed, but the result is still detected as
> > speech even though almost nothing but silence remains in the signal.
> >
> > Also, the AGC keeps adjusting to the faint residue in the signal, meaning
> > that sooner or later it will amplify the residue enough that it's clearly
> > audible on the other side. If I cough or say a word, the AGC readjusts and
> > all is fine.
> >
> > Looking at the members of the speex_preprocess structure, I see that
> > during these long periods of "silence" (only the background music or
> > only the other end talking while I shut up):
> >
> > - Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
> > above 1.0 if I actually say something.
> > - loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so, at
> > which point the residual echo is amplified enough that it's clearly
> > audible at the other end. If I say something, it adjusts.
> > - speech_prob is at 0.999 or 1.000 as long as the other end talks.
> >
> > This is all with an up-to-date SVN version of Speex, and in a fairly noisy
> > environment (it's hot, so I have the window open, so passing cars on the
> > nearby road are quite audible, as is my air cleaner).
> >
> > Is there something I can do to tune this away, a way to tell the AGC to
> > never go that low, and a way to tell the speech detector that echo remains
> > are not speech?
> >
> > _______________________________________________
> > Speex-dev mailing list
> > Speex-dev at xiph.org
> > http://lists.xiph.org/mailman/listinfo/speex-dev
>