[Speex-dev] Speech detection in preprocessor with echo

Wed Jun 22 01:21:59 PDT 2005

Just curious, why are you freezing agc_gain instead of freezing
st->loudness2?

Jean-Marc
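
The difference between the two freezing strategies can be illustrated with a toy model. This is a rough sketch, not Speex code: `agc_toy`, `TOY_RATE`, and both step functions are hypothetical, and the single exponential smoother is an assumed stand-in for however `loudness2` is really updated. It assumes that, with the patch quoted below, `loudness2` keeps adapting while only the gain is held.

```c
/* Toy model of the two freezing strategies; not the real Speex state. */
typedef struct {
    float agc_level;   /* target output level */
    float loudness2;   /* smoothed loudness estimate */
    float agc_gain;    /* gain applied on the previous frame */
} agc_toy;

#define TOY_RATE 0.2f  /* assumed smoothing rate, for illustration only */

/* Strategy A: the loudness estimate keeps adapting on every frame,
 * and only the gain is held while frozen. */
static float step_freeze_gain(agc_toy *st, float frame_loudness, int frozen)
{
    st->loudness2 = (1.0f - TOY_RATE) * st->loudness2
                  + TOY_RATE * frame_loudness;
    if (!frozen)
        st->agc_gain = st->agc_level / st->loudness2;
    return st->agc_gain;
}

/* Strategy B: the loudness estimate itself is held while frozen,
 * so the gain derived from it cannot drift. */
static float step_freeze_loudness(agc_toy *st, float frame_loudness, int frozen)
{
    if (!frozen)
        st->loudness2 = (1.0f - TOY_RATE) * st->loudness2
                      + TOY_RATE * frame_loudness;
    st->agc_gain = st->agc_level / st->loudness2;
    return st->agc_gain;
}
```

In this toy model, a long frozen stretch of quiet frames leaves strategy A with a much lower `loudness2`, so the gain jumps on the first unfrozen frame. Strategy B resumes from the estimate it had before the freeze.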

On Monday, 20 June 2005 at 14:40 -0400, Tom Grandgent wrote:
> I think you'll have to modify Speex to get the functionality you're 
> looking for.  I've made a few simple modifications to the AGC so that I 
> can 1) cap the amplification at a specified level and 2) enable and 
> disable adaptation, which lets me freeze the gain at a certain level 
> while speech is not detected.  It's mostly just a matter of doing this 
> at the end of speex_compute_agc():
> 
>    if (!st->agc_frozen)
>    {
>       /* Adapting: derive the gain from the target level, then cap it. */
>       agc_gain = st->agc_level/st->loudness2;
>       /*fprintf (stderr, "%f %f %f %f\n", active_bands, st->loudness, st->loudness2, agc_gain);*/
>       if (agc_gain>st->agc_max_gain)   /* was 200 */
>          agc_gain = st->agc_max_gain;  /* was 200 */
>    }
>    else
>    {
>       /* Frozen: reuse the gain from the previous frame. */
>       agc_gain = st->agc_gain;
>    }
>    st->agc_gain = agc_gain;
> 
> and adding a few items to speex_preprocess_ctl() and the state struct.  
> (I control these things at the application level.. you may wish to 
> control them from within the preprocessor if you're using the 
> preprocessor's VAD.)
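
For experimenting outside the Speex tree, the freeze-and-cap logic can be sketched as a standalone function. This is a minimal sketch under stated assumptions, not the actual patch: `agc_sketch` and `agc_sketch_update_gain()` are hypothetical names, only the field names follow the snippet above, and the guard against a non-positive `loudness2` is an addition that is not in the original.

```c
/* Hypothetical stand-in for the AGC-related fields of the preprocessor
 * state; this is not the real Speex struct. */
typedef struct {
    float agc_level;     /* target output level */
    float loudness2;     /* smoothed loudness estimate */
    float agc_max_gain;  /* upper bound on amplification */
    float agc_gain;      /* gain applied on the previous frame */
    int   agc_frozen;    /* nonzero: hold the previous gain */
} agc_sketch;

/* Returns the gain to apply to this frame and stores it in st->agc_gain. */
static float agc_sketch_update_gain(agc_sketch *st)
{
    /* The loudness2 > 0 check is an added safety assumption. */
    if (!st->agc_frozen && st->loudness2 > 0.0f) {
        float gain = st->agc_level / st->loudness2;
        if (gain > st->agc_max_gain)
            gain = st->agc_max_gain;
        st->agc_gain = gain;
    }
    /* Frozen, or no usable loudness estimate: keep the previous gain. */
    return st->agc_gain;
}
```

An application would set `agc_frozen` and `agc_max_gain` through whatever control path it adds. The request names for speex_preprocess_ctl() are not given in this thread, so none are shown here.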
> 
> Anyway, if you can figure out what's going on with the variables you 
> named, I'm sure you can make the necessary modifications to do what 
> you've asked for.  I think the preprocessor in general needs a little 
> tweaking like this to work well in various real-world situations, but 
> I'm not sure how much of this Jean-Marc wants to incorporate into 
> Speex vs. leave to application developers.
> 
> Tom
> 
> Thorvald Natvig <speex at natvig.com> wrote:
> > 
> > 
> > Echo cancellation works like a charm, but it seems to confuse the 
> > preprocessor a bit.
> > 
> > If listening to background music (properly fed through the echo 
> > canceller), the music is removed but the result is still detected as 
> > speech even though almost nothing but silence remains in the signal.
> > 
> > Also, the AGC keeps adjusting to the faint residue in the signal, meaning 
> > that sooner or later it amplifies the residue enough to be clearly 
> > audible on the other side. If I cough or say a word, the AGC readjusts and 
> > all is fine.
> > 
> > Looking at the members of the speex_preprocess structure, I see that 
> > during these long periods of "silence" (only the background music or 
> > only the other end talking while I shut up):
> > 
> > - Zlast (which looks like an SNR variable) is at 0.05-0.2, but jumps up
> >    above 1.0 if I actually say something.
> > - loudness2 keeps decreasing from the "normal" of ~6000 to 1000 or so, at
> >    which point the residual echo is amplified enough that it's clearly
> >    audible at the other end. If I say something, it adjusts.
> > - speech_prob is at 0.999 or 1.000 as long as the other end talks.
> > 
> > This is all with an up-to-date SVN version of Speex, and in a fairly noisy 
> > environment (it's hot, so I have the window open, so passing cars on the 
> > nearby road are quite audible, as is my air cleaner).
> > 
> > Is there something I can do to tune this away, a way to tell the AGC to 
> > never go that low, and a way to tell the speech detector that residual 
> > echo is not speech?
> > 
> > _______________________________________________
> > Speex-dev mailing list
> > Speex-dev at xiph.org
> > http://lists.xiph.org/mailman/listinfo/speex-dev
> 