[Speex-dev] Leaking audio and AGC/VAD

Tom Grandgent tgrand at canvaslink.com
Fri Feb 3 07:55:42 PST 2006


Hi,

The leakage problem you describe is very, very common and you will need 
to do something to address it.  I modified the version of Speex I use to 
implement an adjustable max gain.  If you look at speex_compute_agc in 
preprocess.c, you will see:

   if (agc_gain>200)
      agc_gain = 200;

This max of 200 is usually more than enough to amplify leakage which 
occurs either in the sound hardware or in a headset to the point where 
it is as loud as normal speech.

You can replace this max of 200 with a configurable max gain, which is 
what I've done.  But then you lose the 'zero user configuration'.  And 
getting people to set a correct max gain can be problematic.
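
As a rough illustration of that change (a minimal sketch, not the actual patch): the `AgcLimits` struct and `clamp_agc_gain` function below are hypothetical names and are not part of Speex. The only thing taken from preprocess.c is the idea of clamping `agc_gain` to a ceiling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical settings struct; not part of the Speex API. */
typedef struct {
    float max_gain;   /* ceiling that replaces the hard-coded 200 */
} AgcLimits;

/* Clamp a computed AGC gain to the configured ceiling. */
static float clamp_agc_gain(float agc_gain, const AgcLimits *limits)
{
    assert(limits != NULL && limits->max_gain > 0.0f);
    if (agc_gain > limits->max_gain)
        agc_gain = limits->max_gain;
    return agc_gain;
}
```

With a ceiling of, say, 16, a requested gain of 150 would come back as 16, while a requested gain of 4 would pass through unchanged. The right ceiling depends on the hardware, which is exactly why it is hard to pick one without asking the user.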

It might be better to instead adjust the adaptation rate (?) of the AGC.  
For example, by default, it will happily jump between 0.9x and 50x given 
the right input signal.  And this is desirable for some situations, such 
as people talking in a meeting room where one person is shouting right 
next to the mic and another person is speaking normally from across the 
table.  But it's not desirable when the high amplification is picking up 
leakage or background sounds like typing.  I don't know how to adjust 
the Speex AGC's adaptation rate, or if it's practical to do so.
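
One generic way to limit how fast a gain can move is to bound its change per frame. The sketch below is an assumption about how that could look, not how the Speex AGC works internally. `GainSmoother`, `smooth_gain`, and the per-frame factors are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical smoother state; not part of the Speex API. */
typedef struct {
    float gain;       /* gain currently applied to the signal */
    float max_up;     /* largest allowed per-frame increase factor (>= 1) */
    float max_down;   /* smallest allowed per-frame decrease factor (0..1] */
} GainSmoother;

/* Move the applied gain toward `target`, but by no more than one
 * bounded step per frame. Returns the gain to apply to this frame. */
static float smooth_gain(GainSmoother *s, float target)
{
    assert(s != NULL && s->gain > 0.0f);
    assert(s->max_up >= 1.0f && s->max_down > 0.0f && s->max_down <= 1.0f);

    float upper = s->gain * s->max_up;     /* highest gain allowed this frame */
    float lower = s->gain * s->max_down;   /* lowest gain allowed this frame */

    if (target > upper)
        target = upper;
    else if (target < lower)
        target = lower;

    s->gain = target;
    return s->gain;
}
```

Choosing a small `max_up` and a more generous `max_down` lets the gain drop quickly when someone shouts but climb only slowly during pauses. Note that this only delays the amplification of sustained leakage; it does not prevent it, so it would probably still need to be combined with a gain ceiling.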

Zero (or at least minimal) user configuration AGC that covers all common 
cases is a difficult but important problem to solve.  In my opinion, 
Speex AGC is a nice starting point but you have to do some work to get 
the results you're looking for.

Of course, AGC is only part of the picture when you're using VAD as well. 
But, in my experience, no VAD system can reliably detect desirable sound 
when AGC is over-amplifying undesirable sound.

Tom

njt at home.se wrote:
> 
> Hi,
> I am working on a VoIP implementation where one of the key design goals
> is zero user configuration, similar to Skype.
> What I've come to notice is that my soundcard (NForce4 based) leaks
> audio from the playback path to the recording path. 
> This is probably not unique to my hardware and will happen for
> some percentage of all users of my software.
> (All of this is Win32)
> 
> What happens is that when no one is talking, I think the Speex preprocessor
> starts to treat the very quiet leak from my playback path as
> someone talking.
> I am using AGC and VAD and my guess would be that AGC amplifies the leak
> to such a level that VAD starts triggering.
> This is rather annoying for the user if the software is used while
> playing MP3s (loud) or playing computer games.
> 
> Would it be possible to implement some sort of a threshold to limit what
> Speex classifies as "usable sound" or something?
> 
> I would also like to thank the entire Speex effort for this great piece
> of software, and especially Jean-Marc Valin.
> 
> //Regards Jonas Tärnström, Sweden.


