<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Jean-Marc Valin wrote:

<blockquote cite="mid:463A7817.7070705@usherbrooke.ca" type="cite">

  <blockquote type="cite">

    <pre wrap="">   As you can tell, the AAGC integration with speex was really a classic

hack.  Instead of re-creating the hack, what's probably best here is to

integrate AAGC back into speex, and have a proper API.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

Agreed here. If you can come up with a clean patch to add that feature,

it's something I'd like to see in Speex.

  </pre>

</blockquote>

<br>

I hate to be a talker and not a do-er, but I won't be able to write

this myself.&nbsp; Someone on the iaxclient team could probably do it.<br>

<br>

<blockquote cite="mid:463A7817.7070705@usherbrooke.ca" type="cite">

  <blockquote type="cite">

    <pre wrap="">   For those of you just tuning in, what I call "AAGC" is an AGC

implementation where analog gains are manipulated instead, or in

addition to the AGC within speex (where levels are normalized via

multiplication).  The benefits of AAGC are:  (1) (most important),

reducing the analog gain can prevent clipping, which can't be done with

speex' current AGC, and (2) when raising levels, you get better quality

by raising the mixer levels, as opposed to just multiplying.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

It's a good thing to do, but you need to be really careful when doing

that because:

1) Any change in the analogue gain automatically de-adapts the echo

canceller so you only want to do that when really necessary (e.g.

clipping screws up the EC anyway)

2) The processing chain goes "AEC -&gt; noise suppressor -&gt; AGC", but for

the analog gain, you really want to measure the signal that goes into

the echo cancellation, not at the AGC. Otherwise, you risk increasing

the analog gain to a level that creates clipping before the AEC (even if

the signal at the AGC is lower).

Hmm, or does that mean the analogue AGC is actually completely

independent from the "real" AGC. Any thoughts?

  </pre>

</blockquote>

<br>

It's actually a bit more complicated, because it's more like "AEC -&gt;

Noise Suppressor -&gt; VAD -&gt; AGC", even if the VAD decision isn't

used by the consumer.&nbsp; The AGC needs the VAD decision so that it

isn't raising the gain of background noise (although it should probably

lower the gain whenever any signal is higher than its threshold).<br>

<br>

For AAGC, though, I guess one way to do this would be if you could

somehow "transport" the un-cancelled, un-noise-suppressed energy level

past the VAD decision, and then use that to determine what gain

adjustments to make.&nbsp; In this fashion, you'd be making your adjustments

based on the information you want:&nbsp; (a) the actual signal energy before

processing, and (b) VAD decision.&nbsp; <br>

<br>

You might be able to fake it well enough by putting AAGC before AEC,

and using the VAD decision from frame "n-1" when you're processing frame

"n".&nbsp; You'll probably have enough hysteresis and a bit of history in

the decision making process anyway that it might not matter.<br>
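<br>
To make the "frame n-1" idea concrete, here is a rough sketch of what I mean.&nbsp; Everything in it is made up for illustration (the aagc_pre_state struct, the function names, and the energy thresholds are assumptions, not anything in speex or iaxclient): it measures the mean-square energy of the raw frame before AEC, and only acts on it if the preprocessor called the previous frame speech.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state for an AAGC stage placed before the AEC. */
typedef struct {
    int prev_vad;       /* VAD decision from frame n-1 (1 = speech) */
    double raw_energy;  /* mean-square energy of the latest raw frame */
} aagc_pre_state;

/* Mean-square energy of a frame of 16-bit samples, before any processing. */
static double frame_energy(const short *pcm, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(pcm != NULL && n > 0);
    for (i = 0; i < n; i++)
        sum += (double)pcm[i] * (double)pcm[i];
    return sum / (double)n;
}

/* Call on the raw frame n, before AEC.
 * Returns -1 to lower the analog gain, +1 to raise it, 0 to leave it.
 * low_energy and high_energy are placeholder thresholds. */
static int aagc_pre_decide(aagc_pre_state *st, const short *pcm, size_t n,
                           double low_energy, double high_energy)
{
    st->raw_energy = frame_energy(pcm, n);
    if (!st->prev_vad)
        return 0;   /* frame n-1 was not speech: don't chase noise */
    if (st->raw_energy > high_energy)
        return -1;
    if (st->raw_energy < low_energy)
        return 1;
    return 0;
}

/* Call after the preprocessor has run on frame n, so that its VAD
 * decision is available when frame n+1 arrives. */
static void aagc_pre_update(aagc_pre_state *st, int vad)
{
    st->prev_vad = (vad != 0);
}
```

In real use you would still want the hysteresis from the existing code on top of this, so that a single loud frame doesn't move the mixer.<br>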

<br>

As for gain changes messing up the rest of the preprocessing chain:&nbsp;

they would seem to mess up the denoiser, the VAD logic, etc., as well as

the echo canceller.&nbsp; It might be possible (as I wrote earlier) to give

the filter chain some hints about what the effects of the changes are,

but it probably won't be perfect, because it would be difficult or

impossible to predict the exact response of gain adjustments, and the

delay after which they will actually take effect.<br>
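<br>
As a sketch of what such a hint could look like (purely hypothetical, nothing like gain_hint exists in speex): record the expected amplitude ratio and a guess at the delay, hold adaptation while the change settles, then rescale a stored estimate such as the noise floor.&nbsp; It assumes the mixer level maps linearly to amplitude, which real mixers often don't do, so treat the ratio as a rough guess.<br>

```c
#include <assert.h>

/* Hypothetical hint passed to the filter chain after a mixer change. */
typedef struct {
    double expected_scale;  /* expected linear amplitude ratio, new/old */
    int settle_frames;      /* frames left before the change is assumed live */
} gain_hint;

/* Record that the mixer moved from old_level to new_level (both in 0..1).
 * Assumes a linear mixer; delay_frames is the caller's guess at latency. */
static void gain_hint_set(gain_hint *h, double old_level, double new_level,
                          int delay_frames)
{
    assert(old_level > 0.0 && new_level > 0.0 && delay_frames >= 0);
    h->expected_scale = new_level / old_level;
    h->settle_frames = delay_frames;
}

/* Call once per frame. Returns 1 while the change is still settling
 * (the caller should hold adaptation), 0 otherwise. Once settled, the
 * stored noise floor is rescaled exactly once. */
static int gain_hint_tick(gain_hint *h, double *noise_floor)
{
    if (h->settle_frames > 0) {
        h->settle_frames--;
        return 1;
    }
    if (h->expected_scale != 1.0) {
        *noise_floor *= h->expected_scale;
        h->expected_scale = 1.0;   /* hint consumed */
    }
    return 0;
}
```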

<br>

The AAGC mechanism I implemented, though, was good enough, for some

measure of good enough.&nbsp; On every 64th frame in which it strongly

detected speech, it checked speex' loudness parameter and, if that was

above or below certain thresholds, made a step-wise adjustment (10% or

20%) to the mixer level.&nbsp; If you use

this mechanism, and pre-set the mixers to be at about 80%, it

relatively quickly gets the gain into a reasonable place once speech is

detected.&nbsp; It would probably work just as well when EC is involved, as

long as EC and VAD work together well enough such that you don't get

VAD false-positives from echo.&nbsp; The target "loudness" range here is

4000 &lt;-&gt; 8000, but it could be widened a bit to avoid more

adjustments.<br>

<br>

&lt;snip&gt;<br>

&nbsp; /* Analog AGC: Bring speex AGC gain out to mixer, with lots of

hysteresis */<br>

&nbsp;&nbsp;&nbsp; /* use a higher continuation threshold for AAGC than for VAD itself

*/<br>

&nbsp;&nbsp;&nbsp; if(!silent &amp;&amp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (iaxc_silence_threshold != 0) &amp;&amp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (iaxc_filters &amp; IAXC_FILTER_AGC) &amp;&amp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (iaxc_filters &amp; IAXC_FILTER_AAGC) &amp;&amp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (st-&gt;speech_prob &gt; .20)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; static int i;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; double level;<br>

<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i++;<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if((i&amp;0x3f) == 0) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; float loudness = st-&gt;loudness2;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if((loudness &gt; 8000) || (loudness &lt; 4000)) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; level =&nbsp; iaxc_input_level_get();<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* fprintf(stderr, "loudness = %f, level = %f\n", loudness,

level); */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* lower quickly if we're really too hot */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if((loudness &gt; 16000) &amp;&amp; (level &gt; 0.5)) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* fprintf(stderr, "lowering quickly level\n"); */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iaxc_input_level_set(level - 0.2);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* lower less quickly if we're a bit too hot */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else if((loudness &gt; 8000) &amp;&amp; (level &gt;=

0.15)) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* fprintf(stderr, "lowering slowly level\n"); */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iaxc_input_level_set(level - 0.1);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* raise slowly if we're cold */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else if((loudness &lt; 4000) &amp;&amp; (level &lt;= 0.9))

{<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* fprintf(stderr, "raising level\n"); */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; iaxc_input_level_set(level + 0.1);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp; }<br>

<br>

&lt;/snip&gt;<br>

<br>

<br>

<br>

<blockquote cite="mid:463A7817.7070705@usherbrooke.ca" type="cite">


  <blockquote type="cite">

    <pre wrap="">(1) is really the most important reason.

   Now, the API I'd envision for this would be one where you could tell

speex that you would like to use AAGC, and then register some callbacks

that speex_preprocess() could call to query or set the input or mixer

level.  Further, a more intelligent implementation within speex could

consider the requested changes in the rest of the preprocessor chain

(i.e. it would know that if it asked for a 3dB increase in input gain,

to expect that input levels would rise by 3dB within a few frames).  The

hacky implementation I did inside of iaxclient gave speex no such

information.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

These are probably things we'll want to consider once we decide on where to

put the AAGC in the first place.

        Jean-Marc

  </pre>

</blockquote>

<br>
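For the callback idea, something along these lines is roughly what I have in mind.&nbsp; This is only a sketch: the aagc_callbacks struct and aagc_step() are invented names, not an existing or proposed speex API.&nbsp; The thresholds are the ones from the iaxclient snippet, and the fake_get/fake_set pair just stands in for a real mixer backend.<br>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback pair an application would register. */
typedef struct {
    double (*get_level)(void *user);            /* current mixer level, 0..1 */
    void (*set_level)(void *user, double lvl);  /* request a new mixer level */
    void *user;                                 /* opaque application data */
} aagc_callbacks;

/* One adjustment step, using the thresholds from the iaxclient code.
 * Returns the change that was requested (0.0 if none). */
static double aagc_step(const aagc_callbacks *cb, double loudness)
{
    double level;
    double delta = 0.0;

    assert(cb != NULL && cb->get_level != NULL && cb->set_level != NULL);
    level = cb->get_level(cb->user);

    if (loudness > 16000 && level > 0.5)
        delta = -0.2;   /* lower quickly if we're really too hot */
    else if (loudness > 8000 && level >= 0.15)
        delta = -0.1;   /* lower less quickly if we're a bit too hot */
    else if (loudness < 4000 && level <= 0.9)
        delta = 0.1;    /* raise slowly if we're cold */

    if (delta != 0.0)
        cb->set_level(cb->user, level + delta);
    return delta;
}

/* Example backend: a fake mixer held in memory, for illustration only. */
static double fake_get(void *user)
{
    return *(double *)user;
}

static void fake_set(void *user, double lvl)
{
    *(double *)user = lvl;
}
```

The return value is the piece the hacky version never had: the preprocessor would know what change it asked for and could account for it in the rest of the chain.<br>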

</body>

</html>