[Speex-dev] Speech switching in speakerphone?

Mon Jun 22 00:02:43 PDT 2009

Hi Jean-Marc

> Can you explain what you mean here by "speech switching" 

By speech switching I mean the adaptation of "gain2" depending on whether the near end or the far end is talking. What matters is that the timing is good and that the gain is low while the far end is talking and high while the near end is talking. By timing I mean that "gain2" should remain low until all far-end talk has finished, and that it should rise quickly when the near end starts talking.
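
To make the timing requirement concrete, here is a minimal sketch in C of a switching gain with a fast attack for near-end talk and a hold period after far-end talk. It is not Speex code: the `SwitchGain` struct, its fields, and both functions are hypothetical, and it assumes that reliable far-end and near-end activity flags are already available, which is exactly the hard part discussed in this thread.

```c
#include <stdbool.h>

/* Hypothetical switching-gain state; none of these names come from Speex. */
typedef struct {
    float gain;        /* current gain, between low_gain and 1.0 */
    float low_gain;    /* gain applied while the far end is talking */
    float attack;      /* per-frame step toward 1.0 when the near end talks (fast) */
    float release;     /* per-frame step used in all other cases (slower) */
    int   hold_frames; /* frames to keep the gain low after far-end talk stops */
    int   hold_left;   /* remaining hold frames */
} SwitchGain;

static void switch_gain_init(SwitchGain *s, float low_gain, float attack,
                             float release, int hold_frames)
{
    s->gain = 1.0f;
    s->low_gain = low_gain;
    s->attack = attack;
    s->release = release;
    s->hold_frames = hold_frames;
    s->hold_left = 0;
}

/* Update the gain for one frame, given assumed talk-activity flags. */
static float switch_gain_update(SwitchGain *s, bool far_active, bool near_active)
{
    /* The gain is held low during far-end talk and for hold_frames after it. */
    bool in_hold = far_active || s->hold_left > 0;

    if (far_active)
        s->hold_left = s->hold_frames;   /* restart the hold period */
    else if (s->hold_left > 0)
        s->hold_left--;

    float target, rate;
    if (near_active) {                   /* near-end talk opens the gain quickly */
        target = 1.0f;
        rate = s->attack;
    } else if (in_hold) {                /* far-end talk or its decay: stay low */
        target = s->low_gain;
        rate = s->release;
    } else {                             /* idle: drift back to unity gain */
        target = 1.0f;
        rate = s->release;
    }

    s->gain += rate * (target - s->gain);
    return s->gain;
}
```

In this sketch near-end activity overrides the hold, so the gain opens during double-talk; whether that is desirable depends on how trustworthy the near-end flag is.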

> There's also a parameter to control the maximum amount of suppression
> allowed:
> SPEEX_PREPROCESS_SET_NOISE_SUPPRESS : noise suppression
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS : echo suppression when there is no
> local talk
> SPEEX_PREPROCESS_SET_ECHO_SUPPRESS_ACTIVE: echo suppression in double-talk

Yes, I am aware of these parameters and am familiar with how they affect the gain. However, they do not affect the timing of the speech switching very much.

The important parameter for the speech switching is Pframe. As you know, Pframe is based on the SNR estimate. However, when the near-end signal is weak compared to the far-end signal (coming from the closely mounted speaker element), the SNR does not increase distinctly when the near end talks.
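
A small numeric illustration of why the SNR barely moves in that situation. This is a simplified power model, not the estimator Speex uses; the function name `power_rise` is hypothetical and both arguments are assumed to be linear powers of uncorrelated signals.

```c
/* Factor by which the total microphone power rises when near-end speech
 * is added on top of a background of residual echo and noise.
 * Both arguments are linear powers; the signals are assumed uncorrelated. */
static float power_rise(float background_power, float near_power)
{
    return (background_power + near_power) / background_power;
}
```

With near-end speech 10 dB below the background, `power_rise(100.0f, 10.0f)` gives a factor of about 1.1, roughly 0.4 dB. With the same speech 10 dB above the background, `power_rise(1.0f, 10.0f)` gives a factor of 11, roughly 10.4 dB. Any detector that relies on this rise has very little to work with in the first case.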

> Can you explain what ... problem you've encountered?

Our main problem is that we cannot rely on "gain2" being high when the near end is talking, which results in lost conversation in one direction. Some improvement can be made by modifying the Qcurve function, but the result is very sensitive to such changes.

A secondary problem is that the residual echo during the decay of far-end talk is not suppressed very well. This is probably caused by the strong echo coupling combined with a fairly reverberant room. We have been able to solve this by adding a weighting factor and some accumulation to residual_echo and echo_noise. This modification works perfectly for the far-end problem but worsens the main problem even more.
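
One possible shape of such a weighting-plus-accumulation step is sketched below in C. It is an illustration under assumptions, not the actual modification described above: the function `accumulate_echo` and the parameters `weight` and `decay` are hypothetical, and only the name `residual_echo` is taken from this post.

```c
#include <stddef.h>

/* Keep a decaying peak of the weighted residual echo estimate per bin.
 * acc[i] is updated in place: it follows weight * residual_echo[i] upward
 * immediately and falls by a factor of decay per frame otherwise. */
static void accumulate_echo(float *acc, const float *residual_echo, size_t n,
                            float weight, float decay)
{
    for (size_t i = 0; i < n; i++) {
        float weighted = weight * residual_echo[i];
        float decayed = decay * acc[i];
        acc[i] = weighted > decayed ? weighted : decayed;
    }
}
```

The trade-off is visible in the sketch: the closer `decay` is to 1, the longer the echo estimate lingers after far-end talk stops, which covers the reverberant tail but also keeps suppressing near-end speech that starts during that time.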

Best Regards
Johan