[speex-dev] Memory leak in denoiser + a few questions
Steve Kann
stevek at stevek.com
Tue Mar 30 15:05:05 PST 2004
Jean-Marc Valin wrote:
>>Reverberation suppression?
>>
>>
>
>Basically, it means that if you are in a room with lots of echo (long
>decay), I can reduce it a bit.
>
>
>
>>I guess this would help reduce local source echoes? I've never
>>_noticed_ that to be a problem in my use, but I would imagine that
>>using a notebook's built-in microphone, you'd get some echo off of the
>>screen and stuff [also from the whole room]..
>>
>>Most of these echoes aren't so bad, but I guess they might make the
>>encoding job harder. I'd sure rather see the echo cancellation
>>finished [not that I have any say on what you work on!!!].
>>
>>
>
>Well, I'm still looking for help :)
>
>
>
>>Here are the numbers I got running VAD on 655 seconds of audio (about
>>half is speech, half is absolute silence [0's]):
>>P3-600: 25 seconds
>>Athlon XP 1700+ (1.45 GHz): 5 seconds
>>P4 2.8 GHz: 8.8 seconds
>>
>>
>
>These numbers sound like a problem I had a while ago with the decoder.
>The VAD shouldn't take much CPU so I suspect there might be floating
>point underflows in some part, slowing down the Intel CPUs a lot (for
>some reason, the AMD CPUs seem to handle underflows faster).
>
>
Hmm, how can I find that out? How much CPU would you expect it to take?
I've been playing with oprofile, but I don't see it getting that
fine-grained.
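
One rough way to check for the underflow theory without a profiler would be to scan the float state for subnormal (denormal) values after processing a silent stretch. The sketch below uses C99's fpclassify; the helper name count_subnormals is made up for illustration and nothing like it exists in Speex.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not part of Speex): count how many entries of
   buf are subnormal, i.e. nonzero but smaller in magnitude than FLT_MIN. */
static size_t count_subnormals(const float *buf, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (fpclassify(buf[i]) == FP_SUBNORMAL)
            count++;
    }
    return count;
}
```

Calling this on arrays such as st->ps or st->gain2 after a run of all-zero input frames would show whether values are decaying into the subnormal range. This assumes the build does not enable flush-to-zero (for example via -ffast-math), which would hide them.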
>>Anyway, I think I might need to find a less computationally intensive
>>VAD solution for the conference. VAD is currently only used when
>>people connect via the PSTN, so they presumably have a decent SNR, and
>>I may be able to get away with an energy envelope type of thing,
>>without needing frequency domain analysis. But before I go and start
>>coding this, are there any simple optimizations that can be done to the
>>preprocessor when it is being used only for the VAD decision?
>>
>>
>
>Have you tried using the (less accurate) VAD that's in the codec itself
>(SPEEX_SET_VAD)?
>
>
I'll take a look at that. In this case [in the conferencing
application], I'm not actually using Speex encoding [these are PSTN
callers; I do VAD in clients when I control them], so I'd need to see if
I could rip it out of Speex to use it.
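
For reference, the energy-envelope alternative mentioned above could look roughly like the sketch below. It is only an illustration of the idea, not code from Speex or from the conference application; the struct, the function names, and the attack/release/threshold parameters are all assumptions.

```c
#include <stddef.h>

/* Hypothetical energy-envelope VAD state (names are illustrative). */
typedef struct {
    float envelope;   /* smoothed mean-square frame energy */
    float attack;     /* smoothing factor when energy rises, 0..1 */
    float release;    /* smoothing factor when energy falls, 0..1 */
    float threshold;  /* envelope level above which a frame counts as speech */
} EnergyVad;

/* Mean-square energy of one frame of 16-bit PCM. */
static float frame_energy(const short *pcm, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (double)pcm[i] * (double)pcm[i];
    return n ? (float)(sum / (double)n) : 0.0f;
}

/* Update the envelope with one frame; returns 1 for speech, 0 otherwise. */
static int energy_vad_update(EnergyVad *v, const short *pcm, size_t n)
{
    float e = frame_energy(pcm, n);
    float k = (e > v->envelope) ? v->attack : v->release;
    v->envelope += k * (e - v->envelope);
    return v->envelope > v->threshold;
}
```

A fast attack and slow release give a short built-in hold after speech stops. A fixed threshold only makes sense when the SNR is decent, as assumed for the PSTN callers.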
Also, I do have a couple of patches to the preprocessor to send along.
Basically, they make the start and continue probabilities parameters
that callers can set. We're currently using very low probabilities,
much lower than your defaults: VAD_START=0.05, VAD_CONTINUE=0.02. We
also have a 20-frame (2/5 sec) "tail" outside the preprocessor, which
keeps treating frames as speech after the detector has dropped out.
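
To make the start/continue/tail arrangement concrete, here is a minimal sketch of a two-threshold decision followed by an external tail. It is a simplification, not the exact logic in preprocess.c (which also keeps its own last_speech counter), and VadGate and its functions are hypothetical names.

```c
/* Hypothetical gate: hysteresis on the speech probability, plus a tail
   that keeps reporting speech for a few frames after the detector drops. */
typedef struct {
    float prob_start;     /* threshold to enter speech, e.g. 0.05 */
    float prob_continue;  /* lower threshold to stay in speech, e.g. 0.02 */
    int   in_speech;      /* 1 while the detector itself says speech */
    int   tail_frames;    /* length of the tail in frames, e.g. 20 */
    int   tail_left;      /* tail frames still to be emitted */
} VadGate;

static void vad_gate_init(VadGate *g, float start, float cont, int tail_frames)
{
    g->prob_start = start;
    g->prob_continue = cont;
    g->in_speech = 0;
    g->tail_frames = tail_frames;
    g->tail_left = 0;
}

/* Returns 1 if the frame should be treated as speech, 0 otherwise. */
static int vad_gate_update(VadGate *g, float speech_prob)
{
    /* Use the lower threshold once speech has started. */
    float threshold = g->in_speech ? g->prob_continue : g->prob_start;
    g->in_speech = speech_prob > threshold;
    if (g->in_speech) {
        g->tail_left = g->tail_frames;  /* re-arm the tail */
        return 1;
    }
    if (g->tail_left > 0) {
        g->tail_left--;                 /* detector dropped out; tail still running */
        return 1;
    }
    return 0;
}
```

With 20 ms frames, tail_frames = 20 corresponds to the 2/5 sec tail described above.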

Here's a patch:

==================================================
Diff for file preprocess.c, 1.2 -> 1.3
Index: preprocess.c
===================================================================
RCS file: /home/UniServ/dls/CVS/hms/app_conference/libspeex/preprocess.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -w -r1.2 -r1.3
--- preprocess.c 2003/11/07 23:40:23 1.2
+++ preprocess.c 2004/02/06 17:10:24 1.3
@@ -145,6 +145,9 @@
st->agc_level = 8000;
st->vad_enabled = 0;
+ st->speech_prob_start = SPEEX_PROB_START ;
+ st->speech_prob_continue = SPEEX_PROB_CONTINUE ;
+
st->frame = (float*)speex_alloc(2*N*sizeof(float));
st->ps = (float*)speex_alloc(N*sizeof(float));
st->gain2 = (float*)speex_alloc(N*sizeof(float));
@@ -435,12 +438,19 @@
st->speech_prob = p0/(1e-25+p1+p0);
/*fprintf (stderr, "%f %f %f ", tot_loudness, st->loudness2, st->speech_prob);*/
+ /* decide if frame is speech using speech probability settings */
+
/* if (st->speech_prob> .35 || (st->last_speech < 20 && st->speech_prob>.1)) */
- if (st->speech_prob> .20 || (st->last_speech < 20 && st->speech_prob>.05))
+ if (
+ st->speech_prob > st->speech_prob_start
+ || ( st->last_speech < 20 && st->speech_prob > st->speech_prob_continue )
+ )
{
is_speech = 1;
st->last_speech = 0;
- } else {
+ }
+ else
+ {
st->last_speech++;
if (st->last_speech<20)
is_speech = 1;
@@ -985,6 +995,30 @@
case SPEEX_PREPROCESS_GET_VAD:
(*(int*)ptr) = st->vad_enabled;
break;
+
+ case SPEEX_PREPROCESS_SET_PROB_START:
+ st->speech_prob_start = (*(float*)ptr) ;
+ if ( st->speech_prob_start > 1 )
+ st->speech_prob_start = st->speech_prob_start / 100 ;
+ if ( st->speech_prob_start > 1 || st->speech_prob_start < 0 )
+ st->speech_prob_start = SPEEX_PROB_START ;
+ break ;
+ case SPEEX_PREPROCESS_GET_PROB_START:
+ (*(float*)ptr) = st->speech_prob_start ;
+ break ;
+
+ case SPEEX_PREPROCESS_SET_PROB_CONTINUE:
+ st->speech_prob_continue = (*(float*)ptr) ;
+ if ( st->speech_prob_continue > 1 )
+ st->speech_prob_continue = st->speech_prob_continue / 100 ;
+ if ( st->speech_prob_continue > 1 || st->speech_prob_continue < 0 )
+ st->speech_prob_continue = SPEEX_PROB_CONTINUE ;
+ break ;
+ break ;
+ case SPEEX_PREPROCESS_GET_PROB_CONTINUE:
+ (*(float*)ptr) = st->speech_prob_continue ;
+ break ;
+
default:
speex_warning_int("Unknown speex_preprocess_ctl request: ", request);
return -1;
Diff for file speex_preprocess.h, 1.1 -> 1.2
Index: speex_preprocess.h
===================================================================
RCS file: /home/UniServ/dls/CVS/hms/app_conference/libspeex/speex_preprocess.h,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -w -r1.1 -r1.2
--- speex_preprocess.h 2003/11/06 21:57:59 1.1
+++ speex_preprocess.h 2004/02/06 17:10:24 1.2
@@ -49,6 +49,10 @@
float agc_level;
int vad_enabled;
+ // probabilities to check speech_prob against
+ float speech_prob_start ;
+ float speech_prob_continue ;
+
float *frame; /**< Processing frame (2*ps_size) */
float *ps; /**< Current power spectrum */
float *gain2; /**< Adjusted gains */
@@ -108,8 +112,9 @@
/** Used like the ioctl function to control the preprocessor parameters */
int speex_preprocess_ctl(SpeexPreprocessState *st, int request, void *ptr);
-
+#define SPEEX_PROB_START 0.35
+#define SPEEX_PROB_CONTINUE 0.1
#define SPEEX_PREPROCESS_SET_DENOISE 0
#define SPEEX_PREPROCESS_GET_DENOISE 1
@@ -122,6 +127,12 @@
#define SPEEX_PREPROCESS_SET_AGC_LEVEL 6
#define SPEEX_PREPROCESS_GET_AGC_LEVEL 7
+
+#define SPEEX_PREPROCESS_SET_PROB_START 8
+#define SPEEX_PREPROCESS_GET_PROB_START 9
+
+#define SPEEX_PREPROCESS_SET_PROB_CONTINUE 10
+#define SPEEX_PREPROCESS_GET_PROB_CONTINUE 11
#ifdef __cplusplus
}
==================================================
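
The range handling in the two SET requests can be read in isolation as the small function below. This is a standalone restatement for illustration only; normalize_prob is a hypothetical name and does not appear in the patch.

```c
/* Hypothetical standalone version of the SET handlers' range handling:
   values above 1 are read as percentages, and anything still outside
   [0, 1] falls back to the given default. */
static float normalize_prob(float value, float fallback)
{
    if (value > 1.0f)
        value = value / 100.0f;   /* e.g. 50 is taken to mean 50% */
    if (value > 1.0f || value < 0.0f)
        value = fallback;         /* e.g. 150 or -0.1 */
    return value;
}
```

One consequence of accepting both scales is that a value such as 1.5 is silently read as 1.5% (0.015) rather than rejected, so callers should pick one convention and stick to it.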
> Jean-Marc
>
>
>