[Speex-dev] General pre-processing prior to feeding sound to speex.

Wed Mar 2 20:49:26 PST 2005

Hi,

I have speex running as a part of a voice conferencing app. Well, one 
under development anyway.

I'm running VBR at quality 3 and get a "hissy-squelchy" background 
noise. This is fine, kinda, because the internal microphone in the 
laptop picks up hiss, the sound of the (actually very quiet) hard drive 
and generally speaking is of less than exemplary quality.

To help disguise this I've been implementing various 'amplifier' style 
sample tricks including:

* A pre-amplifier.
* A self-adjusting pre-amp that aimed for a target RMS (did an 
astounding job of amplifying the roadworks outside).
* A compressor.
* A compressor that "expanded" at the bottom end of the range.
* A noise floor implemented on a per-frame basis.

All, apart from the noise floor, were essentially useless. The noise 
floor had a kinda savage cut in/cut out thing going on. Understandably.
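
For anyone following along, here is a minimal sketch of two of the tricks above: the RMS-targeting pre-amp and the per-frame noise floor. It is an illustration, not the code from the app. The function names, the target RMS, the gain cap and the floor level are all made-up values, and the only Speex-related assumption is the 160-sample narrowband frame (20 ms at 8 kHz).

```python
import math

FRAME_SIZE = 160  # assumed: one 20 ms narrowband frame at 8 kHz


def frame_rms(frame):
    """Root-mean-square level of one frame of 16-bit samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def clip16(x):
    """Round to an integer and clamp to the signed 16-bit range."""
    return max(-32768, min(32767, int(round(x))))


def auto_gain(frame, target_rms=3000.0, max_gain=8.0):
    """Scale the frame towards target_rms (hypothetical value), capping the gain."""
    rms = frame_rms(frame)
    if rms == 0.0:
        return list(frame)
    gain = min(target_rms / rms, max_gain)
    return [clip16(s * gain) for s in frame]


def noise_gate(frame, floor_rms=200.0):
    """Hard per-frame gate: mute the whole frame if its RMS is below the floor."""
    if frame_rms(frame) < floor_rms:
        return [0] * len(frame)
    return list(frame)
```

Because the gate decides once per frame and switches between full level and silence, the output jumps at frame boundaries, which would explain the abrupt cut in/cut out. The pre-amp has the opposite problem: with no speech present it turns the gain up to the cap and boosts whatever background noise is there.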

I've also noticed in the mailing list archives various references to 
speex 'training' itself to its input - leading me to think that 
perhaps all of the above are wasted efforts because speex itself knows 
more about what it would like its input to look like.

So I guess my main question is: what are the best ways to pre-process 
sound prior to feeding it to speex?

I was also wondering about implementing a hiss removal filter of some 
description. To my eyes this would involve FFT'ing everything then 
removing any frequency components whose level falls below a certain 
threshold. Is this likely to work? And if so, presumably speex FFTs 
everything before going to work compressing for real - should I 'just' 
write some kind of hook into speex itself?
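
To make the idea concrete, a rough sketch of what I mean, reading "threshold" as a per-bin magnitude threshold (that reading is an assumption). It is plain Python with a textbook radix-2 FFT so that it stands alone; the function names and the threshold value are hypothetical, nothing here is part of speex, and the block length has to be a power of two. There is no windowing or overlap between blocks, so treat it as a toy.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT using the conjugate trick on top of fft()."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in y]


def spectral_gate(block, threshold):
    """Zero every bin whose magnitude is below threshold, then resynthesise."""
    spectrum = fft(block)
    kept = [b if abs(b) >= threshold else 0j for b in spectrum]
    return [v.real for v in ifft(kept)]


# Usage: a strong tone plus a weak high-frequency component standing in for hiss.
N = 64
tone = [1000.0 * math.sin(2 * math.pi * 4 * i / N) for i in range(N)]
hiss = [10.0 * math.sin(2 * math.pi * 20 * i / N) for i in range(N)]
cleaned = spectral_gate([t + h for t, h in zip(tone, hiss)], threshold=1000.0)
```

In this toy case the weak component sits in its own bins and is removed cleanly. Real hiss is spread across the whole spectrum, including the bins that carry speech, and a hard per-bin threshold applied block by block is known to leave warbling "musical noise" behind.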

I have some confusion about packing multiple frames into one packet, 
but that can be another thread...

Cheers,
Dave