[speex-dev] Frozen upper spectrum in WB VBR CNG
tgrand at canvaslink.com
Sat May 3 12:48:36 PDT 2003
Jean-Marc Valin (jean-marc.valin at hermes.usherb.ca) wrote:
> > I've been using Speex in my voice-over-IP program on Win32, in
> > wideband (16kHz) mode. I just started using VBR recently and
> > have run into something that might be a problem within Speex:
> Are you turning on DTX in addition to VBR? Also, what version are you
> using? As of 1.0, DTX is no longer implied by VBR.
I'm using 1.0. DTX is definitely turned off. (I tried turning it on
and saw the bitrate go even lower, but the sound still sounds the
same as when DTX is off.)
> > If someone hasn't spoken for a little while, and the bitrate drops
> > to very low, sometimes the high half of the spectrum becomes frozen
> > with a looping sound. The bottom half of the spectrum is always
> > ok. (I have a linear spectral analysis view that makes it very easy
> > to see this behavior.) The frozen sound is usually much louder than
> > the background noise and sounds like a strange buzzing. It seems
> > like it could be a frame from recent speech activity. Occasionally
> > (like every few seconds) the sound will change to something
> > different, but still frozen. As soon as the speaker starts speaking
> > again, the problem goes away and doesn't come back until another
> > period of silence.
> The only way the spectrum can be "frozen" is when DTX is on and the VAD
> detects no speech. The idea is to reproduce the noise without
> transmitting any data. Can you send me some samples (original and .spx)
> so I can check what's happening?
Ok. I reproduced the problem using speexenc and speexdec (the Win32
binaries posted at http://www.speex.org/download.html) and have put
my sample input, the encoded .spx, and a .txt with commands used here:
You will hear some background noise from the room and me clicking the
mouse a few times. The first buzz is brought to a halt by the sound of
a soda can being opened. Then there is a second buzz later on. It
should be pretty noticeable.
This is kind of a dumb example but I think it will demonstrate the
problem sufficiently. If you want, I can make an example involving
conversation to prove that this problem is troublesome during typical
> > Now, to describe the audio data I'm sending through the codec...
> > There's always some background noise but it's kept at reasonably
> > low levels, and is what you would expect to pick up with a mic in
> > a typical room with a computer in it. I do a high-pass filter to
> > reduce bass below around 400Hz before encoding. (Without the filter,
> > if there is significant bass background noise, VBR struggles and
> > the quality is very poor.)
> Strange, the wideband mode is designed for the 50-7000 Hz band. Sure, if
> you lower the bit-rate a lot, there may be problems with the bass
I think the bass was just too strong. My friend had his mic sitting
on his computer case... :) At first I wondered why his voice sounded
so strange. Then I had him turn on the filter on his end and it was
perfect after that.
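
For illustration, here is a minimal sketch of a pre-encoding high-pass filter like the one described above. It assumes a simple first-order (one-pole) design, which is only 3 dB down at the cutoff and rolls off at about 6 dB per octave below it; the post does not say which filter type was actually used. The names hp_filter, hp_init and hp_process are hypothetical and not part of Speex.

```c
#include <stddef.h>

/* State for a first-order (one-pole) high-pass filter.
 * All names here are hypothetical; nothing below is a Speex API. */
typedef struct {
    float alpha;    /* coefficient derived from cutoff and sample rate */
    float prev_in;  /* previous input sample, x[n-1] */
    float prev_out; /* previous output sample, y[n-1] */
} hp_filter;

/* Initialise the filter. A cutoff of 400 Hz at a 16000 Hz sample rate
 * matches the figures mentioned in the post. */
void hp_init(hp_filter *f, float cutoff_hz, float sample_rate_hz)
{
    const float pi = 3.14159265f;
    float rc = 1.0f / (2.0f * pi * cutoff_hz); /* RC time constant */
    float dt = 1.0f / sample_rate_hz;          /* sample period */

    f->alpha = rc / (rc + dt);
    f->prev_in = 0.0f;
    f->prev_out = 0.0f;
}

/* Filter one frame in place using
 *   y[n] = alpha * (y[n-1] + x[n] - x[n-1]).
 * State is kept in *f, so consecutive frames can be passed one after
 * another without discontinuities at the frame boundaries. */
void hp_process(hp_filter *f, float *frame, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        float x = frame[i];
        float y = f->alpha * (f->prev_out + x - f->prev_in);
        f->prev_in = x;
        f->prev_out = y;
        frame[i] = y;
    }
}
```

One would call hp_init once per stream and then hp_process on each captured frame (320 samples for a 20 ms frame at 16 kHz) before handing it to the encoder. A steeper filter, such as a second-order biquad, would remove more of the bass at the cost of slightly more state.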
> I think most VoIP applications so far have used CBR because it's more
Yes, I was pretty happy with CBR but it is kind of wasteful to be
transmitting at full bitrate during silence. So, I decided to at
least make VBR an option in my program, since Speex supports it.