[speex-dev] de-essing into speex?

Olav olav at bogus.net
Sat Dec 6 10:25:18 PST 2003

> Date: Fri, 05 Dec 2003 13:22:53 -0500
> From: Jean-Marc Valin <Jean-Marc.Valin at USherbrooke.ca>
> I think I see what you mean, though I haven't been able to listen to
> your wma file (not everyone has a wma decoder). The problem probably
> only lies in the VBR tuning for wideband which hasn't received much work
> yet. One way to check that is to encode in constant bit-rate and see
> what the results are. I'm pretty sure you'll notice the problem appears
> only at (CBR) quality 5 or below.
> 	Jean-Marc

i have done further testing, and even at constant bitrates, wma is far
superior, from an "overall listening experience" point of view.

it seems that:

  speex maintains the crispness/treble of the recording, but at the
  cost of computer-ish background noise, like tuning into russian
  radio stations on the am band, if you get my drift. ess sounds are
  particularly vulnerable to this.

  wma removes all high-freq/treble of the voice, and makes it "round"
  and dark, but there is no evidence of the computer-bleep bleep
  effects made by speex, the human voices sound like human voices,
  only blunter, in a way. ess sounds do not become embarassing.

to get the same file size (or bit rate) from spx as from wma, the spx
quality has to go so low (quality 2) that the result is not usable.
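
as a rough sketch of why file size and bit rate are interchangeable in
this comparison: at a constant bit rate, size is just bit rate times
duration. the function names and the 16000 bit/s, 60 second figures
below are made up for illustration, not measurements from the files
discussed here, and container overhead is ignored.

```python
def encoded_size_bytes(bitrate_bps, duration_s):
    """Approximate payload size of a constant-bit-rate stream.

    Assumption: container/header overhead is ignored.
    """
    return round(bitrate_bps * duration_s / 8)


def bitrate_for_size(size_bytes, duration_s):
    """Average bit rate (bit/s) implied by a file size and a duration."""
    return size_bytes * 8 / duration_s


# Hypothetical example: one minute of speech at 16 kbit/s.
size = encoded_size_bytes(16000, 60)
rate = bitrate_for_size(size, 60)
```

so matching two codecs on file size for the same recording amounts to
matching them on average bit rate.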

i tried to downsample my wav files from 44100 to 32000 to fit the
spx optimizations, but this did not really help.
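
for reference, the 44100 to 32000 conversion corresponds to a rational
resampling ratio of 320/441. a minimal sketch of that arithmetic, with
hypothetical helper names; it only computes the ratio and the resulting
sample count, and a real resampler would also need a low-pass filter to
avoid aliasing.

```python
from fractions import Fraction


def resample_ratio(src_hz, dst_hz):
    """Return (up, down) so that dst_hz == src_hz * up / down, in lowest terms."""
    ratio = Fraction(dst_hz, src_hz)
    return ratio.numerator, ratio.denominator


def resampled_length(n_samples, src_hz, dst_hz):
    """Number of output samples after resampling, rounded up."""
    up, down = resample_ratio(src_hz, dst_hz)
    # Integer ceiling division avoids floating-point rounding.
    return (n_samples * up + down - 1) // down


up, down = resample_ratio(44100, 32000)
one_second = resampled_length(44100, 44100, 32000)
```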

it would be fantastic if spx could compress voice to the extent wma
does, while maintaining the crispness and treble of the original
voice, as far as this is possible.

wma also seems less tolerant of music between voices than spx, which
is good. a voice encoder should not accept music at all; it should
just turn it into garble or silence, to the extent that music is
possible to detect. wma seems to do this to a certain extent.

i am a programmer, but i do not know sound compression algorithms, so
i may be on the wrong track, but i would just like to find the
best voice compression program on earth, and right now i have to choose
between large/good spx files and small/blunt wma files.

it would be great if this could be improved in coming releases.


