[speex-dev] de-essing into speex?

Olav olav at bogus.net
Fri Dec 5 09:56:10 PST 2003



thanks for getting back to me,

i have uploaded a zip file containing some sound files that
demonstrate the issue.

  http://www.bogus.net/~olav/ess.zip

this contains

  s.mp3   original wav file (mono) converted to top-quality mp3 (370K)
  s.wma   windows media encoder with 19khz voice compression    ( 62K)
  s-2.spx speexenc --vbr --quality 2 on the wav file            ( 63K)
  s-9.spx --quality 9                                           (197K)

plus quality 3, 4, 5, 6, 7 and 8.
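For anyone who wants to reproduce the set of encodes, here is a minimal sketch that builds one speexenc command per quality level, using the same `--vbr --quality N` options listed above. The input name `s.wav` and the `s-N.spx` output naming are assumptions taken from the file list; the original wav is not in the zip. The helper names are hypothetical.

```python
import shutil
import subprocess


def build_speexenc_commands(wav_path, qualities=range(2, 10)):
    """Return one speexenc argument list per VBR quality level."""
    # Strip the extension so "s.wav" yields outputs like "s-2.spx".
    stem = wav_path.rsplit(".", 1)[0]
    return [
        ["speexenc", "--vbr", "--quality", str(q), wav_path, f"{stem}-{q}.spx"]
        for q in qualities
    ]


def encode_all(wav_path):
    """Run every command; needs speexenc on PATH and wav_path to exist."""
    if shutil.which("speexenc") is None:
        raise RuntimeError("speexenc not found on PATH")
    for cmd in build_speexenc_commands(wav_path):
        subprocess.run(cmd, check=True)


# Print the commands without running them ("s.wav" is an assumed file name).
for cmd in build_speexenc_commands("s.wav"):
    print(" ".join(cmd))
```

Calling `encode_all("s.wav")` would run the encodes; the loop at the end only prints the command lines so they can be checked first.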

the file contains a norwegian sentence from a record with a lot of
ess sounds, repeated 10 times or so, just to make the files large
enough for a file size comparison to make sense.

one may argue about the quality level at which the ess sounds become
acceptable. after switching MANY times between the original and the
spx files, i decided that going under quality 9 means you start to hear
"computerish" ess sounds.

as for the speex vs. windows media encoder issue, compare speex quality
2 with the wma file. they are almost equally sized and should therefore
be of similar quality, but to my ears the wma file is quite a lot better.
it may have less treble, but the spx file sounds very synthetic.
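Since all the files hold the same audio, comparing file sizes amounts to comparing average bitrates. A rough sketch of that arithmetic follows. The clip duration is NOT stated anywhere in this message, so `ASSUMED_DURATION_S` is a purely hypothetical placeholder, and reading "62K" as 62 * 1024 bytes is also an assumption.

```python
def average_bitrate_kbps(size_bytes, duration_s):
    """Average bitrate in kbit/s for a file of the given size and duration."""
    return size_bytes * 8 / duration_s / 1000


# Hypothetical duration: replace with the real clip length.
ASSUMED_DURATION_S = 60.0

# Sizes from the file list, assuming "K" means 1024 bytes.
sizes_bytes = {
    "s.mp3": 370 * 1024,
    "s.wma": 62 * 1024,
    "s-2.spx": 63 * 1024,
    "s-9.spx": 197 * 1024,
}

for name, size in sizes_bytes.items():
    rate = average_bitrate_kbps(size, ASSUMED_DURATION_S)
    print(f"{name}: {rate:.1f} kbit/s (for the assumed duration)")
```

The absolute numbers depend entirely on the assumed duration, but the ratio between two files does not: the wma and quality 2 spx files differ by about 2% in average bitrate whatever the clip length is.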

note: if i have used speexenc incorrectly please let me know.

the wav file was 2MB so i didn't want to include that, but simply use
lameenc etc to decode the mp3 file into wav if you want to do testing.

i hope to hear from you soon. i find this issue very interesting.

olav

> From: "Tony & Amanda Benik" <benikajal at mcihispeed.net>
> Date: Thu, 4 Dec 2003 23:47:39 -0600
> 
> Representative of Olav,
> 
> >like if you say "someone said the sun is shining", there is a lot of
> >ess sounds, and these will sound "computer-ish" at vbr qualities below
> >9.
> 
>   I don't mean to be rude, but what bit rate is windows media encoder
> encoding at, and what encoder (type) are you using?  Unless it's low
> (8kbps-32kbps) it doesn't compare to speex (spx).  The "ess" sounds
> you are hearing are most likely generated because the entire frame
> (bit of sound) has been stripped of all but its most mathematically
> pure and simplest (smallest) representation.
> 
>   I know a bit about text2speech and speech2text.  Though a de-ess
> filter on the speex decoder would be 'pleasant' to the human ear
> (if one finds pure tones unpleasant rather than unhuman), it would
> make subsequent mixing and encoding of speex streams (VoIP phone
> lines) less effective and more costly in a resource sense.
> 
>   It is a good idea, though I would consider it a luxury filter;
> that's just me being overly assertive.
> ||
> \/
> 
>   If anyone is interested: from my knowledge of speech recognition, all
> human phonemes, when converted from power vs. time to power vs. freq,
> exhibit 2 characteristic spikes.  The primary spike defines the base
> for recognizing the phoneme, and the next highest spike's relative
> location and power give a program a good probability match as to
> which phoneme it is.
> 
> Humanizing spx audio derived solely from pure human voices could be
> accomplished by reconstructing the secondary peak, but would introduce
> a minimum latency far larger than several frame sizes (i.e. the length
> of a human phoneme, e.g. a vowel or consonant).
> 
> The filter also will most likely foul up the speech a little, because
> like most voice recognition software it can guess wrong and
> reconstruct the wrong secondary peak onto the frames.  (I'm guessing)
> 
> The filter also will most likely eat up a lot of cpu power like most
> voice recognition software.  (I'm guessing)
> 
> ==
> 
> To conclude:
>   I may be very wrong, so please correct me, but I am diligent about
>   keeping up on these things.
> 
> -- Benikus Rex
> --- >8 ----
> List archives:  http://www.xiph.org/archives/
> Ogg project homepage: http://www.xiph.org/ogg/
> To unsubscribe from this list, send a message to 'speex-dev-request at xiph.org'
> containing only the word 'unsubscribe' in the body.  No subject is needed.
> Unsubscribe messages sent to the list will be ignored/filtered.
> 
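The quoted description of converting "power vs. time to power vs. freq" and picking out two spikes can be illustrated with a small sketch. This is only an illustration on a synthetic two-tone signal, not how Speex or any speech recognizer works; the claim that phonemes show two characteristic spikes is the quoted author's. The function names, the 8000 Hz sample rate and the 500 Hz / 1500 Hz tones are all assumptions made for the example.

```python
import cmath
import math


def power_spectrum(samples):
    """Naive DFT; returns the power in bins 0 .. N/2 - 1."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def two_strongest_peaks(samples, sample_rate):
    """Frequencies (Hz) of the two strongest local maxima, strongest first."""
    power = power_spectrum(samples)
    n = len(samples)
    # A bin counts as a peak when it is higher than both of its neighbours.
    peaks = [
        k
        for k in range(1, len(power) - 1)
        if power[k] > power[k - 1] and power[k] > power[k + 1]
    ]
    peaks.sort(key=lambda k: power[k], reverse=True)
    return [k * sample_rate / n for k in peaks[:2]]


# Synthetic stand-in for a voiced sound: a strong 500 Hz tone plus a
# weaker 1500 Hz tone (assumed values, chosen to fall on exact DFT bins).
SAMPLE_RATE = 8000
N = 400
signal = [
    math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 1500 * t / SAMPLE_RATE)
    for t in range(N)
]

print(two_strongest_peaks(signal, SAMPLE_RATE))  # [500.0, 1500.0]
```

Real speech would need windowing and smoothing before peak picking; the sketch leaves those out to keep the idea visible.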
