SV: [speex-dev] Speex modes

Sun Oct 13 03:18:04 PDT 2002

Pontus Carlsson wrote:

>Thanks!
>
>Btw, have you tried using SBR-technology or similar with speech codecs? That
>might be a good idea I thought.. But I don't know if it produces as good
>quality with speech codecs as it does for music codecs. Do you know if there
>is any open source variant of SBR?
>
SBR exploits a limitation of your ears. At high frequencies (above roughly 
10kHz) you cannot determine pitch with any accuracy. You can hear up to 
somewhere between 15kHz and 20kHz (depending on age and other factors), 
but you really cannot identify pitch at these frequencies. You cannot 
even tell whether content above about 10kHz is properly harmonically 
related to the lower pitched fundamentals that usually give rise to it.

I don't know of any voice-specific coder that even attempts to capture 
energy above 10kHz, so SBR just isn't relevant. Most wideband speech 
coding captures only 7kHz to 8kHz of bandwidth. The key improvement that 
gives over the 3kHz to 4kHz most mainstream voice coders capture is 
cleaner unvoiced sounds. Sounds like fffff and sssss, along with other 
unvoiced sounds, appear almost the same at telephone bandwidth. At 7kHz 
bandwidth they have enough character to be more distinguishable. The basic 
intelligibility improvement you get is usually small. However, the voice 
is rather more pleasant and less tiring to listen to. That brings 
considerable intelligibility improvements in a long discussion. Adding 
energy up to the limit of hearing adds more to the pleasantness of the 
voice, but it isn't usually considered enough to get people excited 
about committing extra bits per second to it.
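
As a rough illustration of the bandwidth figures above: the audio 
bandwidth a coder can capture is bounded by half its sampling rate (the 
Nyquist limit). The sketch below assumes the common 8kHz and 16kHz 
sampling rates for telephone-band and wideband speech. Those rates, and 
every name in the code, are assumptions for illustration and are not 
taken from this thread.

```python
# Upper bound on captured audio bandwidth for a given sampling rate.
# The mode names and sampling rates below are illustrative assumptions.

def nyquist_bandwidth_hz(sample_rate_hz: int) -> float:
    """Return the highest frequency representable at this sampling rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sample_rate_hz / 2.0


# Hypothetical table of modes: name -> sampling rate in Hz.
ASSUMED_MODES = {
    "narrowband (telephone)": 8000,
    "wideband": 16000,
}


def describe_modes(modes=ASSUMED_MODES):
    """Map each mode name to its maximum audio bandwidth in Hz."""
    return {name: nyquist_bandwidth_hz(rate) for name, rate in modes.items()}


if __name__ == "__main__":
    for name, bandwidth_hz in describe_modes().items():
        print(f"{name}: at most {bandwidth_hz / 1000:.0f}kHz of audio bandwidth")
```

Real coders capture a little less than this bound because of band-limiting 
filters, which fits the 3kHz to 4kHz and 7kHz to 8kHz ranges above.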

Regards,
Steve
