[vorbis] make lo-fi sound as good as RealAudio?

B.H. Anderson bhafool1 at hotmail.com
Mon Jan 13 05:16:53 PST 2003



First, there is a problem with the original 44 kHz stereo WAV file, 
mentioned previously, which I'll expand upon:

From about 27 seconds in, when it gets louder (the chorus), there's a sound of
pretty severe clipping distortion, which CoolEdit (in Windows) confirms as the
waveforms bottoming out at an amplitude of -32768 (full scale). There's also
some at 17-19 seconds on the word "I'll" and the syllable "Out" (from the word
"Outside"), though it's not quite so severe.

Clipping on the original WAV (for more than a couple of samples in
succession, at least) plays havoc with the frequency content and can sound
really bad after lossy compression. It creates louder high-frequency content
than is typical of music, so the encoder spends bits on the distortion
artifacts that should have gone to the music.
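
As a rough way to check a file for this without a wave editor, here is a
minimal Python sketch that looks for runs of consecutive samples pinned at
16-bit full scale. The function names, the min_run threshold of 3 and the
assumption of a plain 16-bit PCM WAV are mine, not anything CoolEdit or
oggenc does.

```python
import array
import sys
import wave


def find_clipped_runs(samples, min_run=3, lo=-32768, hi=32767):
    """Return (start_index, length) for each run of at least min_run
    consecutive samples sitting at either 16-bit full-scale limit."""
    runs = []
    start = None
    for i, s in enumerate(samples):
        pinned = s <= lo or s >= hi
        if pinned and start is None:
            start = i                      # a new run begins here
        elif not pinned and start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # A run may still be open when the data ends.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs


def read_pcm16_channels(path):
    """Read a 16-bit PCM WAV and return (sample_rate, [one array per channel])."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = w.getframerate()
        nch = w.getnchannels()
        data = array.array("h", w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        data.byteswap()                    # WAV sample data is little-endian
    # Samples are interleaved, so de-interleave before counting runs.
    return rate, [data[ch::nch] for ch in range(nch)]
```

For example, `rate, chans = read_pcm16_channels("Kasey.wav")` followed by
`find_clipped_runs(chans[0])` lists the suspect runs in the left channel;
dividing a start index by `rate` gives the time in seconds.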

The quieter first part of the recording is barely within the clipping
limits, but just survives. I presume that if this was digitally ripped from
CD, a scaling factor has been applied (which is dangerous with modern
over-loud, amplitude-compressed mastering that usually peaks the CDs at full
scale). Otherwise it's been ripped via the soundcard (analog rip) with the CD
volume up fairly high. In that case, reduce the CD volume on the recording
mixer or find out how to do digital audio extraction direct from the CD
(which is usually faster than 1x, even with the best tools like Exact Audio
Copy for Windows).

Anyway, knowing now that your RealAudio is mono and that you want to cater
for modem streaming at no more than about 28 kbps, command lines such as 
this are suitable:

oggenc -M 28 --downmix --resample 16000 Kasey.wav

(this came to 23.9 kb/s)

I think this is the best. Most artifacts are suppressed but the sparkle of 
the top-end of the frequency response is absent due to the 8 kHz maximum 
frequency. (I'm using headphones, so I won't tolerate artifacts as much as I 
might with loudspeakers)

If you're prepared to accept smushing of the vocals closer to the extent of
RealAudio, try a higher bandwidth to get more high frequencies back (11 kHz
for 22050, 16 kHz for 32000, the latter being about as wideband as FM radio).

oggenc -M 28 --downmix --resample 22050 Kasey.wav
(this was 27.1 kb/s)

oggenc -M 28 --downmix --resample 32000 Kasey.wav
(This was 25.3 kb/s)
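
The bandwidth figures quoted for each --resample rate are just half the
sampling rate (the Nyquist limit). A small Python sketch of that arithmetic,
with a function name of my own choosing:

```python
def max_audio_frequency_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent (Nyquist limit)."""
    return sample_rate_hz / 2


# Rates used in the command lines in this message.
for rate in (16000, 22050, 32000, 44100):
    print(f"{rate} Hz sampling -> {max_audio_frequency_hz(rate) / 1000:g} kHz bandwidth")
```

In practice the resampler's filter rolls off a little below this limit, so
treat the numbers as upper bounds.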

If you want to allow the full bandwidth, you'll find that 32 kbps is the
nominal bitrate for the 44100 sampling rate, mono, at Vorbis q -1.00, so the
-M option can't be used without a -q setting.

oggenc -M 32 -q -1.0 --downmix Kasey.wav
(actually gives 35.8 kb/s, so -M 32 wasn't able to provide a hard limit)

This is a considerably bigger file than the previous ones, and sounds nicer 
(thanks to the extra bits encoding it).

If you actually want stereo, I'd suggest something like

oggenc -q 0 --resample 16000 Kasey.wav
(this was 30.7 kb/s)
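
To compare the results at a glance, here is a small Python sketch that
checks the average bitrates quoted above against the 28 kbps modem target
and converts them to approximate size per minute. The bitrates are the ones
reported in this message; the helper names and the 1 kb = 1000 bits
convention are my assumptions. These are averages, so a file that fits on
average can still peak higher.

```python
MODEM_BUDGET_KBPS = 28.0  # the "no more than about 28 kbps" target

# (settings, average kb/s) as reported in this message
REPORTED = [
    ("mono 16000, -M 28", 23.9),
    ("mono 22050, -M 28", 27.1),
    ("mono 32000, -M 28", 25.3),
    ("mono 44100, -M 32 -q -1.0", 35.8),
    ("stereo 16000, -q 0", 30.7),
]


def kilobytes_per_minute(kbps):
    """Average data per minute, assuming 1 kb = 1000 bits, 1 kB = 1000 bytes."""
    return kbps * 1000 * 60 / 8 / 1000


def fits_budget(kbps, budget=MODEM_BUDGET_KBPS):
    """True if the average bitrate is at or below the budget."""
    return kbps <= budget


for label, kbps in REPORTED:
    verdict = "fits" if fits_budget(kbps) else "over"
    print(f"{label:28s} {kbps:5.1f} kb/s {kilobytes_per_minute(kbps):6.1f} kB/min  {verdict}")
```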

Regarding compatibility, to ensure your listeners don't get pitch-shifting
(like a slow record or tape), use --resample rates from the following list
only:

8000 11025 16000 22050 32000 44100 48000

(for some reason even 12000 and 24000 don't work at the right speed/pitch on 
my office PC, even though they're submultiples of 48000)
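
If you script your encodes, one way to stay on that list is to snap any
requested rate to the nearest listed one before passing it to --resample.
This is only a sketch; the constant and function names are hypothetical, and
the list reflects what played correctly on my setup rather than any rule in
the Vorbis format.

```python
# Rates from the list above (22050 is half of 44100).
SAFE_RATES = (8000, 11025, 16000, 22050, 32000, 44100, 48000)


def nearest_safe_rate(requested_hz):
    """Return the listed rate closest to the requested one
    (on a tie, the lower rate wins because it comes first)."""
    return min(SAFE_RATES, key=lambda r: abs(r - requested_hz))


print(nearest_safe_rate(24000))  # 22050
print(nearest_safe_rate(12000))  # 11025
```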

Also, I notice that a certain harshness (which feels like reduction in
smooth bass) is introduced when I resample below 44100, especially below 
32000, even on quality 10. I suspect this is due to my soundcard not 
adjusting the bandwidth, so you get staircasing. Oversampling with 
post-filtering (e.g. a very short FIR filter) would remove this (in fact I 
use just such a homemade filter, but in 2 dimensions, when upsampling in 
photo editing to remove the aliasing 'jaggies' at spatial frequencies beyond 
those of the original resolution). Unfortunately, WinAmp software won't 
upsample for me to bypass the shortcomings of my soundcard, if that's what 
it is (and such a software solution could also let me play music at 24000 
samples/second properly). Upsampling and post-filtering at the same stage as
adding decoding dither would be ideal options for a high-quality plugin to
offer (and the Ogg Vorbis plugin already offers dither and ReplayGain, so
it's high-end).
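
To illustrate the difference between staircasing and oversampling with a
short FIR post-filter, here is a toy Python sketch of 2x upsampling done
both ways. The three-tap kernel (0.5, 1.0, 0.5), which amounts to linear
interpolation, is my assumption for the sake of a small example; it is not
the filter I use for photos, and a real player would want a longer one.

```python
def upsample2_hold(x):
    """2x upsampling by repeating each sample (the 'staircase')."""
    out = []
    for s in x:
        out.extend((s, s))
    return out


def upsample2_fir(x, taps=(0.5, 1.0, 0.5)):
    """2x upsampling by zero-stuffing, then filtering with a short
    odd-length FIR kernel. Samples beyond either end count as silence."""
    stuffed = []
    for s in x:
        stuffed.extend((s, 0.0))           # insert a zero after each sample
    half = len(taps) // 2                  # centre the kernel on each output
    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for k, t in enumerate(taps):
            i = n + half - k
            if 0 <= i < len(stuffed):
                acc += t * stuffed[i]
        out.append(acc)
    return out


ramp = [0, 2, 4]
print(upsample2_hold(ramp))  # [0, 0, 2, 2, 4, 4]
print(upsample2_fir(ramp))   # [0.0, 1.0, 2.0, 3.0, 4.0, 2.0]
```

The held version keeps the steps of the original ramp, while the filtered
version fills in the in-between values. The final 2.0 is an edge effect of
treating everything after the last sample as silence.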

Regards,

BHA
