[vorbis-dev] vorbis-utils features ;)

Wed Mar 21 19:13:29 PST 2001

At 4:35 pm -0800 21/3/01, Monty wrote:
>>  This is a good idea in abstract. In practice, you want to define a
>>  noise floor and truncate below that.
>
>No, you're only considering a single case, that of low level hiss.
>That's not the only case where you have [audible] audio content under
>-90dB. 
>
>>  The 90dB range of 16-bit audio
>>  is overkill in all real-world playback environments, and preserving
>>  noise in the low 4 bits is just going to make it impossible to hit
>>  low bitrate targets.
>
>It's 96dB, and that's only using the most simpleminded measure of
>dynamic range.  In fact, 16 bits has closer to 130dB of usable signal
>range (look at the frequency plots).  I'm happy to demonstrate if you
>don't believe me.
>
>Setting a hard limit that throws away potentially 1/3 of the dynamic
>range is not really a good idea.

From a pure signal processing point of view, I defer to you.

From a real-world, fitting-music-through-narrow-pipes point of view,
unless you do intelligent noise gating, you are never going to hit
modem bitrates. If I'm listening to music on a computer that isn't a
Cube or iMac, the fan and drives raise the ambient noise so high that
50dB of real-world signal-to-noise is doing very well. (Ambient
computer noise is about 40-50 dBA; running the amp above 95-100 dBA
hurts my ears.)
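
As a rough check on the figures in this exchange, the arithmetic can be
sketched in a few lines of Python. The function names are made up for
illustration, and the 45 and 95 dBA values are just points picked from
the ranges quoted above, not measurements. Note that this uses the
simple "bits times 6dB" measure that Monty calls simpleminded; it says
nothing about content that stays audible below the noise floor.

```python
import math

def pcm_dynamic_range_db(bits):
    # Ratio of full scale to one quantisation step, in dB: 20*log10(2**bits)
    return 20.0 * math.log10(2 ** bits)

def bits_for_range_db(range_db):
    # Inverse of the above: bits needed to span range_db by the same measure
    return range_db / (20.0 * math.log10(2))

# Assumed listening conditions, taken from the ranges in the text
ambient_dba = 45.0   # middle of the 40-50 dBA ambient estimate
loudest_dba = 95.0   # bottom of the 95-100 dBA "hurts my ears" range

usable_db = loudest_dba - ambient_dba

print(round(pcm_dynamic_range_db(16), 1))       # 96.3
print(usable_db)                                # 50.0
print(round(bits_for_range_db(usable_db), 1))   # 8.3
```

So by this crude measure the 50dB window corresponds to a little over
8 bits, against the 96dB that 16 bits nominally spans.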

The quantisers used in the frequency domain obviously have a big role 
to play here, but making pre-filtering a necessary separate stage is 
going to disappoint.
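
For reference, the crudest form of the gate under discussion might look
like the sketch below: a time-domain pre-filter that zeroes any frame
whose RMS level falls under a fixed floor. This is an illustration only,
not anything from the Vorbis encoder; the names hard_gate and
frame_level_db, the -50 dBFS floor and the 256-sample frame are all
assumptions.

```python
import math

def frame_level_db(frame, full_scale=32768.0):
    # RMS level of one frame relative to 16-bit full scale, in dBFS
    if not frame:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(rms / full_scale) if rms > 0 else float("-inf")

def hard_gate(samples, floor_db=-50.0, frame_len=256):
    # Zero every frame whose RMS level is below floor_db; pass the rest unchanged
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_level_db(frame) < floor_db:
            out.extend([0] * len(frame))
        else:
            out.extend(frame)
    return out

loud = [10000] * 256    # about -10 dBFS, kept
quiet = [20] * 256      # about -64 dBFS, gated to silence
gated = hard_gate(loud + quiet)
```

A gate this simple only looks at level, so it cannot tell hiss from
quiet musical content, which is the objection Monty raises above. An
"intelligent" gate would need some further test on top of the threshold.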

Another great test case, BTW, is the Sony Glenn Gould editions of
his Bach pieces - you can hear the piano, and faintly in the 
background Glenn humming to himself. Opinions vary on whether this 
humming is annoying and should be removed, or a key part of the 
performance.

At 4:36 pm -0800 21/3/01, Monty wrote:
>>  Conversely, if it is random, you don't have to model it precisely,
>>  just use an RNG and scale down. Or, even better, filter it out with a
>>  noise gate before moving to frequency space.
>>
>>  Non-random noise (like the NASA ionosphere sample I posted earlier)
>>  is trickier, especially for a frequency-domain codec.
>
>You've hit the nail on the head.  How do you distinguish the two?
>It's an NP-hard problem [at least] in the general case.  If it wasn't,
>compression would be easy.

Two approaches come to mind.

1. Define a set of presets for filtering that match certain kinds of 
music, and let the user choose (Over-produced pop that has a dynamic 
range of <10dB being an obvious class, classical music another, 
speech another).

2. Look for obvious kinds of noise in the frequency domain (linear 
ramps below a certain level?) Don't encode these frequencies 
directly, but generate shaped noise and encode this as a separate 
component. This is not trying to solve the hard problem, but to get
the data into a better shape for your compression algorithm before
calling it.
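
Approach 2 could be sketched roughly as follows. The sketch assumes the
frame has already been transformed and that power is a plain list of
per-bin power values. It swaps in spectral flatness (geometric mean over
arithmetic mean) as the "obvious noise" test, rather than the linear-ramp
idea floated above. All names, band sizes and thresholds here are
hypothetical, and none of this is taken from the Vorbis code.

```python
import math
import random

def spectral_flatness(power, eps=1e-12):
    # Geometric mean over arithmetic mean of the bin powers:
    # close to 1 for a flat (noise-like) band, close to 0 for a peaky (tonal) one
    n = len(power)
    log_mean = sum(math.log(p + eps) for p in power) / n
    arith_mean = sum(power) / n + eps
    return math.exp(log_mean) / arith_mean

def split_bands(power, band_size=8, flat_min=0.5, level_max=1.0):
    # Per band, emit ("noise", mean_power, width) when the band is both quiet
    # and flat; otherwise ("coded", bins) for the normal quantiser to handle
    bands = []
    for start in range(0, len(power), band_size):
        band = power[start:start + band_size]
        mean = sum(band) / len(band)
        if mean <= level_max and spectral_flatness(band) >= flat_min:
            bands.append(("noise", mean, len(band)))
        else:
            bands.append(("coded", band))
    return bands

def synthesise_noise_band(mean_power, width, rng):
    # Decoder side: random bin amplitudes rescaled so their mean power
    # equals the single value stored for the band
    raw = [rng.gauss(0.0, 1.0) for _ in range(width)]
    raw_power = sum(x * x for x in raw) / width
    scale = math.sqrt(mean_power / raw_power) if raw_power > 0 else 0.0
    return [x * scale for x in raw]

quiet_flat = [0.01] * 8                           # low, flat band
tonal = [0.001] * 3 + [500.0] + [0.001] * 4       # one strong peak
bands = split_bands(quiet_flat + tonal)
rebuilt = synthesise_noise_band(bands[0][1], bands[0][2], random.Random(0))
```

The quiet, flat band is reduced to one number (its mean power) and
regenerated from an RNG at the other end, while the band with the peak
is passed through for normal coding. Whether a flat, quiet band really
is disposable noise is still the hard problem; this only moves the
easy cases out of the way.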

There is a huge industry of hardware products that do this kind of 
pre-filtering for MPEG video.
