[Vorbis-dev] Getting masked FFT data out of libvorbisenc

xiphmont at xiph.org xiphmont at xiph.org
Mon Apr 23 15:41:47 PDT 2007


On 4/23/07, Steinar H. Gunderson <sgunderson at bigfoot.com> wrote:

> Thanks for the answer -- it looks like it could be an interesting metric (at
> least research-wise) after that modification. Correct me if I'm wrong
> here, though: Can I expect this curve to be reasonably similar (in the sense
> that the original signal is also similar) for the PCM original and something
> that has already been through lossy compression once? (Just using the masking
> curve, without the ATH curve removed, seems to give better values than pure
> MFCC for most cases _except_ the case where it's comparing a FLAC original
> and an MP3/Vorbis compressed version, which seems a bit odd to me.)

If you are going by tone masking alone, compression should not affect
it much.  If noise 'masking' is part of the curve, then compression
can affect it somewhat more.  Tone masking is simply a union of curve
maxima, whereas noise masking uses a least-squares fit, so a change to
RMS noise after an encode/decode cycle will affect it directly.  Modern
Vorbis, at least, tries to preserve RMS noise energy by band.  Most
MP3 encoders do not.
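
To make the distinction concrete, here is a rough toy sketch in Python.
It is not the libvorbis code: the triangular spreading shape, the
slope and offset values, the explicit peak list, and the function names
tone_mask and noise_fit are all assumptions made up for illustration.
It only shows why a pointwise maximum over per-tone curves ignores a
change in the noise floor, while a least-squares fit over the same
bins moves with it.

```python
def tone_mask(spectrum_db, peaks, slope_db_per_bin=6.0, offset_db=10.0):
    """Pointwise max ("union") of one triangular curve per tonal peak.

    The triangle shape and the default numbers are illustrative only.
    """
    n = len(spectrum_db)
    mask = [float("-inf")] * n
    for p in peaks:
        top = spectrum_db[p] - offset_db
        for i in range(n):
            mask[i] = max(mask[i], top - slope_db_per_bin * abs(i - p))
    return mask


def noise_fit(spectrum_db):
    """Ordinary least-squares line through the dB spectrum, per bin.

    A crude stand-in for a noise estimate: every bin contributes.
    """
    n = len(spectrum_db)
    x_mean = (n - 1) / 2.0
    y_mean = sum(spectrum_db) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(spectrum_db))
    slope = sxy / sxx
    return [y_mean + slope * (i - x_mean) for i in range(n)]


# Toy 16-bin spectrum: flat noise floor at -60 dB plus two tones.
peaks = [4, 11]
orig = [-60.0] * 16
orig[4] = -10.0
orig[11] = -20.0

# Pretend a codec kept the tones but dropped the noise floor by 12 dB.
coded = [v if i in peaks else v - 12.0 for i, v in enumerate(orig)]

tone_same = tone_mask(orig, peaks) == tone_mask(coded, peaks)
fit_shift = [a - b for a, b in zip(noise_fit(orig), noise_fit(coded))]

print("tone mask unchanged:", tone_same)
print("largest shift in noise fit (dB):", max(abs(d) for d in fit_shift))
```

In this toy case the tone curve is identical before and after, because
it depends only on the peak levels, while the fitted line drops by about
10.5 dB in every bin.  A codec that preserves RMS noise energy per band
would keep that second curve much closer to the original.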

Monty

