[Vorbis-dev] Bounty for function to monitor an OGG-VORBIS stream

Carsten Haese carsten at uniqsys.com
Sun Sep 9 17:12:58 PDT 2007


On Tue, 2007-08-28 at 14:07 -0400, xiphmont at xiph.org wrote:
> On 8/24/07, Paul Martin <pm at nowster.org.uk> wrote:
> 
> > IIRC, Vorbis does encode noise (as in hiss) separately from tones
> > (fundamental, harmonics), so half the battle is won for you already.
> 
> No; Vorbis does some implicit tonal estimation, but does not
> explicitly handle tones and noise separately.  This is a fundamental
> limitation of the current generation of codecs we seek to improve in
> Ghost.

FYI, I'm the lucky one who got picked to tackle this task. My initial
thought was to try to glean something from the encoded floor
information, but I quickly discovered that this is way too deep under
the hood for my taste.

My solution simply decodes the stream to PCM and re-analyzes the PCM to
look for a "flattish" spectrum, despite Paul's warning that this may not
yield good results due to an artificially flattened spectrum. I think in
reality it's the dynamic range that's compressed, not the spectrum
itself, or at least the spectrum wasn't flattened beyond recognition.

Music and speech tend to have relatively few frequencies (fundamentals
and their harmonics) whose amplitude is significantly higher than the
rest, whereas in noise the amplitude is smeared across many frequencies.
My solution simply counts the number of frequencies that have
above-average amplitude, and if that count is above a certain magic
number, we have static. This approach may be naive, and it probably has
room for refinement, but it worked reliably enough with the test
material I was given.
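
To make the counting idea concrete, here is a rough Python sketch. It is
not the code I delivered, only an illustration under stated assumptions:
the function names, the naive DFT, the frame size of 256 samples, and the
0.25 threshold are all hypothetical, and it compares the fraction of
above-average bins rather than a raw count so that the threshold does not
depend on frame size.

```python
import cmath
import math
import random


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 1 .. n/2 - 1 (DC is skipped)."""
    n = len(samples)
    mags = []
    for k in range(1, n // 2):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def above_average_fraction(samples):
    """Fraction of frequency bins whose magnitude exceeds the mean magnitude."""
    mags = magnitude_spectrum(samples)
    mean = sum(mags) / len(mags)
    return sum(1 for m in mags if m > mean) / len(mags)


def looks_like_static(samples, threshold=0.25):
    """Flag a frame as static when many bins are above average.

    The threshold is a hypothetical "magic number"; it would have to be
    tuned against real decoded material.
    """
    return above_average_fraction(samples) > threshold


# Synthetic test frames (assumed data, not real decoded Vorbis output).
N = 256
tone = [
    math.sin(2 * math.pi * 8 * t / N) + 0.5 * math.sin(2 * math.pi * 16 * t / N)
    for t in range(N)
]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(N)]

print("tone is static: ", looks_like_static(tone))
print("noise is static:", looks_like_static(noise))
```

A real monitor would run this per frame on the decoded PCM, use a proper
FFT with a window function, and probably require several consecutive
"static" frames before raising an alarm.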

-Carsten



