[Speex-dev] Testing Help

Jean-Marc Valin jean-marc.valin at usherbrooke.ca
Sun Nov 25 05:58:32 PST 2007

> While MOS is based on the subjective evaluation of a number of people,
> the final result can be said to be reasonably objective. 

Well, it's subjective in the sense that what is being measured (quality)
is itself subjective. Of course, it's better to have a good subjective
measurement (MOS) than a bad objective one (SNR, PESQ, ...).

> Provided the
> people overseeing the tests are non-partisan, a MOS test is quite
> repeatable. Many measurements considered objective are statistically
> based, and a properly conducted MOS test is such a test.

True, although one really annoying thing with MOS is that it takes a huge
number of participants to get anything meaningful. Recently, I've been
using MUSHRA and was able to get significant results with only a dozen
listeners.
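
To illustrate why the number of participants matters for MOS, here is a
rough sketch of how the confidence interval around a mean score shrinks
with the number of ratings. The function names, the normal approximation
(z = 1.96 for 95%) and the example standard deviation are all assumptions
made for illustration, not part of any standard test procedure; with very
few listeners a t-distribution would be more appropriate.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (mean, half_width) for a list of opinion scores.

    half_width is the half-width of a normal-approximation confidence
    interval (z = 1.96 corresponds to roughly 95%).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mean = statistics.fmean(ratings)
    stdev = statistics.stdev(ratings)  # sample standard deviation
    return mean, z * stdev / math.sqrt(n)


def listeners_needed(stdev, target_half_width, z=1.96):
    """Estimate how many ratings give the requested CI half-width.

    stdev is an assumed per-rating standard deviation (hypothetical input).
    """
    return math.ceil((z * stdev / target_half_width) ** 2)


if __name__ == "__main__":
    # Made-up ratings on a 1-5 scale, for illustration only.
    mean, half = mos_with_ci([3, 4, 4, 5, 3, 4])
    print(f"MOS = {mean:.2f} +/- {half:.2f}")
    # Assuming a standard deviation of 1.0 MOS point per rating:
    print(listeners_needed(1.0, 0.1), "ratings for +/- 0.1")
```

Since the half-width only falls as 1/sqrt(n), halving the uncertainty
takes four times as many ratings.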

> PESQ is just something to keep dumb managers of the "if I can't measure
> something meaningful, just give me any old numbers I can put in reports"
> school happy.

True, although PESQ can also be useful as a sort of "delta" measure,
i.e. if you start from a known point and add some perturbation (e.g.
packet loss), it gives you an idea of how much you degraded the signal.
For comparing very different inputs, I agree that it's crap. As an
example, I once had a case where the compressed version of file A had a
lower PESQ score (using A as reference) than what I was getting comparing
A to the original of an unrelated file B!
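
A minimal sketch of the "delta" usage, assuming the scores have already
been produced by some PESQ tool against the same reference file. The
function name, condition labels and score values are hypothetical
placeholders, not real measurements.

```python
def degradation_deltas(baseline_score, perturbed_scores):
    """Map each condition to its score drop relative to the baseline.

    baseline_score: score of the unperturbed system (the known point).
    perturbed_scores: dict of condition name -> score, all measured
    against the same reference as the baseline.
    """
    return {
        name: baseline_score - score
        for name, score in perturbed_scores.items()
    }


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    deltas = degradation_deltas(4.0, {"loss_5pct": 3.5, "loss_10pct": 3.0})
    for name, drop in deltas.items():
        print(f"{name}: score dropped by {drop:.2f}")
```

Only the differences are interpreted here; the absolute scores are not
compared across unrelated inputs.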


