[theora] A comparison of VP3, and two MPEG-4 variants

Marco Al marco at simplex.nl
Mon Mar 24 09:42:44 PST 2003

From: "Freun Laven" <FreunLaven at earthlink.net>

> >I think you would be much better off relying on subjective
> >measurements rather than PSNR.
> Considering the incredible vagueness in what's considered "good enough",
> any decent testing method is going to *have* to do some sort of
> objective, reproducible measurements.  (Unless, of course, people are
> going to be satisfied with some group of 'experts' making declarations
> of what is 'best'.)

Not from a group of experts, but from a group of laymen, yes. Experts can
have preconceptions based on objective measures, and can tie them to
specific codecs by recognising specific artifacts.

MOS (mean opinion score) is the benchmark against which all objective
measures are compared. To any individual, his own subjective measure is the
only one which counts ... how then can you look at the big picture and
declare subjective measures meaningless? Obviously the average subjective
impression is the only measure which has any meaning at all ...

> With video it's even worse.  And for somebody like myself, who has
> eyesight problems, what I would consider to be 'good' would probably be
> laughed at by others, simply because I have trouble detecting the subtle
> differences.

That is a rather extreme example; averaged over all potential users, these
kinds of things even out. Although since subjective tests usually use a
rather small group, your opinion would indeed probably not be useful to
include :/

> A purely subjective comparison is worthless.

Actually it is the only comparison of value :) Indeed, the value of
objective measures themselves is measured by how well they correlate with
subjective scores.
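
To make that concrete, here is a minimal sketch of how one might check an
objective measure against subjective scores: average the viewer ratings per
clip into a MOS, then compute the Pearson correlation between the MOS values
and the PSNR values. The function names and all of the numbers are made up
for illustration; they are not results from any real test.

```python
from math import sqrt
from statistics import mean


def mean_opinion_score(ratings):
    """Average the ratings given by all viewers for one clip (e.g. 1-5 scale)."""
    return mean(ratings)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: ratings from five viewers per clip, and the PSNR (dB)
# measured for the same clips. Purely illustrative values.
ratings_per_clip = {
    "clip_a": [4, 5, 4, 3, 4],
    "clip_b": [2, 3, 2, 2, 3],
    "clip_c": [3, 4, 4, 3, 3],
}
psnr_per_clip = {"clip_a": 36.1, "clip_b": 31.4, "clip_c": 35.8}

clips = sorted(ratings_per_clip)
mos = [mean_opinion_score(ratings_per_clip[c]) for c in clips]
psnr = [psnr_per_clip[c] for c in clips]

# A value near 1 would mean PSNR tracks the viewers' opinion on these clips.
print("correlation between PSNR and MOS:", pearson(psnr, mos))
```

With only a handful of clips and viewers the correlation is of course very
noisy; the point is just that the subjective score is the reference and the
objective measure is the thing being judged.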

On a related matter, I don't quite see the relevance of constant quantizer
measurements ... they are useful as micro benchmarks during codec
development, to compare a codec against its previous version, but does
anyone actually use constant quantizer encoding in practice? If not, how are
the results relevant for comparing codecs against each other?

I'd find the results more relevant if the codecs were compared as they would
be used. Which means separate tests for streaming (CBR/ABR) and storage
applications (VBR/2-pass encoding if available ... CBR/ABR coding with the
rate set to what is needed for the required size if not).
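
For the storage case, "the rate needed for the required size" is simple
arithmetic: total bits divided by duration, minus whatever the audio takes.
A rough sketch, assuming container overhead is ignored and kbps means 1000
bits per second (the function name and the example figures are hypothetical):

```python
def target_video_kbps(size_bytes, duration_s, audio_kbps=0.0):
    """Video bitrate (kbit/s) that fills size_bytes over duration_s seconds.

    Ignores container overhead, so a real encode should aim slightly lower.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = size_bytes * 8 / duration_s / 1000
    return total_kbps - audio_kbps


# Example: a 90 minute clip that has to fit in 700 MiB with 128 kbit/s audio.
rate = target_video_kbps(700 * 1024 * 1024, 90 * 60, audio_kbps=128)
print(f"video bitrate to aim for: {rate:.0f} kbit/s")
```

Every codec in the comparison would then be given the same target rate, so
that they are compared at equal file size rather than at equal quantizer.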


