[theora] Fixed Quantizer - Fixed Quality

Tue Mar 25 17:25:21 PST 2003

Here's the problem:

> 2) Encoding with rate control as in single pass "Bitrate control" will not
> lead to better quality than fixed quant (with the right value of the fixed
> quant). Ratecontrol doesn't know anything about "quality". It will try to
> reach more-or-less CBR.
>
> But somehow this is not a fair comparison, because how do you determine
> the right quantizer value? You have to look at the material, so you have
> extra information.
> 
> ---------------------------------------------------------------
> 
> 3) Two-pass-encoding with varying quantizer can lead to better overall
> quality than fixed quantizer encoding. 
> 
> E.g.: Encode Barcelona with Quant 25, but Suzie with quant 8. 
> Total size will be similar: 
> 
> Suzie-Q8:  275442   +   Barcelona-Q25: 347980    =   Total 623422
> Suzie-Q20: 115378   +   Barcelona-Q20: 550760    =   Total 666138
> 
> But visual quality makes a real difference as you can see from the other
> attached pictures: Barcelona-Q25 isn't too much worse than Q20.
> Suzie-Q8 is _much_ better than Q20.
> 
> These are just examples, of course...

Everything you say is basically true.  However, what you are not accounting for is that it is the job of the codec to define what "Q=8" means.  In the DIVX case, I would claim the codec is at fault for not accounting for the fact that some material will look terrible at Q=20, and redefining Q on that basis.  Your theory seems to be that this is the job of a hypothetical "2-pass encoder", but I don't see how multiple passes per se make any difference.  It's an issue of where the logic resides.  How does any encoder, whether one-pass, 2-pass, or whatever, determine that the Suzie scenes need a different setting than the Barcelona clip to achieve subjectively similar quality?

I can tell you how this is usually dealt with in practice: most encoder apps provide modes where quality and bitrate can both vary within some range.  In your example, we might say that Q can go up as far as 25, but only if necessary to pull the bitrate down below some threshold.  Below that threshold, Q can go down (i.e., quality increases in your example) until the threshold bitrate is approximated.  2-pass encoders simply have more information on how to do this effectively (e.g., knowing that a simple scene is coming up, they can increase quality on the cut so you don't see an ugly transitional period of a few frames).  True CBR is basically this strategy rigorously enforced against a given transport speed and playback buffer model.
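
To make the "Q floats within a range" idea concrete, here is a minimal sketch in Python.  It is not how any particular encoder does it: the function names, the one-step adjustment, and the toy model "bits = complexity / Q" are all my own assumptions for illustration.  The limits 8 and 25 are just the values from your example.

```python
def next_quantizer(q, frame_bits, target_bits, q_min=8, q_max=25, step=1):
    """Pick the quantizer for the next frame (hypothetical one-pass rule).

    If the last frame overshot the bit budget, coarsen Q (lower quality);
    if it undershot, refine Q (higher quality). Q never leaves [q_min, q_max].
    """
    if frame_bits > target_bits:
        q += step
    elif frame_bits < target_bits:
        q -= step
    return max(q_min, min(q_max, q))


def simulate(complexities, target_bits, q_start=20, q_min=8, q_max=25):
    """Run the rule over a sequence of per-frame 'complexity' values.

    Toy assumption: a frame costs complexity / Q bits. Real codecs have
    no such simple relationship; this only exercises the control loop.
    """
    q = q_start
    trace = []
    for c in complexities:
        bits = c / q
        trace.append((q, bits))
        q = next_quantizer(q, bits, target_bits, q_min, q_max)
    return trace


if __name__ == "__main__":
    # An easy clip followed by a hard one, against a 500-bit frame budget.
    for q, bits in simulate([2000] * 15 + [20000] * 10, target_bits=500):
        print(q, round(bits))
```

On the easy frames Q drifts down to the floor of 8 and stays there, since the budget is never hit; on the hard frames it climbs to the ceiling of 25 and the budget is simply exceeded.  A 2-pass encoder could use the same limits but choose Q with knowledge of the frames still to come.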

This sort of relates to the PSNR discussion in the following way: internally, when making various encoding choices (block type, quantizers), most video codecs simply use some variation of MSE (mean squared error, which is what PSNR is derived from), or more typically SAD (Sum of Absolute Differences), which is a very similar metric (but easier to calculate).  In either case, as has been discussed, the results of this approach do not correlate very well with perceived quality, especially when taken over varying types of source material (as your examples prove).
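
For reference, here is a small Python sketch of the three metrics, using the standard definitions for 8-bit samples (PSNR = 10*log10(255^2 / MSE)).  The sample data is made up purely to illustrate the point: two quite different error patterns that MSE and PSNR cannot tell apart.

```python
import math


def sad(a, b):
    """Sum of Absolute Differences between two equal-length sample lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mse(a, b):
    """Mean squared error between two equal-length sample lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a, b, peak=255):
    """PSNR in dB, derived from MSE; infinite for identical inputs."""
    m = mse(a, b)
    if m == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / m)


if __name__ == "__main__":
    original = [100] * 16          # made-up flat 4x4 block
    spread = [102] * 16            # every sample off by 2
    spike = [100] * 15 + [108]     # one sample off by 8

    for name, block in (("spread", spread), ("spike", spike)):
        print(name, sad(original, block), mse(original, block),
              round(psnr(original, block), 2))
```

Both distorted blocks have an MSE of 4.0 and therefore the same PSNR, yet one is a uniform brightness shift and the other is a single outlier pixel.  SAD does separate these two (32 versus 8), but neither number says anything about which one a viewer would actually notice.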

So, for my money, the codecs should be doing a better job of incorporating some intelligence to correlate their 'Q' values to actual perceived quality, rather than some arbitrary pixel difference value.  That way, fixed-Q could actually mean something useful.  I suspect that audio codecs, particularly Vorbis, do this intrinsically, because their internal psychoacoustic models tend to be rather complex.  In the video world, for reasons that elude me, this is not the case.  I know of no codec that incorporates any useful psycho-visual model into its encoder (though there are encoding apps that sit on top of codecs that claim to do this).
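
As a rough sketch of what "fixed quality" could mean, suppose the encoder had some perceptual score it trusted.  Then the user-facing setting would be a target score, and the quantizer would be whatever each clip needs to reach it.  Everything below is hypothetical: the scoring functions are invented linear stand-ins (no such model exists in any codec I know of), the 0-1000 scale is arbitrary, and the 1..31 quantizer range is an assumption.  The only real requirement is that the score does not improve as Q gets coarser.

```python
def coarsest_q_meeting_target(score_at_q, target, q_min=1, q_max=31):
    """Return the largest Q in [q_min, q_max] with score_at_q(Q) >= target.

    Assumes score_at_q is non-increasing in Q, so a binary search works.
    Falls back to q_min if no quantizer in range reaches the target.
    """
    lo, hi = q_min, q_max
    best = q_min
    while lo <= hi:
        mid = (lo + hi) // 2
        if score_at_q(mid) >= target:
            best = mid      # good enough; try a coarser quantizer
            lo = mid + 1
        else:
            hi = mid - 1    # too ugly; need a finer quantizer
    return best


# Invented stand-ins for a perceptual model: quality falls off much
# faster with Q on the Suzie-like clip than on the Barcelona-like one.
def suzie_like_score(q):
    return 1000 - 25 * q


def barcelona_like_score(q):
    return 1000 - 8 * q


if __name__ == "__main__":
    target = 800
    print(coarsest_q_meeting_target(suzie_like_score, target))
    print(coarsest_q_meeting_target(barcelona_like_score, target))
```

With these made-up curves, one quality target of 800 lands on Q=8 for the Suzie-like clip and Q=25 for the Barcelona-like clip, which is the pairing from your example.  The hard part, of course, is the scoring function itself, not the search.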

IMSHO, this should be a major design goal of any improved Theora encoders we develop.
