[Theora-dev] Re: MMX/mmxext optimisations

Wed Aug 25 15:51:10 PDT 2004

On Wednesday 25 August 2004 23:07, Ralph Giles wrote:
> On Wed, Aug 25, 2004 at 10:10:22PM +0100, Jim Darby wrote:
[...]
> > This is a little worrying. Surely the MMX code should just make things
> > happen faster rather than different?
>
> Indeed. Of course it's easy to fool yourself, but particularly because
> errors are cumulative between I frames, the in-loop stuff does need
> to be per spec.
>
> Wim, can you confirm you didn't include any short cuts?

There is at least one reason for slightly varying output: the C implementation 
of averaging in the theora code is:
#define DSP_OP_AVG(a,b) ((((int)(a)) + ((int)(b)))/2)
while the computation done by the MMXEXT pavgb instruction is equivalent to the 
following:
((((int)(a)) + ((int)(b) + 1))/2)

The latter is in fact a little more precise for averaging: it rounds to the 
nearest value instead of truncating (the internal +1 is done with 9-bit 
precision in the CPU). I guess it does not really perturb the output. In my 
attempts last year, with back-to-back testing, I hit this case and, in the 
end, the encoded output files were (slightly) smaller with the MMX version 
than with the C version. Anyway, nothing was visible.
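
To make the difference concrete, here is a minimal sketch in plain C that 
compares the two formulas over every pair of byte values. It does not use the 
actual MMX instruction; avg_pavgb() is only a scalar model of the rounding 
formula quoted above, and the function names are made up for illustration.

```c
/* Truncating average, same arithmetic as the DSP_OP_AVG macro. */
static int avg_c(unsigned char a, unsigned char b)
{
    return ((int)a + (int)b) / 2;
}

/* Rounding average: scalar model of (a + b + 1) / 2 for one byte lane.
 * The sum is held in an int, so the +1 cannot overflow 8 bits. */
static int avg_pavgb(unsigned char a, unsigned char b)
{
    return ((int)a + (int)b + 1) >> 1;
}

/* Count the byte pairs (out of 256 * 256) where the two results differ. */
static int count_mismatches(void)
{
    int a, b, n = 0;

    for (a = 0; a < 256; a++)
        for (b = 0; b < 256; b++)
            if (avg_c((unsigned char)a, (unsigned char)b) !=
                avg_pavgb((unsigned char)a, (unsigned char)b))
                n++;
    return n;
}
```

The two functions disagree exactly when a + b is odd, which is half of all 
pairs (32768 of 65536), and then only by 1. That small offset is enough to 
make the encoder output differ between the two code paths.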

Rodolphe