[theora-dev] Patch: fragment reconstruction MMX for GCC

Nils Pipenbrinck n.pipenbrinck at cubic.org
Sun Dec 30 14:01:18 PST 2007

Hi again,

I benchmarked my fragment reconstruction routines against GCC's 
compiler output and, well, the new code performs better, so I brushed 
up my GCC inline assembler skills and made a port.

Code is here: http://torus.untergrund.net/code/mmxfrag.c

All routines perform much better now. Inter2 alone got a speedup of 
factor 5 on a Pentium-M. On Athlon CPUs the speedup is roughly a factor 
of 3. I haven't had the chance to benchmark on Core 2 yet. It would be 
nice to hear whether the code compiles on 64-bit Intel.
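
For readers without the source at hand, here is a minimal scalar sketch 
of what an Inter2-style fragment reconstruction computes: average two 
8x8 predictor blocks, add the residue, and clamp to 0..255. It is not 
taken from mmxfrag.c. The function name, the signature, the truncating 
average and the ptrdiff_t strides are all assumptions made for 
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static unsigned char clamp255(int v) {
    return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar reference (name and signature are assumptions):
   average two 8x8 predictor blocks, add the residue, clamp the result.
   The residue is assumed to be stored as 8 rows of 8 int16_t values. */
static void frag_recon_inter2_ref(unsigned char *dst, ptrdiff_t dst_stride,
                                  const unsigned char *src1, ptrdiff_t src1_stride,
                                  const unsigned char *src2, ptrdiff_t src2_stride,
                                  const int16_t *residue) {
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++) {
            /* Truncating average of the two predictors (assumed rounding). */
            int pred = (src1[x] + src2[x]) >> 1;
            dst[x] = clamp255(pred + residue[x]);
        }
        /* Advance every pointer by one row. */
        dst += dst_stride;
        src1 += src1_stride;
        src2 += src2_stride;
        residue += 8;
    }
}
```

A scalar version like this is handy as a correctness check for an 
assembler port: run both on the same input and compare the 64 output 
bytes. Pointer-sized stride types are one thing worth checking when 
trying a 64-bit build.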

Regarding the MSVC patch I made a couple of days ago:

I found out how to get the macro magic working with MSVC. The IDCT has 
already been ported and now looks exactly like the GCC version. I hope 
this solves the maintainability issues. Once the loop filter is done 
the same way, I'll try to resubmit my patch.

