[theora-dev] Patch: fragment reconstruction MMX for GCC

Ralph Giles giles at xiph.org
Sun Dec 30 17:41:45 PST 2007

On Sun, Dec 30, 2007 at 11:01:18PM +0100, Nils Pipenbrinck wrote:

> I measured my fragment reconstruction routines against the compiler output
> from GCC and, well, the new code performs better, so I brushed up my GCC
> inline assembler skills and made a port.


> All routines perform much better now. Inter2 alone got a speedup of a
> factor of 5 on Pentium-M. Athlon CPUs execute roughly 3 times faster.
> I haven't had the chance to benchmark Core 2, though. It would be nice to
> hear whether the code compiles on 64-bit Intel.

It fails to compile with gcc 4.1.3 on x86_64:

 gcc -DHAVE_CONFIG_H -I. -I.. -I../include -I../lib -I../lib/dec 
-I../lib/enc -Wall -Wno-parentheses -O3 -fforce-addr 
-fomit-frame-pointer -finline-functions -funroll-loops -MT mmxfrag.lo 
-MD -MP -MF .deps/mmxfrag.Tpo -c dec/x86/mmxfrag.c  -fPIC -DPIC -o 
dec/x86/mmxfrag.c: In function 'oc_frag_recon_inter2_mmx':
dec/x86/mmxfrag.c:197: error: memory input 6 is not directly addressable
dec/x86/mmxfrag.c:197: error: memory input 7 is not directly addressable
make[2]: *** [mmxfrag.lo] Error 1
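
For reference, one common way to hit "memory input N is not directly
addressable" is to give an "m" constraint an operand that is not an lvalue,
for example a cast such as "m"((long)stride) where stride is an int. On
32-bit x86 the cast does not change the size, so it goes unnoticed; on
x86_64 it produces a temporary with no address. Whether that is what happens
at mmxfrag.c:197 is not established by the output above, so treat the
following only as a minimal sketch of the usual workaround. The helper name
load_at_stride and its parameters are made up for illustration and are not
taken from the patch.

```c
#include <stddef.h>

/* Hypothetical helper, not from mmxfrag.c: returns the byte at src[stride]. */
static unsigned char load_at_stride(const unsigned char *src, int stride)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Widen the stride into a real pointer-sized variable first, then pass
       it in a register. Passing "m"((long)stride) instead would hand GCC
       an rvalue on x86_64, which has no address to use as a memory input. */
    ptrdiff_t stride_p = stride;
    unsigned char out;
    __asm__ ("movb (%1,%2), %0"
             : "=q"(out)                  /* byte-addressable register */
             : "r"(src), "r"(stride_p)    /* base and index registers */
             : "memory");                 /* the asm reads through src */
    return out;
#else
    /* Plain C fallback for other compilers and architectures. */
    return src[stride];
#endif
}
```

Using ptrdiff_t keeps the index register the same width as the pointer on
both 32-bit and 64-bit targets, so the same asm string assembles on either.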

