[theora-dev] Patch: fragment reconstruction MMX for GCC
giles at xiph.org
Sun Dec 30 17:41:45 PST 2007
On Sun, Dec 30, 2007 at 11:01:18PM +0100, Nils Pipenbrinck wrote:
> I measured my fragment reconstruction routines against the compiler
> output from GCC, and the new code performs better, so I brushed up my
> GCC inline assembler skills and made a port.
> All routines perform much better now. Inter2 alone got a speedup of
> a factor of 5 on Pentium-M. Athlon CPUs execute roughly 3 times faster.
> I haven't had the chance to benchmark Core 2 though. It would be nice
> to hear if the code compiles on 64-bit Intel.
gcc 4.1.3 on x86_64:
gcc -DHAVE_CONFIG_H -I. -I.. -I../include -I../lib -I../lib/dec
-I../lib/enc -Wall -Wno-parentheses -O3 -fforce-addr
-fomit-frame-pointer -finline-functions -funroll-loops -MT mmxfrag.lo
-MD -MP -MF .deps/mmxfrag.Tpo -c dec/x86/mmxfrag.c -fPIC -DPIC -o
dec/x86/mmxfrag.c: In function 'oc_frag_recon_inter2_mmx':
dec/x86/mmxfrag.c:197: error: memory input 6 is not directly addressable
dec/x86/mmxfrag.c:197: error: memory input 7 is not directly addressable
make: *** [mmxfrag.lo] Error 1
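
The log does not show what inputs 6 and 7 are, so the cause is an assumption here: GCC usually reports "memory input N is not directly addressable" when an "m" constraint is given an operand it cannot form a plain address for. A minimal sketch of one common workaround follows: pass the address in a register ("r") and dereference it inside the asm. add_sat8 is a hypothetical helper for illustration, not code from mmxfrag.c, and the plain C branch exists only so the sketch builds on non-x86 targets.

```c
#include <stdint.h>

/* Hypothetical helper (not from mmxfrag.c): saturating add of 8 unsigned
   bytes, dst[i] = min(255, dst[i] + src[i]). */
static void add_sat8(uint8_t *dst, const uint8_t *src) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __asm__ __volatile__(
    "movq (%[d]), %%mm0\n\t"     /* load 8 bytes through the pointer register */
    "paddusb (%[s]), %%mm0\n\t"  /* unsigned saturating byte add */
    "movq %%mm0, (%[d])\n\t"     /* store the result back */
    "emms\n\t"                   /* leave MMX state before returning to C */
    :
    : [d] "r" (dst), [s] "r" (src)  /* addresses in registers, no "m" operands */
    : "mm0", "memory"               /* asm reads and writes memory via pointers */
  );
#else
  /* Portable fallback so the sketch compiles on other architectures. */
  int i;
  for (i = 0; i < 8; i++) {
    unsigned sum = (unsigned)dst[i] + (unsigned)src[i];
    dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
  }
#endif
}
```

Because the asm only ever sees register operands, the compiler never has to turn a C expression into a memory operand; the "memory" clobber tells it that the block touches memory behind its back.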