[Speex-dev] MKL Patch
Thorvald Natvig
thorvald at natvig.com
Tue May 27 15:52:39 PDT 2008
I did some benchmarking. On a P4 Xeon 3.4 GHz, 64-bit, GCC 4.3, -O3
-fprofile-use -ffast-math -ftree-vectorize, running the preprocessor
with 320-sample frames (16 kHz):
KISS:  89 us / frame
Small: 88 us / frame
FFTW3: 76 us / frame
MKL:   75 us / frame
According to callgrind, 18.4% of CPU time is spent in the FFTs with MKL
vs 37.9% with Small, so those previous numbers must have been for a
different architecture, compiler or set of optimization flags.
So, in reality, we're only saving 20% off the preprocessor time. Looking
at the output, you could shave 50% off what remains if you SSE-optimized
the entire code (and I'm not doing that ;))
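
As a rough cross-check of the per-frame timings above, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the helper names (frame_duration_us, relative_saving, realtime_load) are made up for this sketch and are not part of Speex, and the only inputs are the four timings and the 320-sample / 16 kHz frame size quoted in this message.

```python
# Per-frame preprocessor timings quoted in this message, in microseconds.
TIMINGS_US = {"KISS": 89.0, "Small": 88.0, "FFTW3": 76.0, "MKL": 75.0}

# Frame size and sample rate used for the benchmark.
FRAME_SAMPLES = 320
SAMPLE_RATE_HZ = 16000


def frame_duration_us(samples=FRAME_SAMPLES, rate_hz=SAMPLE_RATE_HZ):
    """Length of one audio frame in microseconds."""
    return samples * 1_000_000 / rate_hz


def relative_saving(baseline_us, candidate_us):
    """Fraction of the baseline time saved by switching to the candidate."""
    return (baseline_us - candidate_us) / baseline_us


def realtime_load(per_frame_us):
    """Fraction of one core needed to keep up with real-time audio."""
    return per_frame_us / frame_duration_us()


if __name__ == "__main__":
    baseline = TIMINGS_US["Small"]
    for name, t in TIMINGS_US.items():
        print(
            f"{name:6s} {t:5.1f} us/frame  "
            f"saving vs Small: {relative_saving(baseline, t) * 100:5.1f}%  "
            f"real-time load: {realtime_load(t) * 100:.3f}%"
        )
```

With these inputs, one frame lasts 20 ms, so even the slowest backend uses under half a percent of real time, and MKL comes out about 15% faster than Small end to end, which is in the same ballpark as the 20% figure above.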