[flac-dev] Performance and precompute_partition_info_sums_32bit_asm_ia32_()

lvqcl lvqcl.mail at gmail.com
Tue Sep 17 11:21:42 PDT 2013

Previously I wrote that precompute_partition_info_sums_32bit_asm_ia32_() only
makes encoding slower. I have now compiled flac with GCC 4.8.1, both with this
function enabled and with it disabled. NASM was enabled, SSE intrinsics were disabled.
I then added the -msse option (so that all C code was compiled with -msse),
then -msse2, and so on.

Test input: a 44.1 kHz/16-bit/stereo file; best compression mode (flac -8); CPU: Core i7.

Here are the results (1st column: SSE instruction set, 2nd column: the state of
precompute_partition_info_sums_32bit_asm_ia32_(), 3rd column: encoding time
in seconds, smaller=better):

no SSE   disabled    53.9
no SSE    enabled    55.2

SSE1     disabled    53.9
SSE1      enabled    55.3

SSE2     disabled    51.9
SSE2      enabled    53.1

SSE3     disabled    51.8
SSE3      enabled    53.2

SSSE3    disabled    45.7
SSSE3     enabled    51.4

SSE41    disabled    46.1
SSE41     enabled    51.6

SSE42    disabled    46.1
SSE42     enabled    51.6


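1) In every configuration tested, flac is faster when precompute_partition_info_sums_32bit_asm_ia32_() is disabled (by 1.2 to 5.7 seconds).

2) Some C code benefits noticeably from SSSE3 instructions, at least when compiled with GCC 4.8.1: with the function disabled, encoding time drops from 51.8 s (SSE3) to 45.7 s (SSSE3).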
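
To make the comparison easier to read, the cost of enabling the function can be worked out per instruction set from the timings in the table above. The following is only a small sketch: the numbers are copied from the table, while the names `results` and `slowdown` are made up for this illustration and are not part of flac.

```python
# Encoding times in seconds, copied from the table in this message:
# (instruction set, time with the asm function disabled, time with it enabled)
results = [
    ("no SSE", 53.9, 55.2),
    ("SSE1", 53.9, 55.3),
    ("SSE2", 51.9, 53.1),
    ("SSE3", 51.8, 53.2),
    ("SSSE3", 45.7, 51.4),
    ("SSE41", 46.1, 51.6),
    ("SSE42", 46.1, 51.6),
]


def slowdown(disabled, enabled):
    """Return (extra seconds, extra percent) spent when the function is enabled."""
    delta = enabled - disabled
    return delta, 100.0 * delta / disabled


# Print one line per instruction set: absolute and relative slowdown.
for name, disabled, enabled in results:
    delta, percent = slowdown(disabled, enabled)
    print(f"{name:<8}{delta:5.1f} s {percent:6.1f} %")
```

The relative figure is taken against the "disabled" time, so it reads as "how much longer encoding takes once the asm function is switched on".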
