[flac-dev] PATCH for replaygain_synthesis

Sat May 17 02:26:17 PDT 2014

The file src/share/replaygain_synthesis/include/private/fast_float_math_hack.h
redefines 'tanh' as 'tanhf'. This file is intended for the Intel Compiler only,
but it includes the outdated mathf.h header and doesn't work with current
versions of ICC.

The fixes are trivial though, and I compiled 2 versions of flac.exe: with this
'hack' turned off and on. The difference in decoding speed is within
measurement error: for the 32-bit build the decoding time decreases from 94.5s
to 94.0s, for the 64-bit build it increases from 82.6s to 82.9s.
(the option for this test was: --apply-replaygain-which-is-not-lossless=Ln0)

So this hack is really useless today, and the first patch removes
fast_float_math_hack.h from the sources.

The MSVS profiler shows that the tanh calculation doesn't take much CPU time;
the real problem is the integer division (int_64/int_32) in this line:

     val64 = dither_output_(........) / conv_factor;

Since all possible values of conv_factor are powers of 2, it's possible to
replace division with shift. The second patch does this.

With this change, decoding time decreases from 94.5s to 64.1s for the 32-bit
ICC compile, and from 82.6s to 50.0s for the 64-bit ICC compile.
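
For illustration, here is a minimal sketch of the idea in C. It is not the
code from the attached patch: the names log2_of_pow2, scale_by_division and
scale_by_shift are hypothetical, and the shift version assumes that the
compiler implements >> on negative signed values as an arithmetic shift
(implementation-defined in C, but the usual behaviour).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns n such that factor == 1 << n.
 * The factor must be a non-zero power of 2. */
static unsigned log2_of_pow2(uint32_t factor)
{
    unsigned shift = 0;
    assert(factor != 0 && (factor & (factor - 1)) == 0);
    while ((factor >> shift) != 1u)
        shift++;
    return shift;
}

/* Division form: a 64-bit by 32-bit signed division on every sample. */
static int64_t scale_by_division(int64_t sample, uint32_t conv_factor)
{
    return sample / (int64_t)conv_factor;
}

/* Shift form: the shift count is computed once, outside the sample loop.
 * Assumes arithmetic right shift for negative samples. */
static int64_t scale_by_shift(int64_t sample, unsigned shift)
{
    return sample >> shift;
}
```

The shift count only has to be derived once per conv_factor, so the
per-sample cost drops to a single shift. For non-negative samples both forms
return the same value.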

*************************************************
P.S. Actually, shift ( x >> n ) and division ( x / (1<<n) ) can give
different results if x < 0. The difference is very small though: WAV files
differ by 1 LSB. And shift probably gives better results than division.

Let's compare shift by 2 and division by (1<<2) == 4:

*** shift ***
argument            result
....
12, 13, 14, 15  ->    3
  8,  9, 10, 11  ->    2
  4,  5,  6,  7  ->    1
  0,  1,  2,  3  ->    0
-4, -3, -2, -1  ->   -1
-8, -7, -6, -5  ->   -2
....

*** division ***
argument                       result
....
12, 13, 14, 15              ->    3
  8,  9, 10, 11              ->    2
  4,  5,  6,  7              ->    1
-3, -2, -1,  0,  1,  2,  3  ->    0
-7, -6, -5, -4              ->   -1
-11,-10,-9, -8              ->   -2
....

So, shift rounds toward negative infinity, which results in a small DC offset
(about 1/2 LSB); division truncates toward zero, which results in a small
'nonlinearity' near 0.
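
The two tables can be reproduced with a short C sketch. The function names
(shift_pow2, div_pow2, count_mismatches) are made up for this example, and
the shift again assumes an arithmetic right shift for negative values.

```c
#include <assert.h>

/* Right shift by n: rounds toward negative infinity
 * (assuming an arithmetic shift for negative x). */
static int shift_pow2(int x, unsigned n)
{
    return x >> n;
}

/* Division by 2^n: C99 integer division truncates toward zero. */
static int div_pow2(int x, unsigned n)
{
    return x / (1 << n);
}

/* Counts the inputs in [lo, hi] for which the two methods disagree. */
static int count_mismatches(int lo, int hi, unsigned n)
{
    int x, count = 0;
    assert(lo <= hi);
    for (x = lo; x <= hi; x++)
        if (shift_pow2(x, n) != div_pow2(x, n))
            count++;
    return count;
}
```

For n = 2, the two functions disagree only for negative inputs that are not
multiples of 4 (for example, -1 gives -1 with shift and 0 with division), and
the results never differ by more than 1.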
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1__remove_ffmhack.patch
Type: application/octet-stream
Size: 2594 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20140517/e2f5dcb0/attachment.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2__apply_gain.patch
Type: application/octet-stream
Size: 3073 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/flac-dev/attachments/20140517/e2f5dcb0/attachment-0001.obj