[flac-dev] PATCH for replaygain_synthesis

Sun May 18 03:18:57 PDT 2014

I haven't benchmarked it to know whether there is any real benefit, but
changing the include in fast_float_math_hack.h to <mathimf.h> is all that
is required to use the latest ICC.

John

On 17/05/2014 10:26, lvqcl wrote:
> The file
> src/share/replaygain_synthesis/include/private/fast_float_math_hack.h
> redefines 'tanh' as 'tanhf'. This file is intended for Intel Compiler only,
> but it includes outdated mathf.h and doesn't work with current versions
> of ICC.
>
> The fixes are trivial though, and I compiled 2 versions of flac.exe, with
> this 'hack' turned off and on. The difference in decoding speed is within
> measurement error: for the 32-bit encoder the decoding time decreases from
> 94.5s to 94.0s, for the 64-bit one it increases from 82.6s to 82.9s.
> (The option for this test was:
> --apply-replaygain-which-is-not-lossless=Ln0)
>
> So this hack is really useless today, and the first patch removes
> fast_float_math_hack.h from the sources.
>
>
>
>
> The MSVS profiler shows that the tanh calculation doesn't require much
> CPU time; the real problem is an integer division (int_64/int_32) in
> this line:
>
>      val64 = dither_output_(........) / conv_factor;
>
> Since all possible values of conv_factor are powers of 2, it's possible to
> replace division with shift. The second patch does this.
>
> Decoding time decreases from 94.5s to 64.1s for 32-bit ICC compile, and
> from 82.6s to 50.0s for 64-bit ICC compile.
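
A minimal sketch of the division-to-shift replacement described above. The
helper names (log2_pow2, scale_div, scale_shift) are hypothetical and are not
taken from the patch; the sketch only assumes, as the message states, that the
divisor is always a power of 2. The shift count can be computed once, outside
the per-sample loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns n such that factor == (1 << n).
 * Assumes factor is a non-zero power of 2. */
static unsigned log2_pow2(uint32_t factor)
{
    unsigned n = 0;
    assert(factor != 0 && (factor & (factor - 1)) == 0);
    while ((factor >> n) > 1u)
        n++;
    return n;
}

/* Division form: 64-bit value divided by a 32-bit factor.
 * C integer division truncates toward zero. */
static int64_t scale_div(int64_t x, uint32_t factor)
{
    return x / (int64_t)factor;
}

/* Shift form. Right-shifting a negative signed value is
 * implementation-defined in C; common compilers use an arithmetic
 * shift, which rounds toward negative infinity. */
static int64_t scale_shift(int64_t x, unsigned shift)
{
    return x >> shift;
}
```

For non-negative x the two forms agree, e.g. scale_div(1000, 8) and
scale_shift(1000, log2_pow2(8)) both give 125. They can differ for negative x.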
>
>
>
> *************************************************
> P.S. Actually, shift ( x >> n ) and division ( x / (1<<n) ) can give
> different results if x < 0. The difference is very small though: WAV files
> differ by 1 LSB. And shift probably gives better results than division.
>
> Let's compare shift by 2 and division by (1<<2) == 4:
>
> *** shift ***
> argument            result
> ....
> 12, 13, 14, 15  ->    3
>   8,  9, 10, 11  ->    2
>   4,  5,  6,  7  ->    1
>   0,  1,  2,  3  ->    0
> -4, -3, -2, -1  ->   -1
> -8, -7, -6, -5  ->   -2
> ....
>
> *** division ***
> argument                       result
> ....
> 12, 13, 14, 15              ->    3
>   8,  9, 10, 11              ->    2
>   4,  5,  6,  7              ->    1
> -3, -2, -1,  0,  1,  2,  3  ->    0
> -7, -6, -5, -4              ->   -1
> -11,-10,-9, -8              ->   -2
> ....
>
>
> So, shift results in a small DC offset (1/2 LSB), while division results
> in a small 'nonlinearity' near 0.
>
>
> _______________________________________________
> flac-dev mailing list
> flac-dev at xiph.org
> http://lists.xiph.org/mailman/listinfo/flac-dev
>
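
The shift/division comparison in the P.S. can be reproduced with a small
sketch like the one below. The function names are hypothetical. It assumes the
compiler implements >> on negative signed integers as an arithmetic shift,
which the C standard leaves implementation-defined but common compilers do.

```c
#include <stdio.h>

/* Right shift: rounds toward negative infinity on typical compilers. */
static int by_shift(int x, unsigned n)
{
    return x >> n;
}

/* Division by 2^n: C integer division truncates toward zero. */
static int by_division(int x, unsigned n)
{
    return x / (1 << n);
}

/* Prints both results for every x in [lo, hi], marks rows where they
 * differ with '*', and returns the number of differing arguments. */
static int compare_range(int lo, int hi, unsigned n)
{
    int diffs = 0;
    for (int x = lo; x <= hi; x++) {
        int s = by_shift(x, n);
        int d = by_division(x, n);
        if (s != d)
            diffs++;
        printf("%4d  shift: %3d  division: %3d%s\n",
               x, s, d, s != d ? "  *" : "");
    }
    return diffs;
}
```

Calling compare_range(-8, 15, 2) covers the range tabulated above. The two
results differ only for negative arguments that are not multiples of 4, and
there the shift result is lower by 1.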