> It turns out that int64 shift is quite slow...
> This patch changes the code from:
> 	(FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
> into:
> 	_mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));
> Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.
> The new code works only if lp_quantization <= 32, but its max value is 15, so the code always works.
> (max_shiftlimit == (1 << (FLAC__SUBFRAME_LPC_QLP_SHIFT_LEN-1)) - 1  == 15)

I think you forgot to attach the patch for this one :-).
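
In the meantime, here is a minimal plain-C sketch of why I would expect the <= 32 limit to hold. It is not the patch and uses neither the SSE2 intrinsics nor any FLAC code; shift_arith, shift_logical and count_mismatches are names made up for illustration. It assumes that >> on a negative int64_t is an arithmetic shift and that out-of-range conversions to int32_t wrap, both of which are implementation-defined in C but hold on the usual compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Old code path: arithmetic 64-bit shift, then truncate to 32 bits. */
int32_t shift_arith(int64_t v, int n)
{
    assert(n >= 0 && n < 64);
    return (int32_t)(v >> n);
}

/* Models the new code path: logical (zero-filling) 64-bit shift, then
 * keep only the low 32 bits of the result. */
int32_t shift_logical(int64_t v, int n)
{
    assert(n >= 0 && n < 64);
    return (int32_t)(uint32_t)((uint64_t)v >> n);
}

/* Count how often the two versions disagree over a few sample values,
 * for every shift amount from 0 to max_shift inclusive. */
int count_mismatches(int max_shift)
{
    static const int64_t samples[] = {
        0, 1, -1, INT64_MIN, INT64_MAX,
        INT64_C(0x123456789ABCDEF0), -INT64_C(0x123456789ABCDEF0)
    };
    int mismatches = 0;

    for (int n = 0; n <= max_shift; n++)
        for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
            if (shift_arith(samples[i], n) != shift_logical(samples[i], n))
                mismatches++;

    return mismatches;
}
```

count_mismatches(32) returns 0, while count_mismatches(33) does not: for a shift of n, the low 32 bits of the result are bits n..n+31 of the input, and only once n exceeds 32 does that window reach past bit 63, where the arithmetic shift fills in sign bits and the logical shift fills in zeros.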

