[flac-dev] About SSE intrinsics in decoder

olivier tristan o.tristan at uvi.net
Tue Jul 5 07:41:19 UTC 2022


Your benchmark does not mention the SSE 4.1 version.

Have you tried this use case?

Thanks!

On 04/07/2022 at 19:23, Martijn van Beurden wrote:
> On Mon 4 Jul 2022 at 15:06, olivier tristan <o.tristan at uvi.net> wrote:
>
>     While I can understand the rationale for removing the manual
>     assembly, as 32-bit x86 is dead, it seems a great deal to remove
>     all optimizations, including the intrinsic ones.
>
>
> Yes, it does seem a great deal to remove all optimization, but it 
> really isn't. See the pull request associated with that change for 
> more information: https://github.com/xiph/flac/pull/347
> I did quite a bit of testing before merging this change, on two
> different CPUs, each with 3 different compilers, each with 4 variants
> of the non-intrinsics-accelerated functions. It turns out that there is
> no performance loss at all, and in many cases this change actually
> makes flac faster, not slower as one would expect.
>
>     Maybe there should be an opt-in if you don't want them to be
>     included by default, but some people, including me, don't want to
>     see those optimizations removed?
>
>
> There would be no advantage of that over keeping the original code: it 
> still needs to be maintained and tested, even if it is hidden behind 
> some configuration option. The only case where this patch could be 
> problematic in terms of speed is when one compiles flac to be used on 
> CPUs that do not support SSE2.

-- 
Olivier Tristan
Research & Development
www.uvi.net