[flac-dev] About SSE intrinsics in decoder
olivier tristan
o.tristan at uvi.net
Tue Jul 5 07:41:19 UTC 2022
Your benchmark does not mention the SSE 4.1 version.
Have you tried that case?
Thanks!
On 04/07/2022 at 19:23, Martijn van Beurden wrote:
> On Mon, 4 Jul 2022 at 15:06, olivier tristan <o.tristan at uvi.net> wrote:
>
> While I can understand the rationale for removing the hand-written
> assembly, as 32-bit x86 is dead, it seems a much bigger deal to
> remove all optimizations, including the intrinsics ones.
>
>
> Yes, it does seem a great deal to remove all optimization, but it
> really isn't. See the pull request associated with that change for
> more information: https://github.com/xiph/flac/pull/347 I did quite a
> bit of testing before merging this change, on two different CPUs, each
> with 3 different compilers, each with 4 variants of the
> non-intrinsics-accelerated functions. It turns out that there is no
> performance loss at all, and in many cases this change makes flac
> actually faster, not slower as one would expect.
>
> Maybe there should be an opt-in if you don't want them included by
> default, but some people, including me, don't want to see those
> optimizations removed?
>
>
> There would be no advantage to that over keeping the original code: it
> still needs to be maintained and tested, even if it is hidden behind
> some configuration option. The only case where this patch could be
> problematic in terms of speed is when one compiles flac to be used on
> CPUs that do not support SSE2.
--
Olivier Tristan
Research & Development
www.uvi.net