[flac-dev] About SSE intrinsics in decoder
    Martijn van Beurden 
    mvanb1 at gmail.com
       
    Mon Jul  4 17:23:26 UTC 2022
    
    
  
On Mon, 4 Jul 2022 at 15:06, olivier tristan <o.tristan at uvi.net> wrote:
> While I can understand the rationale for removing the manual assembly, as
> 32-bit x86 is dead, removing all optimizations, including the intrinsics
> ones, seems like a much bigger deal.
>
Yes, removing all optimization does seem like a big deal, but it really
isn't. See the pull request associated with that change for more
information: https://github.com/xiph/flac/pull/347. I did quite a bit of
testing before merging this change: on two different CPUs, each with 3
different compilers, each with 4 variants of the non-intrinsics-accelerated
functions. It turns out that there is no performance loss at all, and in
many cases this change actually makes flac faster, not slower as one would
expect.
> Maybe there should be an opt-in if you don't want them to be included by
> default, but some people, including me, don't want to see those
> optimizations removed?
>
That would have no advantage over keeping the original code: it would
still need to be maintained and tested, even if it is hidden behind some
configuration option. The only case where this patch could be problematic
in terms of speed is when one compiles flac to be used on CPUs that do not
support SSE2.
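
To illustrate the SSE2 remark, here is a minimal sketch. It is not code from
libFLAC: both function names are hypothetical, and the loop is only a
stand-in for the kind of plain C function that no longer has an intrinsics
variant. The assumption is that when a build targets an SSE2-capable
baseline (as x86-64 builds do), the compiler is free to use SSE2 for such a
loop on its own, while a build aimed at CPUs without SSE2 cannot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libFLAC: returns 1 if this file was
 * compiled with SSE2 as part of the baseline instruction set, else 0.
 * __SSE2__ is defined by GCC and Clang; _M_X64 and _M_IX86_FP by MSVC. */
static int built_with_sse2(void)
{
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical stand-in loop, not part of libFLAC: first-order difference
 * written in plain C, without intrinsics. Each iteration only reads the
 * input array, so the iterations are independent and an optimizing
 * compiler may vectorize the loop when the target allows it. */
static void first_order_residual(const int32_t *restrict data, size_t n,
                                 int32_t *restrict res)
{
    assert(data != NULL && res != NULL);
    if (n == 0)
        return;
    res[0] = data[0]; /* no previous sample to predict from */
    for (size_t i = 1; i < n; i++)
        res[i] = data[i] - data[i - 1];
}
```

Whether a given compiler actually vectorizes such a loop depends on the
compiler, its version and the optimization flags, so this only shows where
the dependence on the SSE2 baseline could come from, not what any particular
build does.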