[flac-dev] Performance tests on POWER8 or POWER9

Martijn van Beurden mvanb1 at gmail.com
Sat Dec 3 09:49:56 UTC 2022


Hi all,

I've tried once again to get performance figures through Travis CI for
powerpc. It took quite a few tries, but I got a clean run here:
https://app.travis-ci.com/github/ktmf01/flac/builds/258454024

Three builds are compared: one with the default configuration, one
with arch-specific optimizations turned off, and one with both
arch-specific optimizations and associative math disabled (associative
math is required for autovectorization). At the bottom of each build
log there are 8 execution times of the same test. In many Travis runs
these figures varied widely, sometimes by as much as 100%. However,
the linked results fall within a very narrow margin, so they should be
dependable. They are also the lowest numbers I've found overall.
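For anyone who wants to repeat this kind of measurement locally, here
is a minimal sketch of how repeated timings could be collected and
checked for spread. It is not the script used in the Travis job; the
function names time_command and summarize are made up for this
illustration, and the command to run is whatever test you choose.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=8):
    """Run cmd `runs` times; return the wall-clock seconds of each run."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard the program's output so terminal I/O does not skew timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Return (fastest, median, spread); spread is (max - min) / min."""
    fastest = min(times)
    spread = (max(times) - fastest) / fastest
    return fastest, statistics.median(times), spread


# Hypothetical usage: pass the benchmark command as a list of arguments.
# fastest, median, spread = summarize(time_command(["./my-test", "input"]))
```

A spread of 1.0 here corresponds to runs varying by 100%; a run is
only worth comparing when the spread is small.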

The results are actually the inverse of what you'd expect. The build
with all (auto)vectorization disabled is the fastest with a time of
23.1 sec. The one with autovectorization (so with arch-specific
optimization disabled) is slower at 23.9 sec. The default build with
arch-specific optimization is the slowest at 24.0 sec. These
differences (less than 4% overall, and only 0.1 sec between the last
two builds) are too small to be conclusive.
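To put the three figures in proportion, the relative slowdowns can be
worked out as below. This is only arithmetic on the times quoted
above; the build labels and the slowdown_percent helper are my own
naming for this sketch.

```python
# Execution times in seconds, as quoted in this message.
times = {
    "no vectorization": 23.1,
    "autovectorization only": 23.9,
    "default (arch-specific)": 24.0,
}


def slowdown_percent(t, baseline):
    """Return how much slower t is than baseline, in percent."""
    return (t - baseline) / baseline * 100


baseline = min(times.values())
for name, t in times.items():
    print(f"{name}: {t:.1f} s, {slowdown_percent(t, baseline):.1f}% slower")
```

The two slower builds come out roughly 3.5% and 3.9% behind the
fastest one, and differ from each other by well under 1%.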

So, while there is some uncertainty as to which build is the fastest,
it seems the PPC-specific code brings no improvement whatsoever.

My plan is to remove all PPC-specific code unless someone can convince
me otherwise in the next few weeks. This removal has the added benefit
of reducing the amount of code that is not being regularly tested and
fuzzed.

Kind regards,

Martijn van Beurden

On Sat 26 Nov 2022 at 18:12, Martijn van Beurden <mvanb1 at gmail.com> wrote:

>
> Hi all,
>
> Last year the POWER8 and POWER9 specific improvements (for PowerPC)
> were completely rewritten, but as yet no accurate performance tests
> of these have been performed. I have validated functionality and done
> rough performance checks for these through Travis CI, but the numbers
> I get that way vary wildly.
>
> Recently the C code that these improvements mirror has been changed to
> allow compilers to autovectorize it. It would be nice if someone with
> access to POWER hardware could compare builds with and without the
> POWER-specific improvements. In other words, a plain build and one
> with either --disable-asm-optimizations (for the autotools build) or
> -DWITH_ASM=0 (for CMake).
>
> If anyone could do such a comparison, that would be great!
>
> Kind regards,
>
> Martijn van Beurden

