[opus] [OPUS] celt_inner_prod() and dual_inner_prod() NEON intrinsics
jmvalin at jmvalin.ca
Fri Jun 2 18:26:29 UTC 2017
I'll look into your patches. Can you let me know the expected effect
on performance (if any) of each patch? Also, are these all the patches
you intend to merge for 1.2, or are there more coming?
On 01/06/17 06:33 PM, Linfeng Zhang wrote:
> Attached are 5 patches related to celt_inner_prod()
> and dual_inner_prod() NEON intrinsics optimization.
> In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the
> optimization changes the order in which the floating-point inner
> product is accumulated, which changes the results. I
> created celt_inner_prod_neon_float_c_simulation()
> and dual_inner_prod_neon_float_c_simulation() to simulate the order of
> floating-point operations in the NEON optimization and compare the
> results. Sorry that I cannot bound the difference between the original C
> function and the NEON function by any given reasonably small number or
> ratio. It's easy to create an input for which 0 and 1,000 are both
> correct results, just by changing the order of the inner product.
> The total speed gain is about 1.0% for fixed-point encoder, and 1.8% for
> floating-point encoder, in Complexity 8, tested on my Chromebook.
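To illustrate the ordering issue described in the quoted message, here is a
minimal sketch in plain C. It is not taken from the patches: the function
names inner_prod_sequential() and inner_prod_4lane() are hypothetical, no
NEON intrinsics are used, and the four-partial-sum layout is only an
assumption about how a 4-wide vector accumulation typically proceeds.

```c
#include <stddef.h>

/* Reference order: a single accumulator, elements added left to right. */
float inner_prod_sequential(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Vector-style order (illustrative assumption): four independent partial
 * sums, one per lane, combined only at the end. For brevity, n is assumed
 * to be a multiple of 4; any leftover elements are ignored. */
float inner_prod_4lane(const float *x, const float *y, size_t n)
{
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (size_t i = 0; i + 3 < n; i += 4)
        for (size_t k = 0; k < 4; k++)
            lane[k] += x[i + k] * y[i + k];
    /* Horizontal reduction of the four lanes. */
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

With x = {1e8f, 1, 1, 1, -1e8f, 1, 1, 1} and y all ones, the sequential
version returns 3 and the 4-lane version returns 6 on a target that
evaluates float arithmetic in single precision. In the sequential order the
first three 1s are absorbed by 1e8f before it is cancelled, while in the
lane order 1e8f and -1e8f cancel within the same lane. Scaling the small
terms makes the gap as large as desired, which is why no fixed tolerance
covers both orders.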