[opus] [OPUS] celt_inner_prod() and dual_inner_prod() NEON intrinsics

Jean-Marc Valin jmvalin at jmvalin.ca
Mon Jun 5 19:49:35 UTC 2017


On 05/06/17 03:28 PM, Linfeng Zhang wrote:
> For fixed-point ARM, only
> 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch changes
> the performance.
> For floating-point ARM, only
> 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes
> the performance.

Got any numbers?

Cheers,

	Jean-Marc

> Patches 1 and 2 are code clean-up and can only affect x86 performance.
> Patch 5 has a negligible effect on floating-point ARM performance.
> 
> Thanks,
> Linfeng
> 
> On Fri, Jun 2, 2017 at 11:26 AM, Jean-Marc Valin <jmvalin at jmvalin.ca
> <mailto:jmvalin at jmvalin.ca>> wrote:
> 
>     Hi Linfeng,
> 
>     I'll look into your patches. Can you let me know what's the expected
>     effect on performance (if any) for each of your patches? Also, are these
>     all the patches you intend to merge for 1.2 or are there more
>     upcoming ones?
> 
>     Cheers,
> 
>             Jean-Marc
> 
>     On 01/06/17 06:33 PM, Linfeng Zhang wrote:
>     > Hi,
>     >
>     > Attached are 5 patches related to celt_inner_prod()
>     > and dual_inner_prod() NEON intrinsics optimization.
>     >
>     > In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch,
>     > the optimization changed the order of operations in the
>     > floating-point inner products, which will change the results. I
>     > created celt_inner_prod_neon_float_c_simulation()
>     > and dual_inner_prod_neon_float_c_simulation() to simulate the order
>     > of floating-point operations in the NEON optimization and compare
>     > their results. Sorry that I cannot bound the distance between the
>     > original C function and the NEON function by any given reasonably
>     > small number or ratio. It's easy to create an input for which 0 and
>     > 1,000 are both correct results just by manipulating the inner
>     > product order.
>     >
>     > The total speed gain is about 1.0% for the fixed-point encoder and
>     > 1.8% for the floating-point encoder at Complexity 8, tested on my
>     > Chromebook.
>     >
>     > Thanks,
>     > Linfeng
>     >
>     >
>     >
> 
> 
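
The order sensitivity described in the quoted message can be illustrated
with a small C sketch. This is not the code from the patches: the function
names, the four-accumulator layout, and the test input below are all
assumptions chosen for illustration, and the actual NEON implementation may
combine its lanes differently. It assumes a build without -ffast-math, so
the compiler keeps the written order of additions.

```c
#include <stddef.h>

/* Reference order: a single accumulator, elements added left to right. */
float inner_prod_sequential(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Assumed SIMD-like order: four partial sums (one per lane), combined
 * pairwise at the end. Any leftover elements are added afterwards. */
float inner_prod_4lane(const float *x, const float *y, size_t n)
{
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (size_t lane = 0; lane < 4; lane++)
            acc[lane] += x[i + lane] * y[i + lane];
    float sum = (acc[0] + acc[1]) + (acc[2] + acc[3]);
    for (; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical input: the 1000 is absorbed by 1e30 in the sequential
 * order, but survives in lane 1 of the four-lane order. */
const float demo_x[8] = {1e30f, 1000.0f, 0.0f, 0.0f, -1e30f, 0.0f, 0.0f, 0.0f};
const float demo_y[8] = {1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f};
```

With demo_x and demo_y, inner_prod_sequential() returns 0 and
inner_prod_4lane() returns 1000. Both are valid floating-point evaluations
of the same inner product, which is why no small absolute or relative bound
holds between the two orders for arbitrary input.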

