[opus] [OPUS] celt_inner_prod() and dual_inner_prod() NEON intrinsics

Jean-Marc Valin jmvalin at jmvalin.ca
Tue Jun 6 20:15:05 UTC 2017

Hi Linfeng,

On 06/06/17 04:09 PM, Jonathan Lennox wrote:
> Two comments on the various infrastructure for RTCD etc.
> 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions,
> but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s
> correspondingly.  I suspect the ‘arch’ parameter can just be ignored
> by the assembly functions, but at least the comments in that file
> should be updated to indicate the register that’s used to pass it in,
> and that it’s ignored.
> 2. In the 0003- patch, you shouldn’t use the MAY_HAVE_NEON macro in
> your new arm_celt_map tables, for the same reason we didn’t want it
> in the arm_silk_map tables.

I have no further issues with your patches, so once you address the two
issues Jonathan pointed out, I'll be able to merge them.
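
For readers unfamiliar with the RTCD point above, here is a minimal sketch of a function-pointer table indexed by an arch value, where each implementation accepts an `arch` argument and simply ignores it. All names here (`DEMO_ARCH_*`, `demo_inner_prod_*`, `DEMO_INNER_PROD_IMPL`) are hypothetical and chosen for illustration only; this is not the actual Opus table layout or function signature, and the "NEON" entry is a plain C stand-in.

```c
#include <stddef.h>

/* Hypothetical arch levels; the count is a power of two so it can be masked. */
enum { DEMO_ARCH_C = 0, DEMO_ARCH_EDSP, DEMO_ARCH_MEDIA, DEMO_ARCH_NEON,
       DEMO_ARCH_COUNT };

/* Every implementation shares one signature, including the arch argument. */
typedef float (*demo_inner_prod_fn)(const float *x, const float *y,
                                    size_t n, int arch);

/* Plain C implementation: arch is accepted but deliberately ignored. */
static float demo_inner_prod_c(const float *x, const float *y,
                               size_t n, int arch)
{
    float sum = 0.0f;
    (void)arch;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Stand-in for an optimized version (assumption: real code would use
 * NEON intrinsics here). It also ignores arch. */
static float demo_inner_prod_neon_stub(const float *x, const float *y,
                                       size_t n, int arch)
{
    (void)arch;
    return demo_inner_prod_c(x, y, n, DEMO_ARCH_C);
}

/* Table entries are named explicitly, one per arch level. */
static const demo_inner_prod_fn DEMO_INNER_PROD_IMPL[DEMO_ARCH_COUNT] = {
    demo_inner_prod_c,         /* DEMO_ARCH_C     */
    demo_inner_prod_c,         /* DEMO_ARCH_EDSP  */
    demo_inner_prod_c,         /* DEMO_ARCH_MEDIA */
    demo_inner_prod_neon_stub  /* DEMO_ARCH_NEON  */
};

/* Dispatcher: masks arch into range, then forwards it unchanged. */
float demo_inner_prod(const float *x, const float *y, size_t n, int arch)
{
    return DEMO_INNER_PROD_IMPL[arch & (DEMO_ARCH_COUNT - 1)](x, y, n, arch);
}
```

Calling `demo_inner_prod(x, y, n, DEMO_ARCH_NEON)` and `demo_inner_prod(x, y, n, DEMO_ARCH_C)` goes through different table slots but, in this sketch, yields the same value, since the callee never reads `arch`.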



> Out of curiosity, what’s the CPU in the Chromebook you’re using to
> test?
>> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <linfengz at google.com>
>> wrote:
>> Hi,
>> Attached are 5 patches related to celt_inner_prod() and
>> dual_inner_prod() NEON intrinsics optimization.
>> In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch,
>> the optimization changes the order of the floating-point inner
>> product accumulation, which changes the results. I created
>> celt_inner_prod_neon_float_c_simulation() and
>> dual_inner_prod_neon_float_c_simulation() to simulate the order of
>> floating-point operations in the NEON optimization and compare their
>> results. Sorry that I cannot bound the distance between the original
>> C function and the NEON function by any reasonably small number or
>> ratio. It's easy to create an input for which 0 and 1,000 are both
>> correct results just by manipulating the inner product order.
>> The total speed gain is about 1.0% for the fixed-point encoder and
>> 1.8% for the floating-point encoder, at complexity 8, tested on my
>> Chromebook.
>> Thanks, Linfeng 
>> <0005-Clean-celt_pitch_xcorr_float_neon.patch>
>> <0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch>
>> <0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch>
>> <0002-Replace-call-of-celt_inner_prod_c-step-2.patch>
>> <0001-Replace-call-of-celt_inner_prod_c-step-1.patch>
>> _______________________________________________
>> opus mailing list
>> opus at xiph.org http://lists.xiph.org/mailman/listinfo/opus
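
To illustrate Linfeng's remark that 0 and 1,000 can both be "correct" depending on accumulation order, here is a minimal sketch in plain C. It compares a sequential loop against a four-lane accumulation of the kind a 4-wide SIMD unit would naturally perform. The function names, the lane count, and the final lane-reduction order are assumptions made for illustration; this is not the code from the patches, nor the celt_inner_prod_neon_float_c_simulation() function itself.

```c
#include <stddef.h>

/* Sequential accumulation: the order a plain C loop uses. */
float inner_prod_sequential(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Four-lane accumulation: element i goes into partial sum (i % 4),
 * and the four partial sums are combined at the end. The pairwise
 * reduction order below is an assumption, not taken from the patch. */
float inner_prod_four_lanes(const float *x, const float *y, size_t n)
{
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; k++)
            lane[k] += x[i + k] * y[i + k];
    float sum = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    /* Leftover elements (n not a multiple of 4) are added sequentially. */
    for (; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
```

With x = {1e30f, 1000.0f, 0, 0, -1e30f, 0, 0, 0} and y all ones, the sequential version absorbs the 1000 into 1e30 before the cancellation and returns 0, while the four-lane version cancels 1e30 against -1e30 in lane 0 first and returns 1000. Neither answer is wrong as a floating-point evaluation, which is why no small bound on the difference between the two orders exists in general.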
