[opus] [OPUS] celt_inner_prod() and dual_inner_prod() NEON intrinsics

Linfeng Zhang linfengz at google.com
Tue Jun 6 21:04:38 UTC 2017


Thanks, Jonathan and Jean-Marc!

I have attached the new patch set as inner_prod_5patches_v3.zip.

The Chromebook I'm using is
Chromebook 13
CB5-311 series
RMN: Z3ENN

CPU info:

$ cat /proc/cpuinfo
processor : 0
model name : ARMv7 Processor rev 3 (v7l)
BogoMIPS : 2.31
Features : swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x3
CPU part : 0xc0f
CPU revision : 3

Hardware : NVIDIA Tegra SoC (Flattened Device Tree)
Revision : 0000
Serial : 0000000000000000

Thanks,
Linfeng

On Tue, Jun 6, 2017 at 1:15 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:

> Hi Linfeng,
>
> On 06/06/17 04:09 PM, Jonathan Lennox wrote:
> > Two comments on the various infrastructure for RTCD etc.
> >
> > 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions,
> > but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s
> > correspondingly.  I suspect the ‘arch’ parameter can just be ignored
> > by the assembly functions, but at least the comments in that file
> > should be updated to indicate the register that’s used to pass it in,
> > and that it’s ignored.
> >
> > 2. In the 0003- patch, you shouldn’t use the MAY_HAVE_NEON macro in
> > your new arm_celt_map tables, for the same reason we didn’t want it
> > in the arm_silk_map tables.
>
> I have no further issues with your patches, so once you address the two
> issues Jonathan pointed out, I'll be able to merge them.
>
> Cheers,
>
>         Jean-Marc
>
> >
> > Out of curiosity, what’s the CPU in the Chromebook you’re using to
> > test?
> >
> >> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <linfengz at google.com>
> >> wrote:
> >>
> >> Hi,
> >>
> >> Attached are 5 patches related to celt_inner_prod() and
> >> dual_inner_prod() NEON intrinsics optimization.
> >>
> >> In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch,
> >> the optimization changed the order of floating-point inner
> >> products, which will change the results. I created
> >> celt_inner_prod_neon_float_c_simulation() and
> >> dual_inner_prod_neon_float_c_simulation() to simulate the order of
> >> floating-point operations in the NEON optimization and compare their
> >> results. Sorry that I cannot bound the distance between the original C
> >> function and the NEON function by any reasonably small number or
> >> ratio. It's easy to create an input for which 0 and 1,000 are both
> >> correct results just by manipulating the inner product order.
> >>
> >> The total speed gain is about 1.0% for the fixed-point encoder and
> >> 1.8% for the floating-point encoder at complexity 8, tested on my
> >> Chromebook.
> >>
> >> Thanks, Linfeng
> >> <0005-Clean-celt_pitch_xcorr_float_neon.patch>
> >> <0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch>
> >> <0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch>
> >> <0002-Replace-call-of-celt_inner_prod_c-step-2.patch>
> >> <0001-Replace-call-of-celt_inner_prod_c-step-1.patch>
> >> _______________________________________________
> >> opus mailing list
> >> opus at xiph.org http://lists.xiph.org/mailman/listinfo/opus
>
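
The ordering issue described in the quoted message can be illustrated with a small sketch. This is not the code from the patches: the function names, the sample input, and the choice of a 4-lane accumulation followed by a pairwise reduction are assumptions made for illustration only. It assumes float arithmetic is evaluated in IEEE single precision without extra intermediate precision and without -ffast-math.

```c
#include <stddef.h>

/* Hypothetical reference: accumulate products strictly left to right,
 * the way a plain C loop does. */
static float inner_prod_sequential(const float *x, const float *y, int n)
{
    float sum = 0.0f;
    int i;
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}

/* Hypothetical simulation of a 4-lane SIMD loop: each lane keeps its own
 * partial sum, and the four partial sums are combined pairwise at the end.
 * Leftover elements (n not a multiple of 4) are added to lane 0. */
static float inner_prod_4lane(const float *x, const float *y, int n)
{
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    float lo, hi;
    int i, k;
    for (i = 0; i + 3 < n; i += 4) {
        for (k = 0; k < 4; k++) {
            lane[k] += x[i + k] * y[i + k];
        }
    }
    for (; i < n; i++) {
        lane[0] += x[i] * y[i];
    }
    lo = lane[0] + lane[1];
    hi = lane[2] + lane[3];
    return lo + hi;
}

/* Sample input chosen so that rounding depends on the summation order.
 * The exact inner product of demo_x and demo_y is 2. */
static const float demo_x[4] = {1e8f, 1.0f, -1e8f, 1.0f};
static const float demo_y[4] = {1.0f, 1.0f, 1.0f, 1.0f};
```

Under the stated assumptions, inner_prod_sequential(demo_x, demo_y, 4) returns 1 and inner_prod_4lane(demo_x, demo_y, 4) returns 0, while the exact answer is 2. Neither order is "wrong"; each is a valid rounding of the same sum, which is why a bit-exact comparison is only meaningful against a C routine that reproduces the SIMD operation order.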
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inner_prod_5patches_v3.zip
Type: application/zip
Size: 11294 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/opus/attachments/20170606/8af9ff04/attachment-0001.zip>