<div dir="ltr">Thanks, Jonathan and Jean-Marc!<div><br></div><div>I've attached the new patch set as inner_prod_5patches_v3.zip.</div><div><br></div><div>The Chromebook I'm using is</div><div>Chromebook 13</div><div>CB5-311 series</div><div>RMN: Z3ENN</div><div><br></div><div>CPU info:</div><div><br></div><div><div>$ cat /proc/cpuinfo</div><div>processor<span class="gmail-Apple-tab-span" style="white-space:pre">      </span>: 0</div><div>model name<span class="gmail-Apple-tab-span" style="white-space:pre">  </span>: ARMv7 Processor rev 3 (v7l)</div><div>BogoMIPS<span class="gmail-Apple-tab-span" style="white-space:pre">  </span>: 2.31</div><div>Features<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae </div><div>CPU implementer<span class="gmail-Apple-tab-span" style="white-space:pre">       </span>: 0x41</div><div>CPU architecture: 7</div><div>CPU variant<span class="gmail-Apple-tab-span" style="white-space:pre">    </span>: 0x3</div><div>CPU part<span class="gmail-Apple-tab-span" style="white-space:pre">  </span>: 0xc0f</div><div>CPU revision<span class="gmail-Apple-tab-span" style="white-space:pre">    </span>: 3</div><div><br></div><div>Hardware<span class="gmail-Apple-tab-span" style="white-space:pre">   </span>: NVIDIA Tegra SoC (Flattened Device Tree)</div><div>Revision<span class="gmail-Apple-tab-span" style="white-space:pre">     </span>: 0000</div><div>Serial<span class="gmail-Apple-tab-span" style="white-space:pre">           </span>: 0000000000000000</div></div><div><br></div><div>Thanks,<br></div><div>Linfeng</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 6, 2017 at 1:15 PM, Jean-Marc Valin <span dir="ltr"><<a href="mailto:jmvalin@jmvalin.ca" target="_blank">jmvalin@jmvalin.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Linfeng,<br>

<span class=""><br>

On 06/06/17 04:09 PM, Jonathan Lennox wrote:<br>

> Two comments on the various infrastructure for RTCD etc.<br>

><br>

> 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions,<br>

> but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.<wbr>s<br>

> correspondingly.  I suspect the ‘arch’ parameter can just be ignored<br>

> by the assembly functions, but at least the comments in that file<br>

> should be updated to indicate the register that’s used to pass it in,<br>

> and that it’s ignored.<br>

><br>

> 2. In the 0003- patch, you shouldn’t use the MAY_HAVE_NEON macro in<br>

> your new arm_celt_map tables, for the same reason we didn’t want it<br>

> in the arm_silk_map tables.<br>

<br>

</span>I have no further issues with your patches, so once you address the two<br>

issues Jonathan pointed out, I'll be able to merge them.<br>

<br>

Cheers,<br>

<br>

        Jean-Marc<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

> Out of curiosity, what’s the CPU in the Chromebook you’re using to<br>

> test?<br>

><br>

>> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <<a href="mailto:linfengz@google.com">linfengz@google.com</a>><br>

>> wrote:<br>

>><br>

>> Hi,<br>

>><br>

>> Attached are 5 patches related to celt_inner_prod() and<br>

>> dual_inner_prod() NEON intrinsics optimization.<br>

>><br>

>> In 0004-Optimize-floating-point-<wbr>celt_inner_prod-and-dual_inn.<wbr>patch,<br>

>> the optimization changed the order of floating-point inner<br>

>> products, which will change the results. I created<br>

>> celt_inner_prod_neon_float_c_<wbr>simulation() and<br>

>> dual_inner_prod_neon_float_c_<wbr>simulation() to simulate the order of<br>

>> floating-point operations in the NEON optimization and compare their<br>

>> results. Sorry that I cannot bound the distance between the original C<br>

>> function and the NEON function by any given reasonably small number or<br>

>> ratio. It's easy to create an input for which 0 and 1,000 are both<br>

>> correct results, just by manipulating the inner product order.<br>

>><br>

>> The total speed gain is about 1.0% for the fixed-point encoder and<br>

>> 1.8% for the floating-point encoder at complexity 8, tested on my<br>

>> Chromebook.<br>

>><br>

>> Thanks, Linfeng<br>

>> <0005-Clean-celt_pitch_xcorr_<wbr>float_neon.patch><0004-<wbr>Optimize-floating-point-celt_<wbr>inner_prod-and-dual_inn.patch><wbr><0003-Optimize-fixed-point-<wbr>celt_inner_prod-and-dual_<wbr>inner_.patch><0002-Replace-<wbr>call-of-celt_inner_prod_c-<wbr>step-2.patch><0001-Replace-<wbr>call-of-celt_inner_prod_c-<wbr>step-1.patch><br>

>><br>

>> _________________<wbr>______________________________<br>

>> opus mailing list<br>

>> <a href="mailto:opus@xiph.org">opus@xiph.org</a> <a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/mailman/<wbr>listinfo/opus</a><br>

><br>

</div></div></blockquote></div><br></div>