<div dir="ltr">Thanks, Jonathan and Jean-Marc!<div><br></div><div>I've attached the new patch set as inner_prod_5patches_v3.zip.</div><div><br></div><div>The Chromebook I'm using is</div><div>Chromebook 13</div><div>CB5-311 series</div><div>RMN: Z3ENN</div><div><br></div><div>CPU info:</div><div><br></div><div><div>$ cat /proc/cpuinfo</div><div>processor<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0</div><div>model name<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: ARMv7 Processor rev 3 (v7l)</div><div>BogoMIPS<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 2.31</div><div>Features<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: swp half thumb fastmult vfp edsp thumbee neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae </div><div>CPU implementer<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0x41</div><div>CPU architecture: 7</div><div>CPU variant<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0x3</div><div>CPU part<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0xc0f</div><div>CPU revision<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 3</div><div><br></div><div>Hardware<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: NVIDIA Tegra SoC (Flattened Device Tree)</div><div>Revision<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0000</div><div>Serial<span class="gmail-Apple-tab-span" style="white-space:pre"> </span>: 0000000000000000</div></div><div><br></div><div>Thanks,<br></div><div>Linfeng</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 6, 2017 at 1:15 PM, Jean-Marc Valin <span dir="ltr"><<a href="mailto:jmvalin@jmvalin.ca" target="_blank">jmvalin@jmvalin.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Linfeng,<br>
<span class=""><br>
On 06/06/17 04:09 PM, Jonathan Lennox wrote:<br>
> Two comments on the various infrastructure for RTCD etc.<br>
><br>
> 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions,<br>
> but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.<wbr>s<br>
> correspondingly. I suspect the ‘arch’ parameter can just be ignored<br>
> by the assembly functions, but at least the comments in that file<br>
> should be updated to indicate the register that’s used to pass it in,<br>
> and that it’s ignored.<br>
><br>
> 2. In the 0003- patch, you shouldn’t use the MAY_HAVE_NEON macro in<br>
> your new arm_celt_map tables, for the same reason we didn’t want it<br>
> in the arm_silk_map tables.<br>
<br>
</span>I have no further issues with your patches, so once you address the two<br>
issues Jonathan pointed out, I'll be able to merge them.<br>
<br>
Cheers,<br>
<br>
Jean-Marc<br>
<div class="HOEnZb"><div class="h5"><br>
><br>
> Out of curiosity, what’s the CPU in the Chromebook you’re using to<br>
> test?<br>
><br>
>> On Jun 1, 2017, at 6:33 PM, Linfeng Zhang <<a href="mailto:linfengz@google.com">linfengz@google.com</a>><br>
>> wrote:<br>
>><br>
>> Hi,<br>
>><br>
>> Attached are 5 patches related to celt_inner_prod() and<br>
>> dual_inner_prod() NEON intrinsics optimization.<br>
>><br>
>> In 0004-Optimize-floating-point-<wbr>celt_inner_prod-and-dual_inn.<wbr>patch,<br>
>> the optimization changed the order of floating-point inner<br>
>> products, which will change the results. I created<br>
>> celt_inner_prod_neon_float_c_<wbr>simulation() and<br>
>> dual_inner_prod_neon_float_c_<wbr>simulation() to simulate the order<br>
>> of floating-point operations in the NEON optimization and compare their<br>
>> results. Sorry that I cannot bound the distance between the original C<br>
>> function and the NEON function by any reasonably small number or<br>
>> ratio. It's easy to create an input for which 0 and 1,000 are both<br>
>> correct results, just by changing the inner product order.<br>
>><br>
>> The total speed gain is about 1.0% for fixed-point encoder, and<br>
>> 1.8% for floating-point encoder, in Complexity 8, tested on my<br>
>> Chromebook.<br>
>><br>
>> Thanks, Linfeng<br>
>> <0005-Clean-celt_pitch_xcorr_<wbr>float_neon.patch><br>
>> <0004-<wbr>Optimize-floating-point-celt_<wbr>inner_prod-and-dual_inn.patch><br>
>> <0003-Optimize-fixed-point-<wbr>celt_inner_prod-and-dual_<wbr>inner_.patch><br>
>> <0002-Replace-<wbr>call-of-celt_inner_prod_c-<wbr>step-2.patch><br>
>> <0001-Replace-<wbr>call-of-celt_inner_prod_c-<wbr>step-1.patch><br>
>> _________________<wbr>______________________________<br>
>> opus mailing list<br>
>> <a href="mailto:opus@xiph.org">opus@xiph.org</a> <a href="http://lists.xiph.org/mailman/listinfo/opus" rel="noreferrer" target="_blank">http://lists.xiph.org/mailman/<wbr>listinfo/opus</a><br>
><br>
</div></div></blockquote></div><br></div>