<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class="">Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler supports, and the CPU may support, Neon assembly code, which isn’t necessarily the same thing as the compiler supporting Neon intrinsics.  (The Visual Studio ARM compiler, for
 instance, supports intrinsics but not assembly.)  So I don’t think this patch is the right solution.</div>
<div class=""><br class="">
</div>
<div class="">Instead, I think the problem is actually that silk/arm/arm_silk_map.c uses the MAY_HAVE_NEON macro, which it shouldn’t be using.  If that file were changed so that the jump tables just listed the _neon versions of the functions directly, you’d
 get the speedup you’re looking for.</div>
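<div class=""><br class="">
</div>
<div class="">To make that concrete, here is a minimal sketch of the difference. It is not the actual contents of silk/arm/arm_silk_map.c: demo_kernel_c, demo_kernel_neon and both tables are made-up names, each function just returns a tag saying which implementation ran, and the macro chain is a simplified model of what the preprocessor ends up with when none of the OPUS_ARM_MAY_HAVE_EDSP/MEDIA/NEON macros are defined.</div>
<pre class="">
```c
/* Simplified model of the fallback chain when no assembly support is
   configured: every MAY_HAVE_* macro ends up naming the C version. */
#define MAY_HAVE_EDSP(name)  name ## _c
#define MAY_HAVE_MEDIA(name) MAY_HAVE_EDSP(name)
#define MAY_HAVE_NEON(name)  MAY_HAVE_MEDIA(name)

/* Hypothetical stand-ins; the return value only tags which one ran. */
static int demo_kernel_c(void)    { return 0; }  /* plain C path */
static int demo_kernel_neon(void) { return 1; }  /* Neon intrinsics path */

typedef int (*demo_kernel_fn)(void);

/* Table built through the macro: the Neon slot expands to demo_kernel_c. */
static demo_kernel_fn const TABLE_VIA_MACRO[4] = {
  demo_kernel_c,                /* ARMv4 */
  MAY_HAVE_EDSP(demo_kernel),   /* EDSP  */
  MAY_HAVE_MEDIA(demo_kernel),  /* Media */
  MAY_HAVE_NEON(demo_kernel)    /* Neon  */
};

/* Table that names the intrinsics version directly in the Neon slot. */
static demo_kernel_fn const TABLE_DIRECT[4] = {
  demo_kernel_c,     /* ARMv4 */
  demo_kernel_c,     /* EDSP  */
  demo_kernel_c,     /* Media */
  demo_kernel_neon   /* Neon  */
};

/* arch is the index chosen at run time; mask it to stay inside the table. */
int run_via_macro(int arch) { return TABLE_VIA_MACRO[arch & 3](); }
int run_direct(int arch)    { return TABLE_DIRECT[arch & 3](); }
```
</pre>
<div class="">With arch set to the Neon index (3 in this sketch), run_via_macro still reaches the C function, while run_direct reaches the intrinsics one.</div>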
<div class=""><br class="">
</div>
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Jun 1, 2017, at 6:03 PM, Linfeng Zhang <<a href="mailto:linfengz@google.com" class="">linfengz@google.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">Thank Jean-Mark and Jonathan!
<div class=""><br class="">
</div>
<div class="">I tested current OPUS encoder in floating-point with Complexity 8. Hacking using the attached patch (which will generate "#define OPUS_ARM_MAY_HAVE_NEON 1" in config.h) will speed up about 14.7% on my Chromebook. Probably it's because many NEON
 intrinsics optimizations can benefit both fixed-point and floating-point encoder.</div>
<div class=""><br class="">
</div>
<div class="">So if it's safe enough to enable MAY_HAVE_NEON in floating-point by default, it could speed up floating-point NEON encoder a little bit.</div>
<div class=""><br class="">
</div>
<div class="">Thanks,</div>
<div class="">Linfeng</div>
</div>
<div class="gmail_extra"><br class="">
<div class="gmail_quote">On Thu, Jun 1, 2017 at 2:22 PM, Jonathan Lennox <span dir="ltr" class="">
<<a href="mailto:jonathan@vidyo.com" target="_blank" class="">jonathan@vidyo.com</a>></span> wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word" class="">
<div style="word-wrap:break-word" class="">
<div class="">
<div class="h5">
<div class="">
<blockquote type="cite" class="">
<div class=""><br class="">
On May 31, 2017, at 12:47 PM, Linfeng Zhang &lt;<a href="mailto:linfengz@google.com" target="_blank" class="">linfengz@google.com</a>&gt; wrote:</div>
<br class="m_-8233522856382135630Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">
<div style="font-size:12.8px" class="">Hi,</div>
<div style="font-size:12.8px" class=""><br class="">
</div>
<div style="font-size:12.8px" class=""><span style="font-size:10pt;font-family:arial" class="">./configure --build x86_64-unknown-linux-gnu --host arm-linux-gnueabihf --disable-assertions --disable-check-asm --enable-intrinsics CFLAGS=-O3 --disable-shared</span><br class="">
</div>
<div style="font-size:12.8px" class=""><span style="font-size:10pt;font-family:arial" class=""><br class="">
</span></div>
<div style="font-size:12.8px" class="">When configuring with floating-point and intrinsics enabled as above, the generated <span style="font-size:12.8px" class="">config.h only has OPUS_ARM_MAY_HAVE_NEON_INTR defined (to 1), with</span></div>
<div style="font-size:12.8px" class="">
<div class="">
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_ASM */</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_INLINE_ASM */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_INLINE_EDSP */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_INLINE_MEDIA */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_INLINE_NEON */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_MAY_HAVE_EDSP */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_MAY_HAVE_MEDIA */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_MAY_HAVE_NEON */<br class="">
</div>
</div>
<div class="">
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_PRESUME_<wbr class="">AARCH64_NEON_INTR */</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_PRESUME_EDSP */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> <span class="m_-8233522856382135630gmail-il">OPUS_ARM_PRESUME_MEDIA</span> <wbr class="">*/<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_PRESUME_NEON */<br class="">
</div>
<div class="">/* #<span class="m_-8233522856382135630gmail-il">undef</span> OPUS_ARM_PRESUME_NEON_<wbr class="">INTR */<br class="">
</div>
</div>
</div>
<div style="font-size:12.8px" class=""><br class="">
</div>
<div style="font-size:12.8px" class=""><font class="">So MAY_HAVE_NEON will be defined to <span class="m_-8233522856382135630gmail-il">MEDIA</span> version, which will eventually fall down to C functions in the jump table:<br class="">
</font></div>
<div style="font-size:12.8px" class=""><font class="">#  define MAY_HAVE_NEON(name) MAY_HAVE_MEDIA(name)<br class="">
</font></div>
<div style="font-size:12.8px" class=""><font class=""><br class="">
</font></div>
<div style="font-size:12.8px" class=""><font class="">Therefore all NEON intrinsics optimizations in their jump tables won't get called for floating-point.</font></div>
<div style="font-size:12.8px" class=""><br class="">
</div>
<div style="font-size:12.8px" class="">Am I missing some options in my configure command, or the config is intend to do so in floating-point?</div>
<div style="font-size:12.8px" class=""><br class="">
</div>
<div style="font-size:12.8px" class="">Thanks,</div>
<div style="font-size:12.8px" class="">Linfeng</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
<div class="">The structure of this is pretty tangled and confusing, but what you’ll find is that the MAY_HAVE_NEON macro isn’t used in the jump tables for the two Neon intrinsics functions (silk_NSQ_noise_shape_<wbr class="">feedback_loop_neon and celt_pitch_xcorr_float_<wbr class="">neon)
 which are used in a floating-point neon build.  See silk/arm/arm_silk_map.c and celt/arm/arm_celt_map.c.</div>
<div class=""><br class="">
</div>
<div class="">So long as OPUS_ARM_MAY_HAVE_NEON_INTR and OPUS_HAVE_RTCD are set in config.h, it’ll pick up those functions, and check for them using RTCD.</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>