[opus] Opus floating-point NEON jump table question
Linfeng Zhang
linfengz at google.com
Fri Jun 2 01:16:33 UTC 2017
Thanks, Jonathan!
I'll fix the MAY_HAVE_NEON() usage in silk/arm/arm_silk_map.c.
Linfeng
On Thu, Jun 1, 2017 at 3:34 PM, Jonathan Lennox <jonathan at vidyo.com> wrote:
> Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler
> supports, and the CPU may support, Neon assembly code, which isn’t
> necessarily the same thing as the compiler supporting Neon intrinsics.
> (The Visual Studio ARM compiler, for instance, supports intrinsics but not
> assembly.) So I don’t think this patch is the right solution.
>
> Instead, I think the problem is actually that silk/arm/arm_silk_map.c uses
> the MAY_HAVE_NEON macro, which it shouldn’t be using. If that file were
> changed so that the jump tables just listed the _neon versions of the
> functions directly, you’d get the speedup you’re looking for.
>
>
> On Jun 1, 2017, at 6:03 PM, Linfeng Zhang <linfengz at google.com> wrote:
>
> Thanks, Jean-Marc and Jonathan!
>
> I tested the current Opus encoder in floating-point at complexity 8. Hacking
> it with the attached patch (which generates "#define
> OPUS_ARM_MAY_HAVE_NEON 1" in config.h) speeds it up by about 14.7% on my
> Chromebook, probably because many of the NEON intrinsics optimizations
> benefit both the fixed-point and the floating-point encoder.
>
> So if it's safe enough to enable MAY_HAVE_NEON in floating-point builds by
> default, that could speed up the floating-point NEON encoder a bit.
>
> Thanks,
> Linfeng
>
> On Thu, Jun 1, 2017 at 2:22 PM, Jonathan Lennox <jonathan at vidyo.com>
> wrote:
>
>>
>> On May 31, 2017, at 12:47 PM, Linfeng Zhang <linfengz at google.com> wrote:
>>
>> Hi,
>>
>> ./configure --build x86_64-unknown-linux-gnu --host arm-linux-gnueabihf
>> --disable-assertions --disable-check-asm --enable-intrinsics CFLAGS=-O3
>> --disable-shared
>>
>> When configuring with floating-point and intrinsics enabled as above, the
>> generated config.h only has OPUS_ARM_MAY_HAVE_NEON_INTR defined (to 1),
>> with
>> /* #undef OPUS_ARM_ASM */
>> /* #undef OPUS_ARM_INLINE_ASM */
>> /* #undef OPUS_ARM_INLINE_EDSP */
>> /* #undef OPUS_ARM_INLINE_MEDIA */
>> /* #undef OPUS_ARM_INLINE_NEON */
>> /* #undef OPUS_ARM_MAY_HAVE_EDSP */
>> /* #undef OPUS_ARM_MAY_HAVE_MEDIA */
>> /* #undef OPUS_ARM_MAY_HAVE_NEON */
>> /* #undef OPUS_ARM_PRESUME_AARCH64_NEON_INTR */
>> /* #undef OPUS_ARM_PRESUME_EDSP */
>> /* #undef OPUS_ARM_PRESUME_MEDIA */
>> /* #undef OPUS_ARM_PRESUME_NEON */
>> /* #undef OPUS_ARM_PRESUME_NEON_INTR */
>>
>> So MAY_HAVE_NEON will be defined as the MEDIA version, which will eventually
>> fall back to the C functions in the jump table:
>> # define MAY_HAVE_NEON(name) MAY_HAVE_MEDIA(name)
>>
>> Therefore none of the NEON intrinsics optimizations listed in those jump
>> tables will get called in floating-point builds.
>>
>> Am I missing some options in my configure command, or is the config
>> intended to behave this way in floating-point?
>>
>> Thanks,
>> Linfeng
>>
>>
>> The structure of this is pretty tangled and confusing, but what you’ll
>> find is that the MAY_HAVE_NEON macro isn’t used in the jump tables for the
>> two Neon intrinsics functions (silk_NSQ_noise_shape_feedback_loop_neon
>> and celt_pitch_xcorr_float_neon) which are used in a floating-point neon
>> build. See silk/arm/arm_silk_map.c and celt/arm/arm_celt_map.c.
>>
>> So long as OPUS_ARM_MAY_HAVE_NEON_INTR and OPUS_HAVE_RTCD are set in
>> config.h, it’ll pick up those functions, and check for them using RTCD.
>>
>
>
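To make the macro fall-through discussed in this thread concrete, here is a minimal, host-independent sketch. It is not the Opus source: only the "MAY_HAVE_NEON(name) MAY_HAVE_MEDIA(name)" line is quoted in the thread, and the rest of the fallback chain, the demo_kernel_* functions, the ARCH_* indices, and the two table names are hypothetical stand-ins chosen for illustration. The first table goes through MAY_HAVE_NEON() and ends up with the C version in its NEON slot; the second lists the _neon function directly, guarded by the intrinsics flag, along the lines Jonathan suggests.

```c
/* Mirrors the config.h from the thread: only the intrinsics flag is set.
 * OPUS_ARM_MAY_HAVE_NEON, _MEDIA and _EDSP are deliberately left undefined. */
#define OPUS_ARM_MAY_HAVE_NEON_INTR 1

/* Assumed shape of the fallback chain (a simplification for this sketch).
 * Only the NEON -> MEDIA line is quoted in the thread. */
#if defined(OPUS_ARM_MAY_HAVE_EDSP)
# define MAY_HAVE_EDSP(name) name ## _edsp
#else
# define MAY_HAVE_EDSP(name) name ## _c
#endif
#if defined(OPUS_ARM_MAY_HAVE_MEDIA)
# define MAY_HAVE_MEDIA(name) name ## _media
#else
# define MAY_HAVE_MEDIA(name) MAY_HAVE_EDSP(name)
#endif
#if defined(OPUS_ARM_MAY_HAVE_NEON)
# define MAY_HAVE_NEON(name) name ## _neon
#else
# define MAY_HAVE_NEON(name) MAY_HAVE_MEDIA(name)
#endif

/* Hypothetical kernels. They only report which variant was selected,
 * so the sketch builds on any host, not just ARM. */
enum { IMPL_C, IMPL_NEON };
static int demo_kernel_c(void)    { return IMPL_C; }
static int demo_kernel_neon(void) { return IMPL_NEON; }

typedef int (*demo_kernel_fn)(void);

/* Hypothetical arch indices, one jump-table slot per CPU feature level. */
enum { ARCH_V4, ARCH_EDSP, ARCH_MEDIA, ARCH_NEON, ARCH_COUNT };

/* Table built through the macros: with the config above, every slot,
 * including the NEON one, expands to demo_kernel_c. */
static const demo_kernel_fn table_via_macro[ARCH_COUNT] = {
    demo_kernel_c,
    MAY_HAVE_EDSP(demo_kernel),
    MAY_HAVE_MEDIA(demo_kernel),
    MAY_HAVE_NEON(demo_kernel)
};

/* Table that names the _neon version directly, guarded by the
 * intrinsics flag instead of OPUS_ARM_MAY_HAVE_NEON. */
static const demo_kernel_fn table_direct[ARCH_COUNT] = {
    demo_kernel_c,
    demo_kernel_c,
    demo_kernel_c,
#if defined(OPUS_ARM_MAY_HAVE_NEON_INTR)
    demo_kernel_neon
#else
    demo_kernel_c
#endif
};

/* Run-time dispatch: index the table by the detected arch level. */
int demo_lookup_via_macro(int arch) { return table_via_macro[arch](); }
int demo_lookup_direct(int arch)    { return table_direct[arch](); }
```

With this configuration, demo_lookup_via_macro(ARCH_NEON) returns IMPL_C, while demo_lookup_direct(ARCH_NEON) returns IMPL_NEON. In a real build the arch index would come from run-time CPU detection (RTCD) rather than being passed in by hand.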