[opus] [Aarch64 00/11] Patches to enable Aarch64
John Ridges
jridges at masque.com
Thu Nov 19 14:47:38 PST 2015
Any speedup from the intrinsics may just be swamped by the rest of the
encode/decode process. But I think you really want SIG2WORD16 to be

    (vqmovns_s32(PSHR32((x), SIG_SHIFT)))

that is, a rounding shift by SIG_SHIFT followed by a saturating narrow to
16 bits.
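
To make the intent concrete, here is a minimal sketch of that definition next to a portable C fallback. It is not the code from the patch: the function names sig2word16 and sig2word16_c are hypothetical, and the SIG_SHIFT value and PSHR32 macro are assumptions modelled on the Opus fixed-point build rather than copied from it.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Assumption: SIG_SHIFT is 12, as in the Opus fixed-point build. */
#define SIG_SHIFT 12

/* Assumption: rounding right shift, i.e. add half the divisor, then shift.
   Relies on arithmetic right shift of negative values and does not guard
   against overflow for inputs close to INT32_MAX. */
#define PSHR32(a, shift) (((a) + ((1 << (shift)) >> 1)) >> (shift))

/* Portable reference: rounding shift, then clamp to the int16_t range. */
static int16_t sig2word16_c(int32_t x)
{
    x = PSHR32(x, SIG_SHIFT);
    if (x > 32767) x = 32767;
    if (x < -32768) x = -32768;
    return (int16_t)x;
}

/* On AArch64 the clamp and the narrowing are done by one scalar intrinsic,
   vqmovns_s32 (signed saturating extract narrow). */
static int16_t sig2word16(int32_t x)
{
#if defined(__aarch64__)
    return vqmovns_s32(PSHR32(x, SIG_SHIFT));
#else
    return sig2word16_c(x);
#endif
}
```

Both versions should agree on every input that does not overflow the rounding add, which makes the C version a convenient reference when testing the intrinsic path.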
On 11/19/2015 2:52 PM, Jonathan Lennox wrote:
>> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote:
>>
>> I haven’t yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That’s an obvious next step.
> This doesn’t show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64 instruction and a single Neon intrinsic) so my code paths may just not exercise them.
>
> Patches follow.
>
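
For the saturating add and subtract mentioned in the quoted message, the same pattern applies. The sketch below assumes the AArch64 scalar intrinsics vqadds_s32 and vqsubs_s32 are the ones meant; the names add_sat32 and sub_sat32 are hypothetical stand-ins for silk_ADD_SAT32 and silk_SUB_SAT32, not the definitions from the patch.

```c
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Hypothetical stand-in for silk_ADD_SAT32: a + b, clamped to int32_t. */
static int32_t add_sat32(int32_t a, int32_t b)
{
#if defined(__aarch64__)
    return vqadds_s32(a, b);            /* signed saturating add */
#else
    int64_t s = (int64_t)a + (int64_t)b; /* widen so the sum cannot overflow */
    if (s > INT32_MAX) s = INT32_MAX;
    if (s < INT32_MIN) s = INT32_MIN;
    return (int32_t)s;
#endif
}

/* Hypothetical stand-in for silk_SUB_SAT32: a - b, clamped to int32_t. */
static int32_t sub_sat32(int32_t a, int32_t b)
{
#if defined(__aarch64__)
    return vqsubs_s32(a, b);            /* signed saturating subtract */
#else
    int64_t d = (int64_t)a - (int64_t)b; /* widen so the difference cannot overflow */
    if (d > INT32_MAX) d = INT32_MAX;
    if (d < INT32_MIN) d = INT32_MIN;
    return (int32_t)d;
#endif
}
```

The fallback widens to 64 bits before clamping, so it avoids signed overflow and can serve as a reference for checking the intrinsic path.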