[opus] Alleged bug in Silk codec

Fri Jun 20 14:04:30 PDT 2014

Yes, those instructions exist, although they're a bit slower than the basic
16x16->32 with 32-bit accumulation (SMLABB).  So I'd be surprised if the
function with 64-bit accumulation would run as fast as the current code.
I don't know how much we care about 16-bit platforms.  And accuracy should
not matter.
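
To make the comparison concrete, here is a rough sketch of the two
accumulation styles in plain C.  The function names are hypothetical and
this is not the actual SILK code; which instructions a compiler really
emits (SMLABB, SMLALBB, ...) depends on the target and flags, so treat the
comments as an assumption rather than a guarantee.

```c
#include <stdint.h>

/* Hypothetical helper: 16x16->32 multiply with 32-bit accumulation.
 * On ARMv5E and later this is the pattern SMLABB is meant for.
 * The caller must make sure the running sum fits in 32 bits. */
static int32_t inner_prod16_acc32(const int16_t *a, const int16_t *b, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}

/* Hypothetical helper: same products, but with a 64-bit accumulator.
 * On 32-bit ARM a compiler may be able to use SMLALBB (or SMLALD on
 * ARMv6) here; on targets without such instructions it falls back to
 * a multiply plus a two-word add. */
static int64_t inner_prod16_acc64(const int16_t *a, const int16_t *b, int n)
{
    int64_t sum = 0;
    for (int i = 0; i < n; i++) {
        /* Each 16x16 product fits in 32 bits; only the sum is widened. */
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

With full-scale inputs, three products of 32767*32767 already exceed the
int32 range, so the 32-bit version relies on the caller to bound the sum,
while the 64-bit version does not need that care.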

On the other hand, a 64-bit implementation is much cleaner/shorter, which
is always a good argument :-)
All in all, no strong preference from my side.
koen.

On Fri, Jun 20, 2014 at 12:40 PM, Timothy B. Terriberry <tterribe at xiph.org>
wrote:

> KOEN VOS wrote:
>
>> current implementation is faster on a 32 or 16 bit processor.  If you
>> would find the opposite to be true (ie that a 64 bit implementation is
>> faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
>>
>
> Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit
> accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a 64-bit
> accumulator (SMLALBB). I would think a 64-bit version could be made pretty
> fast on 32-bit ARM, without even resorting to SIMD.
>