[opus] Alleged bug in Silk codec

KOEN VOS koenvos74 at gmail.com
Fri Jun 20 14:04:30 PDT 2014

Yes, those instructions exist, although they're a bit slower than the basic
16x16->32 multiply with 32-bit accumulation (SMLABB).  So I'd be surprised
if the function with 64-bit accumulation ran as fast as the current code.
I don't know how much we care about 16-bit platforms.  And accuracy should
not matter.

On the other hand, a 64-bit implementation is much cleaner/shorter, which
is always a good argument :-)
All in all, no strong preference from my side.
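
To make the trade-off concrete, here is a minimal sketch of the two
accumulation styles in plain C. The function names are hypothetical and
this is not the actual Silk code. Whether a compiler really emits SMLABB
for the first loop and SMLALBB (or SMLALD) for the second depends on the
target and flags, so treat the instruction mapping as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16x16->32 multiply with a 32-bit accumulator.
 * The caller must guarantee that the running sum fits in 32 bits
 * (for example by pre-scaling the inputs); otherwise it overflows. */
int32_t inner_prod_acc32(const int16_t *a, const int16_t *b, size_t len)
{
    int32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* Each product fits in 32 bits; only the sum can overflow. */
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}

/* Hypothetical helper: same 16x16->32 multiply, 64-bit accumulator.
 * No pre-scaling is needed for any realistic vector length. */
int64_t inner_prod_acc64(const int16_t *a, const int16_t *b, size_t len)
{
    int64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        /* Widen the 32-bit product before adding it to the 64-bit sum. */
        sum += (int64_t)((int32_t)a[i] * (int32_t)b[i]);
    }
    return sum;
}
```

The 64-bit version is shorter to reason about because it needs no scaling
logic, which is the "cleaner" argument above. The 32-bit version does less
work per sample on a 32-bit or 16-bit core, which is the speed argument.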

On Fri, Jun 20, 2014 at 12:40 PM, Timothy B. Terriberry <tterribe at xiph.org> wrote:

> KOEN VOS wrote:
>> current implementation is faster on a 32 or 16 bit processor.  If you
>> would find the opposite to be true (ie that a 64 bit implementation is
>> faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
> Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit
> accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a 64-bit
> accumulator (SMLALBB). I would think a 64-bit version could be made pretty
> fast on 32-bit ARM, without even resorting to SIMD.