[opus] Alleged bug in Silk codec
koenvos74 at gmail.com
Fri Jun 20 14:04:30 PDT 2014
Yes, those instructions exist, although they're a bit slower than the basic
16x16->32 multiply with 32-bit accumulation (SMLABB). So I'd be surprised if
a function with 64-bit accumulation ran as fast as the current code.
Don't know how much we care about 16-bit platforms. And accuracy should
On the other hand, a 64-bit implementation is much cleaner/shorter, which
is always a good argument :-)
All in all, no strong preference from my side.
On Fri, Jun 20, 2014 at 12:40 PM, Timothy B. Terriberry <tterribe at xiph.org> wrote:
> KOEN VOS wrote:
>> current implementation is faster on a 32 or 16 bit processor. If you
>> would find the opposite to be true (ie that a 64 bit implementation is
>> faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
> Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit
> accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a 64-bit
> accumulator (SMLALBB). I would think a 64-bit version could be made pretty
> fast on 32-bit ARM, without even resorting to SIMD.
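
For reference, the two accumulation styles discussed above can be sketched
in portable C roughly as follows. This is only an illustration: the function
names are hypothetical and are not taken from the Silk sources, and whether a
given compiler actually maps the loops to SMLABB, SMLALBB or SMLALD is an
assumption that depends on the target and optimization flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: 16x16->32 multiply with a 32-bit accumulator
 * (the SMLABB-style loop). The accumulator is unsigned so that overflow
 * wraps in a well-defined way; the final conversion back to int32_t is
 * implementation-defined in C, and wraps on common two's-complement
 * compilers. */
static int32_t inner_prod_acc32(const int16_t *x, const int16_t *y, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Cast before multiplying so the product is 32 bits wide even
         * where int is only 16 bits. */
        acc += (uint32_t)((int32_t)x[i] * (int32_t)y[i]);
    }
    return (int32_t)acc;
}

/* Hypothetical sketch: the same 16x16->32 multiply, but with a 64-bit
 * accumulator (the SMLALBB/SMLALD-style loop). The sum cannot overflow
 * for any realistic vector length. */
static int64_t inner_prod_acc64(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += (int32_t)x[i] * (int32_t)y[i];
    }
    return acc;
}
```

The 64-bit version is shorter to reason about because it needs no scaling or
overflow handling, which is the "cleaner" argument made in the reply; the
32-bit version is the one that may be cheaper per sample on 32-bit and 16-bit
targets.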