[opus] Opus performance on Cortex-M4
Basil Mohamed Gohar
basilgohar at librevideo.org
Tue Nov 4 19:25:02 PST 2014
On 11/03/2014 08:32 PM, Jean-Marc Valin wrote:
> Hi Andy,
> On 03/11/14 07:36 PM, Andy Isaacson wrote:
> > In some quick testing on Cortex-A8 (a very different core, but at least
> > ISA compatible and hopefully fairly similar to M4 for things like cycle
> > counts and code size) I saw promising results -- about 30 MHz of A8 CPU
> > was sufficient to encode an audio stream using the 1.1.1-beta fixed
> > point codec at 48 kHz mono, complexity=5, bitrate=20kbit/sec.
> First, I think the big difference between the M4 and the A8 is that A8
> has Neon, which Opus is able to use.
> > However now that we're doing a first implementation on M4, we're seeing
> > significantly higher cycle counts -- more in the range of 100 MHz of CPU
> > needed to encode with the same parameters. Additionally, compared to
> > 1.0.3, the code size and data size of the Opus codec in 1.1 has
> > increased significantly (which makes it a challenge to fit in the on-SoC
> > SRAM of the M4).
> I suspect most of the size increase you're seeing is from the new code
> in src/analysis.c which you do not need. In fact, if you're operating at
> 20 kb/s for speech, then you can entirely remove the CELT encoder from
> your build. You still need the decoder because there's no guarantee what
> the remote end will send you.
> > Obviously we need to use the ARM ASM that landed in -beta, and we can
> > decrease the complexity to somewhat reduce the CPU utilization, but I'm
> > wondering if I'm missing any other low-hanging fruit in optimizing Opus
> > for this target CPU. I haven't even started to do code profiling or CPU
> > performance counter analysis.
> There's a few things to check. First, make sure that
> OPUS_ARM_INLINE_EDSP (enabling DSP extensions) is defined in your
> config.h. Also, check for OPUS_ARM_ASM and OPUS_HAVE_RTCD. That means
> all the asm is enabled. At that point, the best is to run the profiles
> to see where the CPU time is spent.
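As a rough illustration of the config.h check Jean-Marc describes, here is a minimal sketch that reports which of the three macros he names (OPUS_ARM_INLINE_EDSP, OPUS_ARM_ASM, OPUS_HAVE_RTCD) are visible at compile time. The helper names opus_arm_opts_missing and opus_arm_opts_report are made up for this example and are not part of Opus. The HAVE_CONFIG_H guard is an assumption about how your build pulls in config.h. The sketch only tests whether each macro is defined, not what value it has.

```c
#include <stdio.h>

/* Assumption: the build defines HAVE_CONFIG_H and has config.h on the include path. */
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif

/* Map each macro named in the thread to 1 (defined) or 0 (not defined). */
#ifdef OPUS_ARM_INLINE_EDSP
#define CHECK_EDSP 1
#else
#define CHECK_EDSP 0
#endif

#ifdef OPUS_ARM_ASM
#define CHECK_ASM 1
#else
#define CHECK_ASM 0
#endif

#ifdef OPUS_HAVE_RTCD
#define CHECK_RTCD 1
#else
#define CHECK_RTCD 0
#endif

/* Hypothetical helper: how many of the three macros are NOT defined. */
int opus_arm_opts_missing(void)
{
    return (!CHECK_EDSP) + (!CHECK_ASM) + (!CHECK_RTCD);
}

/* Hypothetical helper: print one status line per macro. */
void opus_arm_opts_report(void)
{
    printf("OPUS_ARM_INLINE_EDSP: %s\n", CHECK_EDSP ? "defined" : "missing");
    printf("OPUS_ARM_ASM:         %s\n", CHECK_ASM ? "defined" : "missing");
    printf("OPUS_HAVE_RTCD:       %s\n", CHECK_RTCD ? "defined" : "missing");
}
```

Calling opus_arm_opts_report() once from startup code (or over a debug UART, if printf is retargeted) shows at a glance whether the build matches the advice above. A nonzero return from opus_arm_opts_missing() means at least one of the three is absent.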
Incidentally, I think this advice could form part of a larger "Opus for embedded systems" guide, one that defines common use cases and the implementation optimizations useful for each.
Andy, you might consider posting whatever you and/or your company are willing to share on the Xiph wiki, for example.