[Speex-dev] Blackfin inline assembler and VisualDSP++ toolchain
Jean-Marc Valin
jean-marc.valin at usherbrooke.ca
Thu Jun 14 14:16:37 PDT 2007
Michael Shatz wrote:
>> Actually, you're the first I know using the VisualDSP++ toolchain
>> :-)
>
> I guess that's because speex has a pretty big memory footprint.
Actually, you'll find that the data footprint in the latest versions is
pretty small. There's a bit more code/tables, but you'll find that many
can go away if you're not actually using them.
> So
> developers that integrate speex tend to have plenty of RAM, and once
> one has plenty of RAM one can install a biggish OS. And among
> biggish OSes for Blackfin the most popular choice is uClinux. And
> uClinux works best with GNU tools. Something like that. On the other
> hand, developers that use Blackfin in a manner similar to the traditional
> 16-bit DSP usage model, i.e. without external RAM or with relatively
> small internal SRAM, normally use no OS at all (like me) or ADI's VDK.
> These people naturally prefer the ADI toolchain because it gives you good
> visibility of what's going on within a small "bare metal" target. But
> such developers are less likely to integrate speex because it simply
> doesn't fit.
What do they use? I don't think Speex is really much more expensive than
other codecs when you compare apples to apples (e.g. if you compare with
g.729, then first disable anything that isn't used by the 8 kbps mode).
> I guess I am one of the few that try to run speex entirely from
> internal RAM, and that fact has already forced me to move from BF531 to
> BF533.
That's an interesting exercise indeed.
> Just about everything fails. Some things fail during compilation, the
> rest during the final assembly pass. Thinking about it, the problem is
> probably not in the asm syntax, but in the way the compiler
> treats the asm keyword. For example, for the following function:
> -----
> static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
> {
>    spx_word32_t res;
>    __asm__ (
>       "%1 = %1.L (X);\n\t"
>       "%2 = %2.L (X);\n\t"
>       "%0 = MAX(%1,%2);"
>       : "=d" (res)
>       : "%d" (a), "d" (b)
>    );
>    return res;
> }
> ---
> Compiler says:
> "libspeex\fixed_bfin.h", line 48: cc1101: error: invalid constraint in asm statement
> : "%d" (a), "d" (b)
> ^
>
> Following modification successfully passed:
> : "d" (a), "d" (b)
>
> Similarly, the compiler doesn't understand the following line:
> : "=m" (res)
> It claims that m is not a valid constraint. Looking into the manual (including the GNU manual), I agree with the compiler.
BTW, gcc accepts these constraints fine. It's been too long so I don't
quite remember how all of that worked, though (IIRC, the % marks an
operand as commutative with the one that follows it). What happens if
you make all the changes to make it compile? Does it run fine? I don't
have VisualDSP++, so it's hard to help with exact constraints.
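One way to answer the "does it run fine" question is to keep a plain C version of MAX16 next to the asm one and compare the two over a few edge cases. The following is only a sketch under stated assumptions: `USE_BFIN_GCC_ASM`, `max16_ref` and `max16_selfcheck` are hypothetical names that do not come from Speex, the typedefs assume 16-bit and 32-bit fixed-point types, and the asm branch simply repeats the function quoted above with the plain "d" constraints that were reported to compile. It has not been tried with VisualDSP++.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t spx_word16_t;  /* assumed widths for a fixed-point build */
typedef int32_t spx_word32_t;

/* Plain C reference; used to validate whichever MAX16 is compiled in. */
static inline spx_word16_t max16_ref(spx_word16_t a, spx_word16_t b)
{
    return (a > b) ? a : b;
}

#if defined(USE_BFIN_GCC_ASM)  /* hypothetical switch, not a real Speex macro */
static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
{
    spx_word32_t res;
    /* Asm body as quoted in the thread, with "d" instead of "%d". */
    __asm__ (
        "%1 = %1.L (X);\n\t"
        "%2 = %2.L (X);\n\t"
        "%0 = MAX(%1,%2);"
        : "=d" (res)
        : "d" (a), "d" (b)
    );
    return (spx_word16_t)res;
}
#else
/* Fallback when the asm path is disabled or the toolchain rejects it. */
static inline spx_word16_t MAX16(spx_word16_t a, spx_word16_t b)
{
    return max16_ref(a, b);
}
#endif

/* Compare MAX16 against the reference over a few boundary values. */
static void max16_selfcheck(void)
{
    static const spx_word16_t v[] = { INT16_MIN, -2, -1, 0, 1, 2, INT16_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(MAX16(v[i], v[j]) == max16_ref(v[i], v[j]));
}
```

Calling `max16_selfcheck()` once at startup on the target would show whether a changed constraint still produces the same results as the C version.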
>> BTW, when you say it's slow, can you be more precise? What
>> performance do you expect and what do you get? Using gcc, I think
>> David got it down to ~20 MIPS at 15 kbps, so I assume VisualDSP++
>> should be able to do better than that.
>
> I talked to David. He got 22 MIPS _with_ inline asm. I am getting
> around 34 MIPS for exactly the same mode (15kbps, complexity=1,
> vbr=off) without inline asm. Don't know the scores for gcc, so can't
> tell whether ADI compiler is better. It's surely not better than your
> assembler.
IIRC, gcc alone (no asm) was using something on the order of 100 MIPS
(back when it couldn't do hardware loops, MACs, cond. moves, ...), so as
you can see, there's a fair bit of difference. So yes, with assembly
working, VDSP++ should be able to achieve better than 20 MIPS.
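To put the figures above on a common footing, a measured cycle count per frame can be turned into MIPS as sketched below. This is an illustration rather than anything from Speex: the helper names are hypothetical, one cycle is treated as one instruction, and the 20 ms frame length used in the usage note is an assumption for the narrowband mode discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Convert cycles spent per frame into MIPS, treating one cycle as one
 * instruction (an assumption; the real ratio depends on the core). */
static double mips_from_cycles(uint64_t cycles_per_frame, double frame_ms)
{
    assert(frame_ms > 0.0);
    double frames_per_second = 1000.0 / frame_ms;
    return (double)cycles_per_frame * frames_per_second / 1e6;
}

/* Relative cost of one build versus another, e.g. 34 MIPS vs 22 MIPS. */
static double mips_ratio(double mips_a, double mips_b)
{
    assert(mips_b > 0.0);
    return mips_a / mips_b;
}
```

For example, 440000 cycles per 20 ms frame works out to 22 MIPS, and with the numbers quoted in this thread the build without inline asm (34 MIPS) costs roughly 1.5 times as much as the gcc build with asm (22 MIPS).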
Jean-Marc