[Speex-dev] Major internal changes, TI DSP build change

Jean-Marc Valin Jean-Marc.Valin at USherbrooke.ca
Sat Apr 22 06:55:40 PDT 2006

> >I fixed it in svn. Could you check that?
> Now all platforms match again.  Note that the measured SNR for this test 
> sample is lower than with the broken code (10.87 vs 11.10), but of course 
> this is no way to judge the real quality.

SNR, especially on a single sample, can be very misleading. Yet, could
you just check that the DSP results match what you get on a PC?

> >Does the C55 have a 32x16 multiplier or do you mean it handles my
> >emulation of it well?
> It has two ALUs with 17x17 bit MACs, and it has an instruction that does 
> this:
> ACy = M40(rnd((ACx >> #16) + (uns(Xmem) * uns(Ymem))))
> I never quite understood this, so I went off and looked at the manuals.  It 
> can multiply the low half in one cycle, then shift and add it to the high 
> half in a second cycle.  And, in a tight loop the parallel ALUs would allow 
> one 32x16 multiply per cycle.
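
For illustration, the two-step sequence described above (multiply the low
half, then shift and add the high-half product) can be modelled in plain C
roughly as follows. This is only a sketch: the function name
mult16_32_two_step is hypothetical, the split at bit 16 is an assumption
taken from the ">> #16" in the quoted instruction, and the rounding and
40-bit saturation behaviour of the real instruction is not modelled. It
also assumes that >> on a negative value is an arithmetic shift, which the
C standard leaves implementation-defined.

```c
#include <stdint.h>

/* Hypothetical C model, not TI code: computes (a * b) >> 16 in two
 * multiply steps, assuming arithmetic right shift of negative values. */
static int32_t mult16_32_two_step(int16_t a, int32_t b)
{
    uint16_t b_lo = (uint16_t)(b & 0xffff); /* low half, treated as unsigned */
    int16_t  b_hi = (int16_t)(b >> 16);     /* high half, keeps the sign */

    /* Step 1: signed a times the unsigned low half. A 64-bit variable
     * stands in for an accumulator wider than 32 bits. */
    int64_t acc = (int64_t)a * (int64_t)b_lo;

    /* Step 2: shift the partial product down and add the high-half product. */
    acc = (acc >> 16) + (int64_t)a * (int64_t)b_hi;

    return (int32_t)acc;
}
```

Because the high-half product is a whole number of units after the shift,
the result matches a full-width (a * b) >> 16 under the same shift
assumption.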

Just one thing I'd like to understand. Did you do some tricks and/or
assembly to implement the MULT16_32_Q* routines with these instructions
or does the compiler figure them out by itself?

> The C54x cannot do this, and uses library calls for 32x16 multiplies.  

Why is that? By default all the 32x16 multiplies are computed using only
16x16 multiplies (see fixed_generic.h).
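
As a rough illustration of what "using only 16x16 multiplies" means, a
Q15 32x16 multiply can be built by splitting the 32-bit operand at bit 15
and combining two 16x16 products. The sketch below is written as
functions for readability and is not copied from fixed_generic.h; the
names mult16_16 and mult16_32_q15_sketch are hypothetical. It assumes the
32-bit operand is small enough that its upper part (b >> 15) fits in 16
bits, and that >> on a negative value is an arithmetic shift.

```c
#include <stdint.h>

/* 16x16 -> 32 signed multiply: the only multiply primitive assumed here. */
static int32_t mult16_16(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Sketch of (a * b) >> 15 using two 16x16 multiplies.
 * Assumption: b lies in [-2^30, 2^30) so that (b >> 15) fits in 16 bits. */
static int32_t mult16_32_q15_sketch(int16_t a, int32_t b)
{
    int16_t hi = (int16_t)(b >> 15);     /* upper part, keeps the sign */
    int16_t lo = (int16_t)(b & 0x7fff);  /* lower 15 bits, always >= 0 */

    /* a * b = a * hi * 2^15 + a * lo, so shifting right by 15 gives: */
    return mult16_16(a, hi) + (mult16_16(a, lo) >> 15);
}
```

For example, with a = 16384 (0.5 in Q15) and b = 100000, hi is 3 and lo is
1696, giving 49152 + 848 = 50000.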

> The 
> changes that you have made since 1.1.8 are most dramatic for the 54x, which 
> dropped from 184 (unusable in real time, the fastest parts are 160 MHz) to 
> 79 MIPS.  The C55x dropped from 41.5 to 29.4 MIPS (mixed 16/32 bit 
> capability), and the C6x dropped slightly from 36 to 34.5 MIPS (32bit 
> machine).

Glad it makes such a difference. I'm just surprised that the C6x
complexity is that high.

