# [Speex-dev] Speex inner_prod(), normalize, C64 MIPS

Jerry Trantow jtrantow at ieee.org
Sat Feb 4 09:38:06 PST 2006

```
Ok, I hadn't verified that inner_prod() is only called with values scaled to
within +-16384.  That makes it safe to do a 32-bit add of the intermediate
terms.  I have implemented the 40-bit accumulator.

> by the shift.  I also see a FIXED_POINT danger with the summation of four
> mults overflowing the 32 bit before the shift.
>
> I can fix this by accumulating each term into a long, but if the code scales
> the x[],y[] vectors to avoid this problem I could use parallel 16x16

What do you mean here?

The C64x has a _dotp2() instruction that does two 16x16 multiplies and adds
the products together.  Since the values are scaled to 16384, I can add the
results of the two _dotp2()s together before the long add without worrying
about overflow.  I didn't understand that inner_prod() was always passed
scaled vectors.  That's the danger of optimizing routines without knowing
how they are called.

I split a norm_shift() out of your normalize16().  This function can also be
used twice in pitch_gain_search_3tap().  Are there any other places that
would benefit from this optimized routine?

/*
   Returns the number of shifts needed to normalize a 32-bit vector to
   [-16384,+16384).
*/
static inline int norm_shift(const spx_sig_t *x, spx_sig_t max_scale, int len)
{
   int sig_shift_ti = 31;   /* start at the largest value _norm() can return */
   int i;

#warn Using the optimized normalize16() function.
   /*
      Directly find min(_norm(x[i])) rather than searching for
      max(abs(x[i])) and taking _norm() of it.
   */
#pragma MUST_ITERATE(24,184,4)
   for (i=0;i<len;i++)
   {
      sig_shift_ti=min(sig_shift_ti,_norm(x[i]));
   }
   sig_shift_ti=max(0,_norm(max_scale-1)-sig_shift_ti);
   /*
      Return the shift value.
   */
   return(sig_shift_ti);
}	//	norm_shift().

PS.  Here are the C64x MIPS vs Complexity results for the original code.  I
have been able to reduce the complexity 1 encoder to 15.7 MIPS.

Encoder
Complexity   Original 32   Original 16
1            31.2          29.6
2            41.7          39.8
3            51.4          49.0
4            61.6
5
6
7                          93.1
8
9                          120.8

Jerry J. Trantow
Applied Signal Processing, Inc.
jtrantow at ieee.org

```
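The overflow argument and the norm_shift() idea from the message can be tried out on a desktop compiler with the minimal sketch below. It is not the code from the post: `dotp2_emul()` and `norm_emul()` are hypothetical portable stand-ins for the C64x `_dotp2()` and `_norm()` intrinsics (the real `_dotp2()` takes packed 16-bit pairs), `int64_t` stands in for the 40-bit `long` accumulator, and any per-term shift the real inner_prod() applies is left out. The names `inner_prod_sketch()` and `norm_shift_sketch()` are likewise assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for _dotp2(): two 16x16 multiplies, products added.
   Operands are passed unpacked here; the real intrinsic takes packed pairs. */
static int32_t dotp2_emul(int16_t a0, int16_t a1, int16_t b0, int16_t b1)
{
    return (int32_t)a0 * b0 + (int32_t)a1 * b1;
}

/* Hypothetical stand-in for _norm(): number of redundant sign bits, 0..31. */
static int norm_emul(int32_t v)
{
    /* Fold negative values so only leading zeros need to be counted. */
    uint32_t u = (v < 0) ? ~(uint32_t)v : (uint32_t)v;
    int n = 0;
    while (n < 31 && !(u & (0x40000000u >> n)))
        n++;
    return n;
}

/* Inner product over inputs assumed to lie within +-16384.
   Each product is at most 2^28, so four of them sum to at most 2^30 and
   fit in 32 bits; only the running total needs the wide accumulator. */
static int64_t inner_prod_sketch(const int16_t *x, const int16_t *y, int len)
{
    int64_t acc = 0;   /* stand-in for the 40-bit accumulator */
    int i;
    assert(len % 4 == 0);
    for (i = 0; i < len; i += 4) {
        int32_t part = dotp2_emul(x[i], x[i + 1], y[i], y[i + 1])
                     + dotp2_emul(x[i + 2], x[i + 3], y[i + 2], y[i + 3]);
        acc += part;
    }
    return acc;
}

/* Right-shift count that brings every x[i] into [-max_scale, max_scale),
   found from the minimum number of redundant sign bits in the vector. */
static int norm_shift_sketch(const int32_t *x, int32_t max_scale, int len)
{
    int min_norm = 31;   /* largest value norm_emul() can return */
    int shift;
    int i;
    for (i = 0; i < len; i++) {
        int n = norm_emul(x[i]);
        if (n < min_norm)
            min_norm = n;
    }
    shift = norm_emul(max_scale - 1) - min_norm;
    return shift > 0 ? shift : 0;
}
```

With eight inputs all equal to 16384, each group of four products sums to 2^30, which still fits in 32 bits, while the total of 2^31 only fits in the wide accumulator. That is the case the 40-bit accumulator covers.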