[Speex-dev] Minor fixed point scaling problem

Thu Jan 25 15:50:13 PST 2007

First, let me say that I think the speex code is incredible in the way it
supports floating and fixed point code from one set of code.  The same is
true for supporting multiple processors, etc...  I've used speex with the
PC, TI 64xx and 55xx.  Please view the following comments not as an attack
on speex (which I think is incredible!) but as my contribution to an open
source project.

I know this is a minor point, but speex/AEC fixed point could use a few
improvements shifting 2's complement values.

I realize these things are "almost" correct and I doubt if fixing these
problems will be audible.  But doing it correctly does help consistency when
the code is optimized and it makes it easier to understand why limits are
enforced within the code.

I have made these changes (and more) to my code, but I still don't
understand the process of getting these changes into and released from the
speex SVN.  So here are a few changes, I think should be made to the code.

I found these problems when dropping in the _norm() instruction available on
the TI DSP families. _norm() returns the number or redundant bits which lets
you figure out how many bits to shift.  You don't need to worry about the
sign with this function and the BlackFin has a corresponding instruction.
Using this instruction saved me about 10% of my total encoder instructions.

It would probably make sense to have a NORM macro that maps to _norm() on TI
platforms or whatever the processor specific instruction is available.

In filters.c normalize16() searches for the largest value using the
following loop:

   for (i=0;i<len;i++)
   {
      spx_sig_t tmp = x[i];
      if (tmp<0)
         tmp = NEG32(tmp);
      if (tmp >= max_val)
         max_val = tmp;
   }
It should really use tmp=NEG32(tmp+1), since the range of negative fixed
point has one more number for a given number of bits. (and that's what all
the shifting is about)  This same code fragment is in normalize16() and
compute_rms() and a couple of other spots.  I think it should be
encapsulated into a norm_shift() function that can be overridden for
specific processors.

Similarly, the AEC code uses FLOAT_ADD, _SUB, _MULT routines in
pseudofloat.h could be improved by shifting negative mantissa's >= -16384.

   if (r.m>0)
   {
      if (r.m<16384)
      {
         r.m<<=1;
         r.e-=1;
      }
   }else 
   {
      if (r.m>=-16384)
      {
         r.m<<=1;
         r.e-=1;
      }
   }
   return(r);

Here's my specific recommendation on norm_shift.
/* Copyright (C) 2006, 2007 Applied Signal Processing, Inc. */
/**
   @author Jerry J. Trantow
   @file norm_ti64xx.h
   @brief Added norm_shift function and optimized for C64xx.
*/
/*
   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

   - Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.

   - Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

   - Neither the name of the Xiph.org Foundation nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR
   CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
   EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
   PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
   PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
   LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
   NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
   SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#include <std.h>
#include <stdio.h>

#if (1)
    #define OVERRIDE_NORM_SHIFT
    #if (0)
        /*
            Two competing optimization canidates.
        */
	        #warn Using the TI optimized norm_shift() function using
_norm intrinsic.
        /*
            norm_shift() is used to find the shift required to scale a
vector of 32 bit 
            values to 16 bits.
            This optimized verison uses the TI NORM instruction rather then 
            max(abs(x)).  Other DSPs have corresponding instructions.
(BlackFin has SIGNBITS.)
        */    
        static int norm_shift(const spx_sig_t *x, spx_sig_t max_scale, int
len)
        {
            int sig_shift_ti=32;
	        int i;

	        #warn Using the optimized norm_shift() function.
            /*
    	        TI Specific.
            */
            #if (0)
                assert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.
                assert(0 == (len&0x03));         // Can help compiler if
length is multiple of 4.
                _nassert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.
                _nassert(0 == (len&0x03));         // Length must be
multiple of 4.
                #pragma MUST_ITERATE(24,184,4)
            #endif
            /*
                Directly find the min(_norm(x[i]) rather than searching for
max(abs(x[i])) and taking _norm.
            */
            for (i=0;i<len;i++)
            {
                sig_shift_ti=min(sig_shift_ti,_norm(x[i]));	// Largest
abs(values) will have smallest norm shift.
            }
            sig_shift_ti=max(0,_norm(max_scale-1)-sig_shift_ti);
            /*
                Return the shift value.
            */
            return(sig_shift_ti);
        }	//	norm_shift().	
    #else
        /*
	        Use the DSP NORM instruction and parallel min()
instructions.
        */
        static int norm_shift(const spx_sig_t *x, spx_sig_t max_scale, int
len)
        {
	        int n0; // 32 bit container for finding 4 parallel 8 bit
minimums.
	        int n1; // 32 bit container for finding 4 parallel 8 bit
minimums.
	        char c0;
            int *xptr;
            int i;

	        #warn Using the TI optimized norm_shift() function using
_norm and _minu4 intrinsics.
            /*
    	        TI Specific.
            */
            xptr=(int *)&x[0];
            n0=0x7F7F7F7F;  // Initialize to 4 max unsigned 8 bit values.
            #if (1)
                /*
                    These restrictions should apply which will let the
compiler
                    generate faster code.
                */
#if (0)                
                assert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.                
                assert(0 == (len&0x03));         // Can help compiler if
length is multiple of 4.
#endif
                _nassert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.
                _nassert(0 == (len&0x03));         // Length must be
multiple of 4.
                #pragma MUST_ITERATE(24,184,4)
            #else
                /*
                    Initialize the first set of values.
                    If length isn't a factor of 4 put a few elements in the
first value.
                */
                for (i=0;i<(len%4);i++)
                {
                    n0=(n0<<8)|_norm(*xptr++);
                }
            #endif
            /*
                Loop comparing min of 4 norms at a time.
            */
            for (i=0;i<(len/4);i++)
            {
                n1 =(_norm(*xptr++));
                n1|=(_norm(*xptr++)<<8);
                n1|=(_norm(*xptr++)<<16);
                n1|=(_norm(*xptr++)<<24);
                n0=_minu4(n0,n1);
            }
            /*
                Find the min norm of the 4 eight bit values in n0.
            */    
            c0=min((char)n0,(char)(n0>>8));
            c0=min(c0,(char)(n0>>16));
            c0=min(c0,(char)(n0>>24));
            /*
                Scale relative to norm of max_scale-1.
            */
            c0=max(0,(char)_norm(max_scale-1)-c0);
            /*
                Return the shift value.
            */
            return((int)c0);
        }	//	norm_shift().	
    #endif
#endif

#if (1)
	#define OVERRIDE_NORMALIZE16
	#warn Using optimized normalize16().
	/*
		Returns the number of shifts required to scale the 32 bit
signal down to max_scale.
	*/
	static int normalize16(const spx_sig_t *x, spx_word16_t *y,
spx_sig_t max_scale, int len)
	{
        int sig_shift_ti=32;
		int i;

#if (0)
		assert(16384 == max_scale);      // Only used for this
value.
		assert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.
		assert(0 == (len&0x03));         // Length must be multiple
of 4.
//		assert(0 == x&0x03);
//		assert(0 == y&0x03);
#endif
		_nassert(16384 == max_scale);
		_nassert((len>=24)&&(len<=184));   // Check MUST_ITERATE()
range.
		_nassert(0 == (len&0x03));         // Length must be
multiple of 4.
//		_nassert(0 == x&0x03);
//		_nassert(0 == y&0x03);

		sig_shift_ti=norm_shift(x, max_scale,len);
        /*
	        Normalize the y[i] values.
        */
        #pragma MUST_ITERATE(24,184,4)
        for (i=0;i<len;i++)
        {
            y[i] = (spx_word16_t)(x[i]>>sig_shift_ti);
        }
        /*
            Return the shift value.
        */
        return(sig_shift_ti);
	}	//	normalize16().
#endif

-------------- next part --------------
A non-text attachment was scrubbed...
Name: winmail.dat
Type: application/ms-tnef
Size: 10488 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20070125/47a62de7/winmail.bin