[Speex-dev] cheb_poly_eva using Clenshaw's recurrence formula

Wed Nov 30 01:37:01 PST 2005

Hi,

A little bit more about this idea. I run speex on a Mips architecture.
It works fine but it uses about ~50% of the processor time, so with a
few friends we ran a profiling with gprof and we saw :

****************************
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name

/* abridged */

  2.07     11.56     0.30   275478     0.00     0.00  cheb_poly_eva
****************************

Decision was made to try to optimize each routine which needs more
than 2%, and so I had a look at cheb_poly_eva.

With my version, cheb_poly_eva takes about 1%, so it's not so bad,
especially for such architectures, which are not very suitable for
DSP.

Here are some outputs of an experiment made with the original function
and mine :

Input data : (taken form a real encoding phase)
****************************
float x[10]={2, 1.96, 1.98, 1.97, 1.965, 1.9625, 1.963750, 1.964375,
1.964688, 1.964844};

float coef[10][10]={
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 1.000000, 1.000000, 1.000000},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.980000, 0.920800, 0.824768},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.990000, 0.960200, 0.911196},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.985000, 0.940450, 0.867687},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.982500, 0.930613, 0.846154},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.981250, 0.925703, 0.835443},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.981875, 0.928157, 0.840794},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.982188, 0.929385, 0.843472},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.982344, 0.929999, 0.844813},
{2.000000, -3.050047, 3.537572, -3.632457, 3.250766, -1.758445,
1.000000, 0.982422, 0.930305, 0.845483}
};
****************************

Output data :

Original - New
*****************************
499.438202 - 499.438232
440.702759 - 440.702667
469.360992 - 469.360962
454.857544 - 454.857544
447.736938 - 447.737000
444.208984 - 444.209015
445.970245 - 445.970306
446.852966 - 446.852905
447.295319 - 447.295319
447.516083 - 447.516052
*****************************

As you see, there's a little difference in the function outputs, but
the final files are bit exact with the original encoded file (I don't
know really why). In some books, they say that Clenshaw scheme is more
accurate than the original iterated version, but I'm not sure it's
true... I'd like to ask a numerical analysis specialist...

Concerning the fixed point version, I'm working on it at the moment.
I'll post it when it'll work good.

Thanks a lot for considering this email so quickly !

Matthieu

PS: Björn, your handouts of the course "Mathematik für Informatiker"
on your web site are fine. In "Mathematik für Informatiker IV", your
professor pointed out the Clenshaw formula (pp. 12).

On 11/29/05, Jean-Marc Valin <Jean-Marc.Valin at usherbrooke.ca> wrote:
> > > I don't really know if it's a little improvement or not, so I'd like
> > > to have feedback from the Speex experts here.
>
> I'd have to test, but it looks right. I don't expect much improvement,
> but it makes the code simpler, especially for memory. Would also have to
> test on fixed-point and see if it works.
>
> > Maybe a new implementation should use the algorithm above plus the
> > explicit ALLOC which is used all over speex (why actually?).
>
> Actually, there would be no need for the ALLOC, which is a good thing.
> I'm using ALLOC only for dynamic allocation of temporary variables
> (stack) and because not all compilers support C99 arrays.
>
>         Jean-Marc
>