Jean-Marc Valin Jean-Marc.Valin at USherbrooke.ca
Thu Jan 19 15:48:29 PST 2006

> So, I made the following changes:
> In every loop I removed %= (I didn't find such a token in the Vis.DSP help),
> and the loops now compile.

Let me know if you think of something that would work in both gcc and

> With parallel issues I found several interesting things:
> 1)
> "A0 -= R1.L*R0.H (IS), A1 -= R1.L*R0.L (M,IS) || R0 = [%1++];\n\t"
> //doesn't compile
> It can be compiled in that way:
> "A1 -= R1.L*R0.L (M), A0 -= R1.L*R0.H (IS) || R0 = [%1++];\n\t"
> or
> "A1 -= R1.L*R0.L, A0 -= R1.L*R0.H (IS) || R0 = [%1++];\n\t"
> //So (M,IS) doesn't work, and for Vis.DSP the A1 operation must
> //come first. I think that without (M,IS) the result would be wrong, so I
> //decided to split it:
> "A0 -= R1.L*R0.H (IS);"
> "A1 -= R1.L*R0.L (M,IS) || R0 = [%1++];\n\t"

Strange that it's not allowed yet gcc doesn't complain. I guess it
wouldn't be the first time gnu as silently generates bad code. Still, I
haven't seen this to cause bad results. Where's that from?

> 2)//about Acc operations
> "A0 = A1 = 0;"//doesn't compile
> "A1 = A0 = 0;"//that's ok

I can fix that. Where?

> By the way, can you explain to me why the next line is wrong:
> "R1.L = R4.L*R2.L (IS) || R0 <<= 1;"
> //This is OK for the 32||16||16
> //model, but it doesn't compile

Strange. Still no clue why gnu as is OK and not VDSP. Where's that?

> 3) About Pregs (I wrote about pointer registers in my previous letter): I think
> the VDSP compiler uses various Pregs in C/C++ routines, so when a big inline
> asm routine that needs a lot of Pregs comes along, the pointer registers are
> already holding other values. To prevent data corruption, compilation fails.
> I think so because it happens only with big inline routines that use a lot of
> Pregs (5 of 6).

OK, I can probably fix that to reduce the requirement.

> How does performance change when you switch from floating point to
> fixed point using gcc? And to Blackfin asm?

float->fixed : at least 10x faster (but probably more like 100x)
fixed->asm: Almost 2x faster with gcc, but probably less difference with
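
The float->fixed gap comes from the Blackfin having no hardware floating-point unit, so every float operation is emulated in software, while a fixed-point multiply is plain integer arithmetic. A minimal sketch of a Q15 multiply in C, to show the idea; the helper names are hypothetical and this is not the macro set Speex itself uses:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a float in [-1, 1) to Q15
   (1 sign bit, 15 fractional bits). */
int16_t float_to_q15(float x)
{
    assert(x >= -1.0f && x < 1.0f);
    return (int16_t)(x * 32768.0f);
}

/* Hypothetical helper: multiply two Q15 values with saturation.
   The 32-bit product is in Q30; shifting right by 15 returns to Q15.
   Right-shifting a negative value is implementation-defined in C,
   but is an arithmetic shift on the usual compilers. */
int16_t mult_q15(int16_t a, int16_t b)
{
    int32_t p = ((int32_t)a * (int32_t)b) >> 15;
    if (p > 32767)      /* only reachable for (-1.0) * (-1.0) */
        p = 32767;
    return (int16_t)p;
}
```

For example, 0.5 is 16384 in Q15, and mult_q15(16384, 16384) returns 8192, which is 0.25.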

> On a BF537 EZ-KIT Lite (600 MHz) I am compressing sound with Speex
> (8 kHz, 16 bit, one channel).
> Compressing 1 second of sound takes:
> 1) floating point: 8 seconds
> 2) fixed point: about 4 seconds.

Definitely not normal. Using quality 8 (15 kbps) at 8 kHz and complexity
2 (default) takes about 15% of the CPU on a 500 MHz Blackfin 533 (STAMP).
What are you testing with? testenc?
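
To compare the two figures, both can be put on the same scale as a real-time factor (CPU seconds per second of audio). A minimal sketch in C; the function names are illustrative and not part of Speex, and the clock scaling assumes the cycle count is the same on both parts, which ignores memory and cache differences:

```c
#include <assert.h>

/* Real-time factor: seconds of CPU time spent per second of audio.
   1.0 means the encoder just keeps up; 0.15 means 15% CPU load. */
double realtime_factor(double encode_seconds, double audio_seconds)
{
    assert(audio_seconds > 0.0);
    return encode_seconds / audio_seconds;
}

/* Rescale a load measured at one clock rate to another clock rate,
   assuming the number of cycles needed does not change. */
double scale_load(double load, double from_mhz, double to_mhz)
{
    assert(to_mhz > 0.0);
    return load * from_mhz / to_mhz;
}
```

With the numbers in this thread, realtime_factor(4.0, 1.0) gives 4.0, while scale_load(0.15, 500.0, 600.0) gives 0.125, so the reported fixed-point time is roughly 30 times slower than expected under that assumption.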

> Do you intend to make any additions or changes to the Speex library
> for Blackfin?

What do you mean?


