[Speex-dev] Resampler saturation, blackfin performance
stephane.lesage at ateis-international.com
Sun Jun 14 14:31:01 PDT 2009
> -----Original Message-----
> From: Jean-Marc Valin [mailto:jean-marc.valin at usherbrooke.ca]
> Sent: Sunday, 14 June 2009 20:46
> To: Stephane Lesage
> Cc: speex-dev at xiph.org
> Subject: Re: [Speex-dev] Resampler saturation
> Just to make sure I understand, the two patches you sent are
> two different ways to fix the problem, with the only
> difference being that resample.patch converts the "unrolled
> by four" loop into a plain one that's easier on DSPs, right?
Yes exactly, plus a little explanation in comments.
I really have no idea what the performance difference is on x86, but I think gcc/msvc can unroll the loop themselves.
Up to you. Either way, I can define OVERRIDE_INNER_PRODUCT_SINGLE and use my own version.
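
To make the two loop shapes concrete, here is a minimal sketch. It is an illustration only, not the code from either patch: the function names are hypothetical, and it uses plain float arithmetic instead of the resampler's fixed-point macros. The unrolled variant assumes the length is a multiple of four.

```c
/* Hypothetical names; illustration only, not taken from resample.c. */

/* Plain loop: one multiply-accumulate per iteration, which a DSP
 * compiler can usually map onto a hardware loop with a MAC inside. */
static float inner_product_plain(const float *a, const float *b,
                                 unsigned int len)
{
    float sum = 0.0f;
    unsigned int i;
    for (i = 0; i < len; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Unrolled by four: four multiply-accumulates per iteration.
 * Assumes len is a multiple of 4. */
static float inner_product_unrolled4(const float *a, const float *b,
                                     unsigned int len)
{
    float sum = 0.0f;
    unsigned int i;
    for (i = 0; i < len; i += 4) {
        sum += a[i]     * b[i];
        sum += a[i + 1] * b[i + 1];
        sum += a[i + 2] * b[i + 2];
        sum += a[i + 3] * b[i + 3];
    }
    return sum;
}
```

Both functions compute the same sum for lengths that are multiples of four; only the loop structure the compiler sees is different.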
Talking about performance (still using the generic version with the VDSP compiler):
1. I got a pretty good boost by using a scratch buffer in SRAM.
2. Wideband Encode+Decode takes 79.1 + 7.2 MIPS on my BF536 at 400/133 MHz.
3. Profiler says:
4. I'm using the echo-canceller + preprocessor,
and I'd really like to improve performance here:
- I would like to use ADI's FFT, but it's limited to power-of-two sizes.
  Is it safe to enable "Round ps_size down to the nearest power of two" in the preprocessor?
  Can we do the same trick with the echo-canceller for window_size?
- Are there buffers that could be placed in scratch memory?
  (I don't see any speex_scratch_alloc in there.)
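
For clarity, the rounding I have in mind amounts to something like the sketch below. The helper name is hypothetical and this is not copied from the preprocessor source; it only shows the arithmetic needed to get a size that a power-of-two-only FFT would accept.

```c
/* Hypothetical helper: largest power of two that is <= n.
 * Returns 0 when n is 0. */
static unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;
    if (n == 0)
        return 0;
    /* Comparing against n / 2 keeps p * 2 from overflowing. */
    while (p <= n / 2)
        p *= 2;
    return p;
}
```

With this, a size of 320 would be rounded down to 256, and a size that is already a power of two is left unchanged.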