[Speex-dev] Resampler saturation, blackfin performance

Stephane Lesage stephane.lesage at ateis-international.com
Sun Jun 14 14:31:01 PDT 2009


> -----Original Message-----
> From: Jean-Marc Valin [mailto:jean-marc.valin at usherbrooke.ca] 
> Sent: Sunday, June 14, 2009 20:46
> To: Stephane Lesage
> Cc: speex-dev at xiph.org
> Subject: Re: [Speex-dev] Resampler saturation
> Just to make sure I understand, the two patches you sent are 
> two different ways to fix the problem, with the only 
> difference being that resample.patch converts the "unrolled 
> by four" loop into a plain one that's easier on DSPs, right?

Yes, exactly, plus a short explanation in the comments.
I really have no idea what the performance difference is on x86, but I think gcc/msvc can unroll the loop themselves.
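
To illustrate the difference being discussed, here is a minimal sketch in C of a multiply-accumulate loop written both ways. It is not the code from resample.patch: the function names dot_plain and dot_unrolled4, the 16-bit sample type and the 32-bit accumulator are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: plain loop, one multiply-accumulate per iteration. */
int32_t dot_plain(const int16_t *x, const int16_t *y, int len)
{
    int32_t sum = 0;
    int i;
    for (i = 0; i < len; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}

/* Hypothetical helper: the same sum, manually unrolled by four.
 * Assumes len is a multiple of 4 (checked by the assert). */
int32_t dot_unrolled4(const int16_t *x, const int16_t *y, int len)
{
    int32_t sum = 0;
    int i;
    assert(len % 4 == 0);
    for (i = 0; i < len; i += 4) {
        sum += (int32_t)x[i]     * y[i];
        sum += (int32_t)x[i + 1] * y[i + 1];
        sum += (int32_t)x[i + 2] * y[i + 2];
        sum += (int32_t)x[i + 3] * y[i + 3];
    }
    return sum;
}
```

Both functions return the same sum for the same input. The plain form leaves the compiler free to choose its own unrolling strategy for the target.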

Talking about performance (still using the generic version with the VDSP compiler):
1. I got a pretty good boost by using a scratch buffer in SRAM.
2. Wideband encode + decode takes 79.1 + 7.2 MIPS on my BF536 at 400/133 MHz.
3. The profiler says:
vq_nbest                  33.05%
vq_nbest_sign             11.12%
filter_mem16               4.14%
inner_prod                 4.07%
iir_mem16                  2.75%
qmf_synth                  2.32%
lsp_to_lpc                 2.32%
open_loop_nbest_pitch      1.41%
compute_impulse_response   1.37%
qmf_decomp                 1.28%
lpc_to_lsp                 1.26%
fir_mem16                  1.16%
speex_bits_pack            1.07%
speex_bits_unpack_unsigned 0.86%
compute_rms16              0.79%

4. I'm using the echo canceller + preprocessor, and
I'd really like to improve performance here:
- I would like to use ADI's FFT, but it's limited to powers of 2.
Is it safe to enable "Round ps_size down to the nearest power of two" in the preproc?
Can we do the same trick with window_size in the echo canceller?
- Are there buffers that could be placed in scratch memory?
(I don't see any speex_scratch_alloc in there.)
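
For reference, rounding a size down to the nearest power of two can be sketched as below. This is only an illustration of the arithmetic: the function name round_down_pow2 is hypothetical and is not taken from the Speex sources, and the sketch says nothing about whether the preprocessor or the echo canceller behaves correctly with the rounded size.

```c
#include <assert.h>

/* Hypothetical helper: largest power of two that is <= n, for n >= 1. */
int round_down_pow2(int n)
{
    int p = 1;
    assert(n >= 1);
    /* Double p only while the doubled value still fits within n;
     * comparing against n / 2 avoids overflowing p. */
    while (p <= n / 2)
        p *= 2;
    return p;
}
```

With this helper, a size of 160 becomes 128 and a size of 320 becomes 256, while a size that is already a power of two is returned unchanged.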

Stéphane LESAGE
ATEIS International
