[Speex-dev] cpu utilization across speex versions

Tom Grandgent tgrand at canvaslink.com
Sun May 29 09:36:53 PDT 2005


Kemal,

It sounds like you are doing something wrong.  I strongly recommend that you 
profile your application to see exactly where the CPU time is being spent.  
AMD happens to have a nice profiler called CodeAnalyst that they give away 
for free.  And it's plenty usable on Intel CPUs as well.

http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_3604,00.html

Make sure you test a release build of your app with symbols (the CodeAnalyst 
help will tell you how to do this if you're not sure), and make sure you're 
talking into the mic while profiling because Speex uses significantly more 
CPU when speech is present.

I've been profiling my application all weekend, although not for Speex, 
but I can share some results just testing the Speex part of it.  My system 
is a 2.8GHz P4 with HT enabled.  I use floating point Speex, no SSE, 
compiled with VC++ 6.0.  The following tests were done with wideband (16kHz), 
VBR quality 6, complexity 0 (doesn't seem to affect quality noticeably), 
encoding and decoding one stream in real-time.  When I used the preprocessor 
I used the noise filter and AGC only.  I tested with Speex 1.0.1 and 1.0.7.  
The overall #'s represent CPU usage for my entire process.  I estimate that 
at most about 0.02% of that is not due to Speex.  And no, I wasn't trying 
to optimize this 0.02%! :)  Anyway, here are the results with just the top 
functions listed.  Note that Speex distributes its CPU load across many 
functions...

Speex 1.0.1:

With preprocessor: (overall 2.51%)
0.43% split_cb_search_shape_sign
0.26% speex_preprocess
0.26% fir_mem_up
0.23% filter_mem2
0.23% vq_nbest
0.18% open_loop_nbest_pitch
0.12% qmf_decomp

No preprocessor: (overall 2.12%)
0.43% split_cb_search_shape_sign
0.23% fir_mem_up
0.22% filter_mem2
0.21% vq_nbest
0.18% open_loop_nbest_pitch
0.11% qmf_decomp

Speex 1.0.7:

With preprocessor: (overall 2.33%)
0.28% speex_preprocess
0.25% vq_nbest
0.23% split_cb_search_shape_sign_N1
0.21% fir_mem_up
0.18% open_loop_nbest_pitch
0.18% filter_mem2
0.12% split_cb_search_shape_sign
0.11% qmf_decomp

No preprocessor: (overall 1.95%)
0.26% vq_nbest
0.23% split_cb_search_shape_sign_N1
0.21% fir_mem_up
0.20% open_loop_nbest_pitch
0.16% filter_mem2
0.13% split_cb_search_shape_sign
0.12% qmf_decomp

No preprocessor, complexity 6: (overall 3.52%)
1.08% vq_nbest
0.63% split_cb_search_shape_sign
0.24% fir_mem_up
0.20% split_cb_search_shape_sign_N1
0.20% filter_mem2
0.19% pitch_search_3tap
0.19% open_loop_nbest_pitch
0.13% qmf_decomp

As you can see, there was an improvement for me going from 1.0.1 to 1.0.7.  
And the complexity setting makes a difference in CPU usage, but no noticeable 
difference in quality or bandwidth consumption.
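
For reference, the relative changes implied by the overall figures can be 
worked out with a few lines of Python.  This is only a sketch over the numbers 
listed above; the variable names and the rel_change helper are made up for 
illustration and are not part of Speex or CodeAnalyst.

```python
# Overall process CPU figures (percent), copied from the profiles above.
V101_PREPROC = 2.51   # Speex 1.0.1, with preprocessor, complexity 0
V101_PLAIN = 2.12     # Speex 1.0.1, no preprocessor, complexity 0
V107_PREPROC = 2.33   # Speex 1.0.7, with preprocessor, complexity 0
V107_PLAIN = 1.95     # Speex 1.0.7, no preprocessor, complexity 0
V107_PLAIN_C6 = 3.52  # Speex 1.0.7, no preprocessor, complexity 6


def rel_change(old, new):
    """Relative change from old to new in percent (negative means less CPU)."""
    return (new - old) / old * 100.0


# Effect of upgrading 1.0.1 -> 1.0.7, with and without the preprocessor.
upgrade_preproc = rel_change(V101_PREPROC, V107_PREPROC)
upgrade_plain = rel_change(V101_PLAIN, V107_PLAIN)

# Effect of raising complexity from 0 to 6 on 1.0.7 (no preprocessor).
complexity_cost = rel_change(V107_PLAIN, V107_PLAIN_C6)

# Absolute cost of the preprocessor in percentage points of CPU.
preproc_points_101 = V101_PREPROC - V101_PLAIN
preproc_points_107 = V107_PREPROC - V107_PLAIN

print(f"1.0.1 -> 1.0.7 with preprocessor: {upgrade_preproc:+.1f}%")
print(f"1.0.1 -> 1.0.7 no preprocessor:   {upgrade_plain:+.1f}%")
print(f"complexity 0 -> 6 on 1.0.7:       {complexity_cost:+.1f}%")
print(f"preprocessor cost: {preproc_points_101:.2f} / {preproc_points_107:.2f} points")
```

On these figures 1.0.7 uses roughly 7-8% less CPU than 1.0.1, complexity 6 
costs roughly 80% more than complexity 0, and the preprocessor adds a little 
under 0.4 percentage points in both versions.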

Other than profiling your app, I can suggest using WB instead of UWB.  Is 
the 32kHz really buying you anything over 16kHz?  For speech it seems like 
a waste of time to me.  Note: I would never consider using NB (8kHz) for 
this type of application.  Also, although I have not profiled it, I have 
done single-encode multiple-decode sessions with up to 7 people and CPU 
usage has never been a problem.  I've never had a need for multiple 
encode, but given these results I don't think it should be that bad.
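
To put a rough number on the multiple-encode case, here is a back-of-envelope 
sketch.  It assumes CPU cost grows linearly with the number of encode/decode 
pairs, which I have not measured, and it reuses my overall per-process figures 
for one stream on the 2.8GHz P4 (wideband, 1.0.7, no preprocessor).  The 
function names and the 50% budget are hypothetical, and the figures will not 
carry over directly to a slower machine or to UWB at quality 8.

```python
import math

# Measured overall CPU (percent) for one encode + one decode in real time.
PER_STREAM_C0 = 1.95  # complexity 0
PER_STREAM_C6 = 3.52  # complexity 6


def estimated_cpu(per_stream_pct, n_streams):
    """Total CPU percent, ASSUMING cost scales linearly with stream count."""
    return per_stream_pct * n_streams


def max_streams(per_stream_pct, budget_pct):
    """Largest stream count whose linear estimate stays within budget_pct."""
    return math.floor(budget_pct / per_stream_pct)


# Hypothetical budget: leave half the CPU free for everything else.
BUDGET_PCT = 50.0

for n in (1, 3, 7):
    c0 = estimated_cpu(PER_STREAM_C0, n)
    c6 = estimated_cpu(PER_STREAM_C6, n)
    print(f"{n} stream(s): ~{c0:.2f}% at complexity 0, ~{c6:.2f}% at complexity 6")

print("streams within budget at complexity 0:", max_streams(PER_STREAM_C0, BUDGET_PCT))
print("streams within budget at complexity 6:", max_streams(PER_STREAM_C6, BUDGET_PCT))
```

Treat the output as an order-of-magnitude guide only; profiling the real 
application is still the way to find out where the time goes.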

Hope this helps,

Tom

Kemal Oral CANSIZLAR <cansizla at usc.edu> wrote:
> 
> 
> Hi all,
> 
> We are using the Speex codec for our P2P audio conferencing application. Thus, a node may serve multiple neighbors, requiring multiple encodes/decodes at a time.
> 
> I have observed that Speex consumes a lot of CPU power; one machine is a Pentium 3 800MHz, 256MB RAM with XP. When this PC serves the other two faster PCs, CPU utilization soars to maximum. I have tried various versions of Speex (1.0.1, 1.0.4, 1.0.5) and, interestingly enough, I found that 1.0.1 did a better job, keeping CPU utilization at acceptable levels, unlike 1.0.4 and 1.0.5. Is this behaviour normal? Moreover, does anyone have any figures for minimum system requirements for Speex coding in real-time for multiple parties? Or might I be doing something wrong/inefficient?
> 
> By the way, the audio settings are: 32kHz ultra-wideband mode, 16 bits, mono, Speex quality 8, Speex encoder complexity 4 (almost the same results with complexity 2 or 1).
> 
> Any pointers greatly appreciated, thank you.
> 
> -Kemal.


