<div>OK, I will look out for alloca().<br></div><div>Cheers</div><div class="gmail_quote"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">If you define _USE_SSE, CELT is already able to use SSE instructions. You must<br>
be careful on Windows, though: the compiler there doesn't support C99<br>
variable-length arrays, and alloca() isn't properly aligned for SSE, so you<br>
have to make sure that alloca() isn't used.<br>
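To illustrate the alignment concern above, here is a minimal sketch (not code from Speex or CELT) of one common workaround: over-allocate by 15 bytes and round the pointer up to a 16-byte boundary.<br>
The helper names align16 and alloc_aligned_floats are hypothetical, and heap allocation is assumed here in place of stack scratch space.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of Speex or CELT: round a pointer up to the
   next 16-byte boundary, which aligned SSE loads and stores require. */
static void *align16(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + 15u) & ~(uintptr_t)15u);
}

/* Hypothetical helper: allocate room for n floats plus 15 bytes of slack and
   return a 16-byte aligned pointer into it. The unaligned pointer is stored
   in *raw_out; pass that one (not the aligned one) to free(). */
static float *alloc_aligned_floats(size_t n, void **raw_out)
{
    void *raw = malloc(n * sizeof(float) + 15);
    float *aligned;

    *raw_out = raw;
    if (raw == NULL)
        return NULL;

    aligned = (float *)align16(raw);
    assert(((uintptr_t)aligned & 15u) == 0);
    return aligned;
}
```

A caller would hand the returned pointer to the SSE code and release the block with free() on the raw pointer once done.<br>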
<div class="im"><br>
> 2) To maybe learn from someone with previous experience in optimizing Speex<br>
> for modern x86 architectures before I set off trying all kinds of things on<br>
> my own.<br>
<br>
</div>Can't think of anything else you need to know.<br>
<font color="#888888"><br>
Jean-Marc<br>
</font><div><div class="h5"><br>
><br>
> See answers inline:<br>
> 2009/6/15 Tom Grandgent <<a href="mailto:tom@grandgent.com">tom@grandgent.com</a>><br>
><br>
> > Why haven't you tried using release build with compiler optimizations?<br>
><br>
> I just haven't started with optimizing... yet.<br>
><br>
><br>
> ><br>
> > It's quite possible that the performance picture could be substantially<br>
> > different. You might end up wasting a lot of time if you do much<br>
> > performance analysis or optimization on a debug build.<br>
><br>
> Yes, you are right and that is not what I'm doing.<br>
><br>
><br>
> > Debug build<br>
> > not only has no optimization - it also has extra checks that may have a<br>
> > significant performance impact depending on the code.<br>
><br>
> Possibly, but I have identified the most expensive functions. They are all<br>
> from the Speex dll. I believe these will remain the most interesting ones<br>
> also in release build with some O-flags.<br>
> Someone who is knowledgeable in these functions might know if their impact<br>
> can be reduced and what the best practices are.<br>
><br>
> ><br>
> ><br>
> > If you want to profile with symbols, you know you can compile a release<br>
> > build with symbols, right? The CodeAnalyst documentation describes<br>
> > how to do that with Visual Studio. (I've done it.)<br>
><br>
> Ok. thx.<br>
><br>
> ><br>
> ><br>
> > Tom<br>
> ><br>
> > Greger Burman <<a href="mailto:greger@mobile-robotics.com">greger@mobile-robotics.com</a>> wrote:<br>
> > ><br>
> > > I have a question about the overall performance of Speex and what I can<br>
> > > do to improve it. I'm running Speex on Windows x86, built with the Visual<br>
> > > C++ EE compiler. I will say right away that I've only compiled debug so<br>
> > > far and used no compiler optimizations at all.<br>
> > > I use the uwb-mode, preprocessing, denoising and echo cancellation.<br>
> > > I've noticed that Speex consumes a lot of CPU resources. When I run this<br>
> > > on a Celeron 2.6 GHz I have to disable EC in order not to overload the<br>
> > > CPU. Am I correct to assume that there are massive floating point<br>
> > > calculations happening?<br>
> > > I did a quick profile with CodeAnalyst and identified the most expensive<br>
> > > functions as (in order):<br>
> > > CPU Clocks, Function<br>
> > > 4657, kiss_fft_stride<br>
> > > 4456, speex_echo_cancellation<br>
> > > 2494, split_cb_search_shape_sign<br>
> > > 1490, fir_mem16<br>
> > > 1419, speex_preprocess_run<br>
> > > I'm looking for advice on how to boost the performance with as little<br>
> > > code rewrite as possible. The architecture for release build will be<br>
> > > SSE/SSE2 capable.<br>
> > > 1) Compiler optimizations: Recommended options?<br>
> > > 2) SIMD. Is Speex written to take advantage of SIMD architectures? What<br>
> > > must I do to take advantage of this?<br>
> > > --<br>
> > > Greger Burman<br>
> ><br>
> ><br>
><br>
<br></div></div></blockquote></div>