[opus] [PATCH v1] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics

Timothy B. Terriberry tterribe at xiph.org
Fri Dec 19 20:49:12 PST 2014


Viswanath Puttagunta wrote:
> After analyzing celt_pitch_xcorr_arm.s, the closest I came using intrinsics
> is below code.. which didn't really put much dent in the performance..
> so I just left it out since above code submitted is much simpler to read
> than below celt_pitch_xcorr_arm.s.. So, I request to leave it simple to
> read for now.

I agree. I went ahead and tested your previous patch on a Cortex-A8 with 
a while loop in place of the first three cases in the switch statement, 
and got:

~/bin/bench.sh ./opus_demo audio 48000 2 96000 ../../comp48-stereo.sw /dev/null

Original patch:
61613 ms (stddev 108) [61420 61434 {61449 61564 61643 61695 61713} 61762 61971]

While loop:
61426 ms (stddev 52) [61307 61342 {61377 61379 61412 61477 61484} 61516 61580]

So, not much of a difference (about 0.3%), but a consistent improvement 
from using the while loop.
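
For anyone reading this without the patch at hand, here is a minimal scalar
sketch of the kind of change being compared. It is not the code from the
patch: the function names (dot4, dot_switch_tail, dot_while_tail) and the
block size of 4 are assumptions made up for illustration, and plain C
stands in for the NEON intrinsics. The only point is the shape of the
leftover handling after the main loop: a fall-through switch versus a
while loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar stand-in for a vector kernel: 4 products per call. */
static float dot4(const float *x, const float *y)
{
    return x[0]*y[0] + x[1]*y[1] + x[2]*y[2] + x[3]*y[3];
}

/* Variant A (illustrative): the 0..3 leftover samples are handled by a
   switch statement whose cases fall through to one another. */
float dot_switch_tail(const float *x, const float *y, int len)
{
    float sum = 0;
    int i = 0;
    assert(len >= 0);
    for (; i + 4 <= len; i += 4)
        sum += dot4(x + i, y + i);
    switch (len - i) {
    case 3: sum += x[i + 2]*y[i + 2]; /* fall through */
    case 2: sum += x[i + 1]*y[i + 1]; /* fall through */
    case 1: sum += x[i]*y[i];         /* fall through */
    default: break;
    }
    return sum;
}

/* Variant B (illustrative): the same leftovers handled by a while loop. */
float dot_while_tail(const float *x, const float *y, int len)
{
    float sum = 0;
    int i = 0;
    assert(len >= 0);
    for (; i + 4 <= len; i += 4)
        sum += dot4(x + i, y + i);
    while (i < len) {
        sum += x[i]*y[i];
        i++;
    }
    return sum;
}
```

Note that the two variants add the leftover products in a different order,
so with arbitrary floating-point input the results may differ in the last
bits. With small integer-valued inputs they agree exactly.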

