[opus] Bug fix in celt_lpc.c and some xcorr_kernel optimizations

Aurélien Zanelli aurelien.zanelli at parrot.com
Tue Jun 11 00:31:31 PDT 2013


I compared the C version, John's two versions and my own (azanelli) version.

I encoded a 247-second music file at an average bitrate of 100 kbps 
on a Cortex-A8. Results (percentages are the reduction in encode time 
relative to the C version):
- With xcorr_kernel_c(): 26.45s to encode
- With xcorr_kernel_neon_john1(): 24.86s to encode (~6%)
- With xcorr_kernel_neon_john2(): 24.4s to encode (~7.5%)
- With xcorr_kernel_neon_azanelli(): 24.15s to encode (~8.7%)
These functions were inlined in pitch_xcorr(), celt_fir() and celt_iir().
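
For readers unfamiliar with the kernel being benchmarked, here is a minimal 
scalar sketch of a four-lag cross-correlation kernel. It is not the code from 
the Opus tree or from any of the patches discussed here: the name 
xcorr_kernel_sketch(), the float sample type and the size_t length are 
assumptions made for illustration. It also shows why the bounds of "y" matter: 
computing four lags over len samples touches y[0] through y[len + 2].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference kernel (illustration only, not the Opus source).
 * Accumulates four adjacent lags in one pass:
 *     sum[k] += x[j] * y[j + k]   for k = 0..3 and j = 0..len-1
 * It reads x[0..len-1] and y[0..len+2], so "y" must hold at least
 * len + 3 samples. Reading anything beyond y[len + 2] is out of bounds. */
static void xcorr_kernel_sketch(const float *x, const float *y,
                                float sum[4], size_t len)
{
    assert(x != NULL && y != NULL && sum != NULL);

    for (size_t j = 0; j < len; j++) {
        const float xj = x[j];   /* load x once, reuse it for all four lags */
        sum[0] += xj * y[j];
        sum[1] += xj * y[j + 1];
        sum[2] += xj * y[j + 2];
        sum[3] += xj * y[j + 3];
    }
}
```

A vectorized version typically loads several "y" samples per instruction, 
which is where an over-read past the last valid sample can creep in.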

Furthermore, the funny thing is that an indirect call to 
xcorr_kernel_neon_azanelli() is faster: 23.75s (~10%). However, I didn't 
test the other versions this way.
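
To make the "indirect call" case concrete, the sketch below reaches a kernel 
through a function pointer instead of having it inlined into the caller. All 
names here (xcorr_kernel_fn, xcorr_kernel_plain, first_lag_indirect) are 
hypothetical, and the kernel body is a plain scalar stand-in, not the NEON 
assembly that was measured. It only illustrates the call shape and makes no 
claim about why one form was faster on the Cortex-A8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function-pointer type for a four-lag kernel. */
typedef void (*xcorr_kernel_fn)(const float *x, const float *y,
                                float sum[4], size_t len);

/* Scalar stand-in for the kernel (illustration only). */
static void xcorr_kernel_plain(const float *x, const float *y,
                               float sum[4], size_t len)
{
    for (size_t j = 0; j < len; j++) {
        for (size_t k = 0; k < 4; k++) {
            sum[k] += x[j] * y[j + k];   /* reads y[0..len+2] */
        }
    }
}

/* Caller that invokes the kernel indirectly. Because the target is only
 * known through the pointer, the kernel stays a separate function with
 * its own call boundary rather than being merged into the caller. */
static float first_lag_indirect(xcorr_kernel_fn kernel,
                                const float *x, const float *y, size_t len)
{
    float sum[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    assert(kernel != NULL);
    kernel(x, y, sum, len);
    return sum[0];
}
```

A direct, inlined call would compute the same values. Only the way the kernel 
is reached differs.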

I also fixed my assembly version so that it no longer reads past the end 
of the "y" buffer, and fixed the register garbage that appeared when it 
was inlined.

Best regards,

P.S.: Because of a mistake on my part, some of my e-mails were not sent 
to this mailing list. I apologize for this.

Aurélien Zanelli
Parrot SA
174, quai de Jemmapes
75010 Paris
