[opus] Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Aurélien Zanelli
aurelien.zanelli at parrot.com
Tue Jun 11 00:31:31 PDT 2013
Hi,
I compared the C version, John's versions and my own version (azanelli).
I encoded a 247-second music file at an average bitrate of 100 kbps
on a Cortex-A8. Results (encode time, with the reduction relative to the
C version in parentheses):
- With xcorr_kernel_c(): 26.45s to encode
- With xcorr_kernel_neon_john1(): 24.86s to encode (~6%)
- With xcorr_kernel_neon_john2(): 24.4s to encode (~7.5%)
- With xcorr_kernel_neon_azanelli(): 24.15s to encode (~8.7%)
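The percentages in this list are consistent with the reduction in encode time relative to the C kernel. A minimal sketch of that arithmetic, assuming this interpretation (time_saved_percent is a hypothetical helper, not something from the Opus tree):

```c
#include <assert.h>

/* Percentage of encode time saved relative to a baseline run.
 * Hypothetical helper for illustration only; not part of Opus. */
static double time_saved_percent(double baseline_s, double measured_s)
{
    assert(baseline_s > 0.0);
    return 100.0 * (baseline_s - measured_s) / baseline_s;
}
```

For example, time_saved_percent(26.45, 24.15) comes out at roughly 8.7.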
These functions were inlined in pitch_xcorr(), celt_fir() and celt_iir().
Oddly enough, an indirect (non-inlined) call to
xcorr_kernel_azanelli is even faster: 23.75s (~10%). However, I didn't
test the other versions that way.
I also fixed my assembly version so that it no longer reads past the end
of the "y" buffer, and fixed the register garbage that appeared when it
is inlined.
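To illustrate the buffer bound involved: a kernel that accumulates four lags at once only needs len + 3 samples of "y", so a vectorized loop that loads "y" in groups of four has to be careful in its last iteration. The following is a rough plain-C sketch of the access pattern, assuming a float build and a four-element accumulator. The name xcorr_kernel_sketch is hypothetical, and this is not the NEON assembly discussed in this thread.

```c
#include <assert.h>

/* Illustrative float version of a 4-lag cross-correlation kernel.
 * Accumulates sum[k] += x[j] * y[j + k] for k = 0..3 and j = 0..len-1.
 * The highest y index read is (len - 1) + 3, so y must hold len + 3
 * samples and nothing beyond that is touched. */
static void xcorr_kernel_sketch(const float *x, const float *y,
                                float sum[4], int len)
{
    int j;
    float y0, y1, y2;

    assert(len > 0);

    /* Sliding window of the y values shared by consecutive iterations. */
    y0 = y[0];
    y1 = y[1];
    y2 = y[2];

    for (j = 0; j < len; j++) {
        float y3 = y[j + 3];  /* last iteration reads y[len + 2] */

        sum[0] += x[j] * y0;
        sum[1] += x[j] * y1;
        sum[2] += x[j] * y2;
        sum[3] += x[j] * y3;

        /* Shift the window by one sample for the next x[j]. */
        y0 = y1;
        y1 = y2;
        y2 = y3;
    }
}
```

A SIMD version has to produce the same sums while respecting the same bound, for instance by handling the tail samples separately instead of issuing a full-width load at the end of "y".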
Best regards,
P.S.: I made a mistake, so some of my e-mails were not sent to this
mailing list. I apologize for this.
--
Aurélien Zanelli
Parrot SA
174, quai de Jemmapes
75010 Paris
France