[opus] [PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
linfengz at google.com
Wed Mar 1 19:30:34 UTC 2017
> Do you think it would be possible to improve the API of xcorr_kernel() so
> that calling it in a loop is more efficient?
If it could be inlined, it would be more efficient. Besides the memory
bouncing, frequent function calls are expensive.
> The other advantage to wiring up xcorr_kernel() is that it applies in more
> places than your intrinsics-only celt_fir() implementation.
One solution is to move the outer for(N) loop inside xcorr_kernel() so that
it returns N results instead of 4 (similar to what the celt_fir() NEON
intrinsics do). That would make it both efficient and universal.
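
To make the idea concrete, here is a rough plain-C sketch of what such an N-output kernel could look like. It is only an illustration, not the actual Opus code: the name xcorr_kernel_n(), the float-only types and the argument order are assumptions made for this example, and a real version would use opus_val16/opus_val32 and the MAC macros, with a NEON variant behind it.

```c
#include <stddef.h>

/* Hypothetical N-output kernel (name and signature are assumptions).
 * For each i in [0, n): sum[i] += x[0..len-1] . y[i..i+len-1]
 * y must provide at least len + n - 1 readable samples. */
static void xcorr_kernel_n(const float *x, const float *y,
                           float *sum, int len, int n)
{
    int i, j;

    /* Main path: the outer loop now lives inside the kernel, and four
     * outputs are accumulated per pass so each x[j] is loaded once. */
    for (i = 0; i + 4 <= n; i += 4) {
        float s0 = sum[i], s1 = sum[i + 1];
        float s2 = sum[i + 2], s3 = sum[i + 3];
        for (j = 0; j < len; j++) {
            const float xj = x[j];
            s0 += xj * y[i + j];
            s1 += xj * y[i + j + 1];
            s2 += xj * y[i + j + 2];
            s3 += xj * y[i + j + 3];
        }
        sum[i] = s0;
        sum[i + 1] = s1;
        sum[i + 2] = s2;
        sum[i + 3] = s3;
    }

    /* Tail: the remaining 0..3 outputs, one at a time. */
    for (; i < n; i++) {
        float s = sum[i];
        for (j = 0; j < len; j++)
            s += x[j] * y[i + j];
        sum[i] = s;
    }
}
```

With this shape the caller makes one call per block of N outputs instead of N/4 calls, so the per-call overhead and the store/reload of the four partial sums between calls go away even when the function is not inlined.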