[opus] [PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Linfeng Zhang
linfengz at google.com
Wed Feb 15 21:05:30 UTC 2017
Hi Jean-Marc,
The original celt_fir() is a little bit messy. It has 2 branches chosen by
#ifdef SMALL_FOOTPRINT.
For floating-point, the 2 branches are identical (except for the order in
which x[i] is accumulated into the sum, which is not a big deal).
For fixed-point, the 2 branches are different. I separated them into 2
functions: the new celt_fir(), and celt_fir_permit_overflow(), which is the
SMALL_FOOTPRINT branch.
The only difference for fixed-point is the order of operations:
celt_fir(): the sum is truncated first, then added to x[i], and the result
is saturated.
celt_fir_permit_overflow(): x[i] is added to the sum first, and the result
is then truncated and saturated.
Maybe this is the reason why silk_LPC_analysis_filter() switched the FIR
from celt_fir() to celt_fir_permit_overflow() half a year ago.
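
As a rough illustration of the two orderings described above, here is a
minimal sketch. It is not the code from the patch: the helper names
(saturate16, shr32, combine_shift_then_add, combine_add_then_shift) are
hypothetical, the 12-bit shift is an assumed value, the filter sum is taken
as already computed, and a plain truncating shift is used because that is
how the text describes it.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SHIFT 12 /* assumed scaling shift, for illustration only */

/* Clamp a 32-bit value to the 16-bit signed range. */
static int16_t saturate16(int32_t v) {
    if (v > 32767) return 32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* Truncating right shift. Shifting a negative signed value is
 * implementation-defined in C; mainstream compilers shift arithmetically. */
static int32_t shr32(int32_t v, int shift) {
    assert(shift >= 0 && shift < 32);
    return v >> shift;
}

/* Ordering described for celt_fir(): truncate the sum first,
 * then add x and saturate. */
static int16_t combine_shift_then_add(int16_t x, int32_t sum) {
    return saturate16((int32_t)x + shr32(sum, SKETCH_SHIFT));
}

/* Ordering described for celt_fir_permit_overflow(): add x to the sum at
 * full precision first, then truncate and saturate. The addition is done
 * in unsigned arithmetic so that a 32-bit overflow wraps around instead of
 * being undefined behavior. */
static int16_t combine_add_then_shift(int16_t x, int32_t sum) {
    uint32_t acc = ((uint32_t)(int32_t)x << SKETCH_SHIFT) + (uint32_t)sum;
    return saturate16(shr32((int32_t)acc, SKETCH_SHIFT));
}
```

Under these assumptions the two functions agree for ordinary inputs and
only diverge when the full-precision addition wraps around 32 bits, for
example with x = 32767 and sum = 0x7FFFF000.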
Because of silk_LPC_analysis_filter(), celt_fir_permit_overflow() must
behave the same for both floating-point and fixed-point, and this is why we
defined ADD32_FIXED(), ..., PSHR32_FIXED() etc.
It's still messy.
For the NEON optimization part, the previous celt_fir() optimization calls
xcorr_kernel(). We tested and found that calling the xcorr_kernel()
optimization didn't help much here. With the optimization in this patch,
the whole encoder is about 1% faster than when simply calling
xcorr_kernel(). Considering how small the new optimization is, it seems
better not to call xcorr_kernel() and take the 1% gain.
Thanks,
Linfeng
On Wed, Feb 15, 2017 at 12:06 PM, Jean-Marc Valin <jmvalin at jmvalin.ca>
wrote:
> Hi Linfeng,
>
> Can you give me a bit more detail about the purpose of this patchset?
> It seems to me like it's mostly duplicating the celt_fir()
> optimizations? Did I miss anything?
>
> Cheers,
>
> Jean-Marc
>
> On 15/02/17 02:22 PM, Linfeng Zhang wrote:
> > Hi,
> >
> > Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter().
> > And Patch 2 optimizes the new function celt_fir_permit_overflow() for
> > ARM NEON.
> >
> > Please recommend a better function name.
> >
> > We have already done the same internal code review and testing.
> >
> > Thanks,
> > Linfeng