[opus] Patch cleaning up Opus x86 intrinsics configury

Viswanath Puttagunta viswanath.puttagunta at linaro.org
Sat Mar 7 10:30:09 PST 2015


Hello Jonathan,

Just FYI, I started doing review of your patch and will get back to
you in few days. After review, I would like to rebase your patch (as
necessary) myself and do some testing.. and re-submit.

Regards,
Vish

On 4 March 2015 at 09:00, Viswanath Puttagunta
<viswanath.puttagunta at linaro.org> wrote:
>
> On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at vidyo.com> wrote:
> >
> > On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta
> > <viswanath.puttagunta at linaro.org> wrote:
> >
> >
> >
> > On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote:
> >>
> >> Viswenath,
> >>
> >> My patch should be against the tip, but it’s the very recent tip,
> >> including some changes this past Friday (27 Feb).  I mentioned in the IRC
> >> room a problem I discovered in creating my patch, and then later improved
> >> the fix Tim had made for the problem.  Where do you get conflicts merging it
> >> to tip?
> >>
> >> In terms of merging, you posted your patch before I posted mine, so
> >> probably I should be the one on the hook to rebase after your fix goes in.
> >> Looking over your patch quickly, I don’t think any of my changes should be
> >> that difficult to merge with yours.
> >
> >>> Yeah.. trivial merge issues.. shouldn't take too long to resolve.
> >>
> >>
> >> I haven’t studied your patch in depth yet.  Does Ne10 use its own RTCD
> >> code, or do you use Opus’s?
> >
> >>> Not using Ne10's RTCD feature at the moment.. using Opus's rtcd. libopus
> >>> RTCD for linux/armv7 will not work for aarch64. It needs to be updated (or
> >>> just disabled if system built for aarch64 since Neon support is
> >>> mandatory)... It is in my todo list..
> >
> >
> > My patch should handle aarch64 properly.  One of the things it does is it
> > discovers (in configure) whether a little test program using Neon intrinsics
> > can be compiled without -mfpu=neon — if so, it disables RTCD for Neon, and
> > sets a #define OPUS_ARM_PRESUME_NEON_INTR.  (This is analogous to the
> > existing code for determining whether Neon assembly instructions can be
> > assembled by default.)
> >
> > This thus handles not only aarch64, but also iOS armv7.
> This sounds good to me. I wanted to do more thorough code review for
> your patch, but am waiting until it applies cleanly on tip.. But now
> that we agree that your patch should be rebased on top of my patch,
> I'll either wait for Timothy/opus-dev to take in my patch, and your
> patch to be rebased? And if this is taking too long for my patch to go
> in (Due to opus-dev maintainers time availability), I'll rebase
> (retaining your authorship ofcourse) your patch on my submitted
> patch... and put in my future work on top of your patch.
>
> But I really hope we can get some cycles from Timothy/Jean to get the
> patches into opus. It seems they are really busy for last month or so.
>
> >
> > I also have a number of aarch64 patches pending — mine are entirely for
> > fixed-point though.  Are yours all floating-point?
> Yes, so far they are floating point. Can you post your fixed-point
> optimizations? Also, are any of your patches using NE10 library?
>
> So far, what I submitted is to optimize opus_fft using NE10 (float
> only). I also modified mdct_forward to take advantage of this
> optimization.
>
> My next action plan:
> - use opus_ifft to optimize mdct_backwards (float via Ne10)
> - Re-work/add opus_fft and opus_ifft for fixed (again via Ne10)
> - Enable for Aarch64 and test (I'm hoping your patch already does this)
> - float-to-int and int-to-float conversions using neon
> - etc.
>
>
> >
> >> On Mar 3, 2015, at 5:18 PM, Viswanath Puttagunta
> >> <viswanath.puttagunta at linaro.org> wrote:
> >>
> >> > Hello Jonathan,
> >> >
> >> > I am unable to apply your patch cleanly on tip.
> >> >
> >> > Timothy/opus-dev,
> >> >
> >> > This patch has some conflicts with my ARM patch that does fft
> >> > optimizations
> >> > http://lists.xiph.org/pipermail/opus/2015-March/002904.html
> >> > http://lists.xiph.org/pipermail/opus/2015-March/002905.html
> >> >
> >> > One of us probably has to rebase depending on which patch goes into opus
> >> > first.
> >> >
> >> > Regards,
> >> > Vish
> >> >
> >> >
> >> > On 1 March 2015 at 20:47, Jonathan Lennox <jonathan at vidyo.com> wrote:
> >> >> The attached patch cleans up Opus's x86 intrinsics configury.
> >> >>
> >> >> It:
> >> >> * Makes —enable-intrinsics work with clang and other non-GCC compilers
> >> >> * Enables RTCD for the floating-point-mode SSE code in Celt.
> >> >> * Disables use of RTCD in cases where the compiler targets an
> >> >> instruction
> >> >> set by default.
> >> >> * Enables the SSE4.1 Silk optimizations that apply to the common parts
> >> >> of
> >> >> Silk when Opus is built in floating-point mode, not just in fixed-point
> >> >> mode.
> >> >> * Enables the SSE intrinsics (with RTCD when appropriate) in the Win32
> >> >> build.
> >> >> * Fixes a case where GCC would compile SSE2 code as SSE4.1, causing a
> >> >> crash
> >> >> on non-SSE4.1 CPUs.
> >> >> * Allows configuration with compilers with non-GCC-flavor flags for
> >> >> enabling
> >> >> architecture options.
> >> >> * Hopefully makes the configuration and ifdef’s easier to follow and
> >> >> understand.
> >> >>
> >> >> This does not yet switch —enable-intrinsics to be enabled by default on
> >> >> supported architectures, but I think it’d be ready to do so.
> >> >>
> >> >> Comments are welcome!
> >> >>
> >> >>
> >> >>
> >> >> —
> >> >> Jonathan Lennox
> >> >> jonathan at vidyo.com
> >> >>
> >> >>
> >> >> _______________________________________________
> >> >> opus mailing list
> >> >> opus at xiph.org
> >> >> http://lists.xiph.org/mailman/listinfo/opus
> >> >>
> >>
> >
> >


More information about the opus mailing list