[opus] Patch cleaning up Opus x86 intrinsics configury

Mon Mar 9 06:38:15 PDT 2015

That works for me.  Thanks!

On Mar 7, 2015, at 1:30 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote:

> Hello Jonathan,
> 
> Just FYI, I started doing review of your patch and will get back to
> you in few days. After review, I would like to rebase your patch (as
> necessary) myself and do some testing.. and re-submit.
> 
> Regards,
> Vish
> 
> On 4 March 2015 at 09:00, Viswanath Puttagunta
> <viswanath.puttagunta at linaro.org> wrote:
>> 
>> On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at vidyo.com> wrote:
>>> 
>>> On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta
>>> <viswanath.puttagunta at linaro.org> wrote:
>>> 
>>> 
>>> 
>>> On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote:
>>>> 
>>>> Viswenath,
>>>> 
>>>> My patch should be against the tip, but it’s the very recent tip,
>>>> including some changes this past Friday (27 Feb).  I mentioned in the IRC
>>>> room a problem I discovered in creating my patch, and then later improved
>>>> the fix Tim had made for the problem.  Where do you get conflicts merging it
>>>> to tip?
>>>> 
>>>> In terms of merging, you posted your patch before I posted mine, so
>>>> probably I should be the one on the hook to rebase after your fix goes in.
>>>> Looking over your patch quickly, I don’t think any of my changes should be
>>>> that difficult to merge with yours.
>>> 
>>>>> Yeah.. trivial merge issues.. shouldn't take too long to resolve.
>>>> 
>>>> 
>>>> I haven’t studied your patch in depth yet.  Does Ne10 use its own RTCD
>>>> code, or do you use Opus’s?
>>> 
>>>>> Not using Ne10's RTCD feature at the moment.. using Opus's rtcd. libopus
>>>>> RTCD for linux/armv7 will not work for aarch64. It needs to be updated (or
>>>>> just disabled if system built for aarch64 since Neon support is
>>>>> mandatory)... It is in my todo list..
>>> 
>>> 
>>> My patch should handle aarch64 properly.  One of the things it does is it
>>> discovers (in configure) whether a little test program using Neon intrinsics
>>> can be compiled without -mfpu=neon — if so, it disables RTCD for Neon, and
>>> sets a #define OPUS_ARM_PRESUME_NEON_INTR.  (This is analogous to the
>>> existing code for determining whether Neon assembly instructions can be
>>> assembled by default.)
>>> 
>>> This thus handles not only aarch64, but also iOS armv7.
>> This sounds good to me. I wanted to do more thorough code review for
>> your patch, but am waiting until it applies cleanly on tip.. But now
>> that we agree that your patch should be rebased on top of my patch,
>> I'll either wait for Timothy/opus-dev to take in my patch, and your
>> patch to be rebased? And if this is taking too long for my patch to go
>> in (Due to opus-dev maintainers time availability), I'll rebase
>> (retaining your authorship ofcourse) your patch on my submitted
>> patch... and put in my future work on top of your patch.
>> 
>> But I really hope we can get some cycles from Timothy/Jean to get the
>> patches into opus. It seems they are really busy for last month or so.
>> 
>>> 
>>> I also have a number of aarch64 patches pending — mine are entirely for
>>> fixed-point though.  Are yours all floating-point?
>> Yes, so far they are floating point. Can you post your fixed-point
>> optimizations? Also, are any of your patches using NE10 library?
>> 
>> So far, what I submitted is to optimize opus_fft using NE10 (float
>> only). I also modified mdct_forward to take advantage of this
>> optimization.
>> 
>> My next action plan:
>> - use opus_ifft to optimize mdct_backwards (float via Ne10)
>> - Re-work/add opus_fft and opus_ifft for fixed (again via Ne10)
>> - Enable for Aarch64 and test (I'm hoping your patch already does this)
>> - float-to-int and int-to-float conversions using neon
>> - etc.
>> 
>> 
>>> 
>>>> On Mar 3, 2015, at 5:18 PM, Viswanath Puttagunta
>>>> <viswanath.puttagunta at linaro.org> wrote:
>>>> 
>>>>> Hello Jonathan,
>>>>> 
>>>>> I am unable to apply your patch cleanly on tip.
>>>>> 
>>>>> Timothy/opus-dev,
>>>>> 
>>>>> This patch has some conflicts with my ARM patch that does fft
>>>>> optimizations
>>>>> http://lists.xiph.org/pipermail/opus/2015-March/002904.html
>>>>> http://lists.xiph.org/pipermail/opus/2015-March/002905.html
>>>>> 
>>>>> One of us probably has to rebase depending on which patch goes into opus
>>>>> first.
>>>>> 
>>>>> Regards,
>>>>> Vish
>>>>> 
>>>>> 
>>>>> On 1 March 2015 at 20:47, Jonathan Lennox <jonathan at vidyo.com> wrote:
>>>>>> The attached patch cleans up Opus's x86 intrinsics configury.
>>>>>> 
>>>>>> It:
>>>>>> * Makes —enable-intrinsics work with clang and other non-GCC compilers
>>>>>> * Enables RTCD for the floating-point-mode SSE code in Celt.
>>>>>> * Disables use of RTCD in cases where the compiler targets an
>>>>>> instruction
>>>>>> set by default.
>>>>>> * Enables the SSE4.1 Silk optimizations that apply to the common parts
>>>>>> of
>>>>>> Silk when Opus is built in floating-point mode, not just in fixed-point
>>>>>> mode.
>>>>>> * Enables the SSE intrinsics (with RTCD when appropriate) in the Win32
>>>>>> build.
>>>>>> * Fixes a case where GCC would compile SSE2 code as SSE4.1, causing a
>>>>>> crash
>>>>>> on non-SSE4.1 CPUs.
>>>>>> * Allows configuration with compilers with non-GCC-flavor flags for
>>>>>> enabling
>>>>>> architecture options.
>>>>>> * Hopefully makes the configuration and ifdef’s easier to follow and
>>>>>> understand.
>>>>>> 
>>>>>> This does not yet switch —enable-intrinsics to be enabled by default on
>>>>>> supported architectures, but I think it’d be ready to do so.
>>>>>> 
>>>>>> Comments are welcome!
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> —
>>>>>> Jonathan Lennox
>>>>>> jonathan at vidyo.com
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> opus mailing list
>>>>>> opus at xiph.org
>>>>>> http://lists.xiph.org/mailman/listinfo/opus
>>>>>> 
>>>> 
>>> 
>>>