[opus] Patch cleaning up Opus x86 intrinsics configury
jonathan at vidyo.com
Tue Mar 3 20:17:42 PST 2015
On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org<mailto:viswanath.puttagunta at linaro.org>> wrote:
On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com<mailto:jonathan at vidyo.com>> wrote:
My patch should be against the tip, but it’s the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip?
In terms of merging, you posted your patch before I posted mine, so probably I should be the one on the hook to rebase after your fix goes in. Looking over your patch quickly, I don’t think any of my changes should be that difficult to merge with yours.
>> Yeah.. trivial merge issues.. shouldn't take too long to resolve.
I haven’t studied your patch in depth yet. Does Ne10 use its own RTCD code, or do you use Opus’s?
>> Not using Ne10's RTCD feature at the moment.. using Opus's rtcd. libopus RTCD for linux/armv7 will not work for aarch64. It needs to be updated (or just disabled if system built for aarch64 since Neon support is mandatory)... It is in my todo list..
My patch should handle aarch64 properly. One of the things it does is it discovers (in configure) whether a little test program using Neon intrinsics can be compiled without -mfpu=neon — if so, it disables RTCD for Neon, and sets a #define OPUS_ARM_PRESUME_NEON_INTR. (This is analogous to the existing code for determining whether Neon assembly instructions can be assembled by default.)
This thus handles not only aarch64, but also iOS armv7.
I also have a number of aarch64 patches pending — mine are entirely for fixed-point though. Are yours all floating-point?
On Mar 3, 2015, at 5:18 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org<mailto:viswanath.puttagunta at linaro.org>> wrote:
> Hello Jonathan,
> I am unable to apply your patch cleanly on tip.
> This patch has some conflicts with my ARM patch that does fft optimizations
> One of us probably has to rebase depending on which patch goes into opus first.
> On 1 March 2015 at 20:47, Jonathan Lennox <jonathan at vidyo.com<mailto:jonathan at vidyo.com>> wrote:
>> The attached patch cleans up Opus's x86 intrinsics configury.
>> * Makes —enable-intrinsics work with clang and other non-GCC compilers
>> * Enables RTCD for the floating-point-mode SSE code in Celt.
>> * Disables use of RTCD in cases where the compiler targets an instruction
>> set by default.
>> * Enables the SSE4.1 Silk optimizations that apply to the common parts of
>> Silk when Opus is built in floating-point mode, not just in fixed-point
>> * Enables the SSE intrinsics (with RTCD when appropriate) in the Win32
>> * Fixes a case where GCC would compile SSE2 code as SSE4.1, causing a crash
>> on non-SSE4.1 CPUs.
>> * Allows configuration with compilers with non-GCC-flavor flags for enabling
>> architecture options.
>> * Hopefully makes the configuration and ifdef’s easier to follow and
>> This does not yet switch —enable-intrinsics to be enabled by default on
>> supported architectures, but I think it’d be ready to do so.
>> Comments are welcome!
>> Jonathan Lennox
>> jonathan at vidyo.com<mailto:jonathan at vidyo.com>
>> opus mailing list
>> opus at xiph.org<mailto:opus at xiph.org>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the opus