[opus] [PATCH] fix alignment exceptions

Felicia Lim flim at google.com
Thu Aug 24 12:35:17 UTC 2017


Thanks, this has been merged in master.

On Tue, Aug 22, 2017 at 4:34 PM Jonathan Lennox <jonathan at vidyo.com> wrote:

> Hm, yes — with the current Xcode clang (Xcode 8.3.3, Apple LLVM version
> 8.1.0 (clang-802.0.42)), I don’t see any difference in the generated code
> with -O2 between the two versions.  (Literally none for x86_64, and only
> what seem to be some non-signficiant offset changes for i386).
>
> So if this improves some other cases (notably, it should avoid problems
> with -fsanitize=address) I think this is okay.
>
> On Aug 18, 2017, at 5:25 PM, Ray Essick <essick at google.com> wrote:
>
> Jonathan,
> Here's the code difference we see with the recent change -- what amounts
> to reverting your change from a couple years back.
> It doesn't look like we're getting superfluous instructions from clang now.
>
> the bad behavior for us was the alignment exception on the movdqa
> instructions when the input data wasn't 128-bit aligned.
> We had to change something because the code as-is was taking alignment
> faults on the movdqa instructions.
>
> For reference, the clang version  I used for this is:
>     | Android clang version 5.0.300080  (based on LLVM 5.0.300080)
>     | Target: x86_64-unknown-linux
>
>
> If we think enough people use older versions of clang, a version of the
> patch that looked at __clang_major__ and friends seems fair.
>
> -- Ray
>
> 826% diff -c *old *new
> *** pitch_sse4_1.s-old 2017-08-18 13:51:39.359084637 -0700
> --- pitch_sse4_1.s-new 2017-08-18 13:51:54.595106450 -0700
> ***************
> *** 73,80 ****
>   cmpl $4, %eax
>   jl .LBB0_8
>   # BB#7:
> ! movdqa (%edx,%edi,2), %xmm2
> ! movdqa (%esi,%edi,2), %xmm1
>   addl $4, %edi
>   movdqa %xmm2, %xmm3
>   pmullw %xmm1, %xmm2
> --- 73,80 ----
>   cmpl $4, %eax
>   jl .LBB0_8
>   # BB#7:
> ! movq (%edx,%edi,2), %xmm2    # xmm2 = mem[0],zero
> ! movq (%esi,%edi,2), %xmm1    # xmm1 = mem[0],zero
>   addl $4, %edi
>   movdqa %xmm2, %xmm3
>   pmullw %xmm1, %xmm2
>
>
> On Fri, Aug 18, 2017 at 12:11 PM, Felicia Lim <flim at google.com> wrote:
>
>> We see the MOVQ instruction but this patch deliberately uses it rather
>> than MOVQDA (load 128-bits aligned). We were seeing that with the trace
>> below, the final invocation is not 128-bit aligned but MOVQDA insists on it
>> (the calling function was pitch_sse4_1.c:90, in the 4-way N - i >= 4
>> loop).
>>
>> 07-31 11:00:13.469   210  2540 <(469)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3deb0 y 0xeff3deb0 N 32
>> 07-31 11:00:13.469   210  2540 <(469)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3d7b0 y 0xeff3d7b0 N 32
>> 07-31 11:00:13.469   210  2540 <(469)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3df30 y 0xeff3df30 N 32
>> 07-31 11:00:13.470   210  2540 <(470)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3d850 y 0xeff3d850 N 48
>> 07-31 11:00:13.470   210  2540 <(470)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3dfd0 y 0xeff3dfd0 N 48
>> 07-31 11:00:13.470   210  2540 <(470)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3d8b0 y 0xeff3d8b0 N 64
>> 07-31 11:00:13.470   210  2540 <(470)%20210-2540> D opus_sse1: RBE
>> celt_inner_prod_sse4_1: x 0xeff3e030 y 0xeff3e030 N 64
>> 07-31 11:00:13.476   210  2540 D opus_sse1: RBE celt_inner_prod_sse4_1: x
>> 0xeff3da38 y 0xeff3da38 N 36
>>
>>
>> On Fri, Aug 18, 2017 at 11:44 AM Jonathan Lennox <jonathan at vidyo.com>
>> wrote:
>>
>>> This would revert a patch I submitted two years ago (
>>> http://git.xiph.org/?p=opus.git;a=commitdiff;h=1d60b49e9d95672a17ebe5578319c59fa3963224
>>> ).
>>>
>>> At the time, clang produced an unnecessary MOVQ instruction when it
>>> compiled with the version of the code with the explicit _mm_loadl_epi64
>>> intrinsic.  Do you no longer see that?
>>>
>>> How does the code compile for you, and what is the issue you’re seeing?
>>>  (One issue might be that clang’s address sanitizer isn’t smart enough to
>>> know that PMOVSXWD only loads 8 bytes, despite _mm_cvtepi16_epi32’s
>>> argument being an __mm128i; I’ve seen it trigger incorrect out-of-bounds
>>> read errors.)
>>>
>>> On Aug 18, 2017, at 12:34 PM, Felicia Lim <flim at google.com> wrote:
>>>
>>> Hi,
>>>
>>> Please find attached a patch to fix alignment exceptions. Without this
>>> change, we were seeing occasional alignment faults when using this with
>>> clang.
>>>
>>> Thanks,
>>> Felicia
>>>
>>> <e5c277c.diff>_______________________________________________
>>> opus mailing list
>>> opus at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/opus
>>>
>>>
>>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.xiph.org/pipermail/opus/attachments/20170824/3ab942e2/attachment.html>


More information about the opus mailing list