<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<div class="">Hm, yes — with the current Xcode clang (Xcode 8.3.3, Apple LLVM version 8.1.0 (clang-802.0.42)), I don’t see any difference in the generated code with -O2 between the two versions.  (Literally none for x86_64, and only what seem to be some non-signficiant
 offset changes for i386).</div>
<div class=""><br class="">
</div>
<div class="">So if this improves some other cases (notably, it should avoid problems with -fsanitize=address) I think this is okay.</div>
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Aug 18, 2017, at 5:25 PM, Ray Essick <<a href="mailto:essick@google.com" class="">essick@google.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div dir="ltr" class="">Jonathan,
<div class="">Here's the code difference we see with the recent change -- what amounts to reverting your change from a couple years back.</div>
<div class="">It doesn't look like we're getting superfluous instructions from clang now.</div>
<div class=""><br class="">
</div>
<div class="">the bad behavior for us was the alignment exception on the movdqa instructions when the input data wasn't 128-bit aligned.</div>
<div class="">We had to change something because the code as-is was taking alignment faults on the movdqa instructions.</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">For reference, the clang version  I used for this is:</div>
<div class="">    | Android clang version 5.0.300080  (based on LLVM 5.0.300080)<br class="">
</div>
<div class="">    | Target: x86_64-unknown-linux</div>
</div>
<div class=""><br class="">
</div>
<div class=""><br class="">
</div>
<div class="">If we think enough people use older versions of clang, a version of the patch that looked at __clang_major__ and friends seems fair.</div>
<div class=""><br class="">
</div>
<div class="">-- Ray</div>
<div class=""><br class="">
</div>
<div class="">
<div class="">826% diff -c *old *new</div>
<div class="">*** pitch_sse4_1.s-old<span style="white-space:pre" class=""> </span>
2017-08-18 13:51:39.359084637 -0700</div>
<div class="">--- pitch_sse4_1.s-new<span style="white-space:pre" class=""> </span>
2017-08-18 13:51:54.595106450 -0700</div>
<div class="">***************</div>
<div class="">*** 73,80 ****</div>
<div class="">  <span style="white-space:pre" class=""></span>cmpl<span style="white-space:pre" class="">
</span>$4, %eax</div>
<div class="">  <span style="white-space:pre" class=""></span>jl<span style="white-space:pre" class="">
</span>.LBB0_8</div>
<div class="">  # BB#7:</div>
<div class="">! <span style="white-space:pre" class=""></span>movdqa<span style="white-space:pre" class="">
</span>(%edx,%edi,2), %xmm2</div>
<div class="">! <span style="white-space:pre" class=""></span>movdqa<span style="white-space:pre" class="">
</span>(%esi,%edi,2), %xmm1</div>
<div class="">  <span style="white-space:pre" class=""></span>addl<span style="white-space:pre" class="">
</span>$4, %edi</div>
<div class="">  <span style="white-space:pre" class=""></span>movdqa<span style="white-space:pre" class="">
</span>%xmm2, %xmm3</div>
<div class="">  <span style="white-space:pre" class=""></span>pmullw<span style="white-space:pre" class="">
</span>%xmm1, %xmm2</div>
<div class="">--- 73,80 ----</div>
<div class="">  <span style="white-space:pre" class=""></span>cmpl<span style="white-space:pre" class="">
</span>$4, %eax</div>
<div class="">  <span style="white-space:pre" class=""></span>jl<span style="white-space:pre" class="">
</span>.LBB0_8</div>
<div class="">  # BB#7:</div>
<div class="">! <span style="white-space:pre" class=""></span>movq<span style="white-space:pre" class="">
</span>(%edx,%edi,2), %xmm2    # xmm2 = mem[0],zero</div>
<div class="">! <span style="white-space:pre" class=""></span>movq<span style="white-space:pre" class="">
</span>(%esi,%edi,2), %xmm1    # xmm1 = mem[0],zero</div>
<div class="">  <span style="white-space:pre" class=""></span>addl<span style="white-space:pre" class="">
</span>$4, %edi</div>
<div class="">  <span style="white-space:pre" class=""></span>movdqa<span style="white-space:pre" class="">
</span>%xmm2, %xmm3</div>
<div class="">  <span style="white-space:pre" class=""></span>pmullw<span style="white-space:pre" class="">
</span>%xmm1, %xmm2</div>
</div>
<div class=""><br class="">
</div>
</div>
<div class="gmail_extra"><br class="">
<div class="gmail_quote">On Fri, Aug 18, 2017 at 12:11 PM, Felicia Lim <span dir="ltr" class="">
<<a href="mailto:flim@google.com" target="_blank" class="">flim@google.com</a>></span> wrote:<br class="">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr" class="">
<div class="">We see the MOVQ instruction but this patch deliberately uses it rather than MOVQDA (load 128-bits aligned). We were seeing that with the trace below, the final invocation is not 128-bit aligned but MOVQDA insists on it (the calling function was
 pitch_sse4_1.c:90, in the 4-way <span style="font-family:monospace;font-size:12px;white-space:pre-wrap" class="">N - i >= 4
</span>loop).</div>
<div class=""><br class="">
</div>
<div class=""><span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(469)%20210-2540" value="+14692102540" target="_blank" class="">469   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3deb0
 y 0xeff3deb0 N 32</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(469)%20210-2540" value="+14692102540" target="_blank" class="">469   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3d7b0 y 0xeff3d7b0
 N 32</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(469)%20210-2540" value="+14692102540" target="_blank" class="">469   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3df30 y 0xeff3df30
 N 32</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(470)%20210-2540" value="+14702102540" target="_blank" class="">470   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3d850 y 0xeff3d850
 N 48</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(470)%20210-2540" value="+14702102540" target="_blank" class="">470   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3dfd0 y 0xeff3dfd0
 N 48</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(470)%20210-2540" value="+14702102540" target="_blank" class="">470   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3d8b0 y 0xeff3d8b0
 N 64</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.<a href="tel:(470)%20210-2540" value="+14702102540" target="_blank" class="">470   210  2540</a> D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3e030 y 0xeff3e030
 N 64</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
<span style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">07-31 11:00:13.476   210  2540 D opus_sse1: RBE celt_inner_prod_sse4_1: x 0xeff3da38 y 0xeff3da38 N 36</span><br style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif" class="">
</div>
<div dir="ltr" class="">
<div class=""><br class="">
</div>
<br class="">
<div class="gmail_quote">
<div dir="ltr" class="">On Fri, Aug 18, 2017 at 11:44 AM Jonathan Lennox <<a href="mailto:jonathan@vidyo.com" target="_blank" class="">jonathan@vidyo.com</a>> wrote:<br class="">
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word" class="">
<div class="">This would revert a patch I submitted two years ago (<a href="http://git.xiph.org/?p=opus.git;a=commitdiff;h=1d60b49e9d95672a17ebe5578319c59fa3963224" target="_blank" class="">http://git.xiph.org/?p=opus.<wbr class="">git;a=commitdiff;h=<wbr class="">1d60b49e9d95672a17ebe5578319c5<wbr class="">9fa3963224</a>).</div>
<div class=""><br class="">
</div>
<div class="">At the time, clang produced an unnecessary MOVQ instruction when it compiled with the version of the code with the explicit _mm_loadl_epi64 intrinsic.  Do you no longer see that?</div>
<div class=""><br class="">
</div>
<div class="">How does the code compile for you, and what is the issue you’re seeing?  (One issue might be that clang’s address sanitizer isn’t smart enough to know that PMOVSXWD only loads 8 bytes, despite _mm_cvtepi16_epi32’s argument being an __mm128i; I’ve
 seen it trigger incorrect out-of-bounds read errors.) </div>
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class=""></blockquote>
</div>
</div>
<div style="word-wrap:break-word" class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On Aug 18, 2017, at 12:34 PM, Felicia Lim <<a href="mailto:flim@google.com" target="_blank" class="">flim@google.com</a>> wrote:</div>
<br class="m_-190920853688853449m_-8800483185505812600m_6725329152546787984Apple-interchange-newline">
</blockquote>
</div>
</div>
<div style="word-wrap:break-word" class="">
<div class="">
<blockquote type="cite" class="">
<div class="">
<div dir="ltr" class="">
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class="">Hi,</span></div>
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class=""><br class="">
</span></div>
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class="">Please find attached a patch to fix alignment exceptions. </span><span style="font-size:13px;color:rgb(33,33,33)" class="">Without this change, we were seeing occasional alignment faults
 when using this with clang.</span></div>
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class=""><br class="">
</span></div>
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class="">Thanks,</span></div>
<div class=""><span style="font-size:13px;color:rgb(33,33,33)" class="">Felicia</span></div>
</div>
</div>
</blockquote>
</div>
</div>
<div style="word-wrap:break-word" class="">
<div class="">
<blockquote type="cite" class="">
<div class=""><span id="m_-190920853688853449m_-8800483185505812600m_6725329152546787984cid:15df62dbda7a4b2172f1" class=""><e5c277c.diff></span>________________<wbr class="">______________________________<wbr class="">_<br class="">
opus mailing list<br class="">
<a href="mailto:opus@xiph.org" target="_blank" class="">opus@xiph.org</a><br class="">
<a href="http://lists.xiph.org/mailman/listinfo/opus" target="_blank" class="">http://lists.xiph.org/mailman/<wbr class="">listinfo/opus</a><br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>