[opus] [RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.

John Ridges jridges at masque.com
Thu Mar 12 13:24:55 PDT 2015


Nit: in dual_inner_prod_sse, why not do both horizontal sums at the same 
time? As in:

     xsum1 = _mm_add_ps(_mm_movelh_ps(xsum1, xsum2), _mm_movehl_ps(xsum2, xsum1));
     xsum1 = _mm_add_ps(xsum1, _mm_shuffle_ps(xsum1, xsum1, 0xf5));
     _mm_store_ss(xy1, xsum1);
     _mm_store_ss(xy2, _mm_movehl_ps(xsum1, xsum1));
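
For anyone who wants to check the lane arithmetic, here is a minimal standalone sketch of the same sequence. The helper name hsum2_sse and its signature are made up for illustration and are not part of the patch; it only assumes xsum1 and xsum2 are the two 4-float accumulators and xy1/xy2 are the output pointers, as in the snippet above.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper (not in the patch): reduce two 4-float accumulators
   a = xsum1 and b = xsum2 to their scalar sums with one shared sequence. */
static inline void hsum2_sse(__m128 xsum1, __m128 xsum2, float *xy1, float *xy2)
{
    assert(xy1 && xy2);

    /* movelh(a, b) = [a0, a1, b0, b1], movehl(b, a) = [a2, a3, b2, b3],
       so the sum is [a0+a2, a1+a3, b0+b2, b1+b3]. */
    xsum1 = _mm_add_ps(_mm_movelh_ps(xsum1, xsum2),
                       _mm_movehl_ps(xsum2, xsum1));

    /* 0xf5 selects lanes [1, 1, 3, 3]; after the add, lane 0 holds the
       full sum of a and lane 2 the full sum of b (lanes 1 and 3 are unused). */
    xsum1 = _mm_add_ps(xsum1, _mm_shuffle_ps(xsum1, xsum1, 0xf5));

    /* Lane 0 goes to xy1; movehl brings lane 2 down to lane 0 for xy2. */
    _mm_store_ss(xy1, xsum1);
    _mm_store_ss(xy2, _mm_movehl_ps(xsum1, xsum1));
}
```

Compared with reducing each accumulator separately, this should need two adds instead of four, at the cost of one extra movehl for the second store. I have not measured whether it makes a visible difference.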

--John


