[Speex-dev] SSE bug on Win32 with GCC 4.2.1

Thorvald Natvig speex at natvig.com
Wed Aug 22 18:16:30 PDT 2007

Jean-Marc Valin wrote:
> Duane Storey wrote:
>> Actually, it might just be an OS "feature"..  On most linux and mac
>> platforms, the memory managers align memory on proper boundaries -- this
>> doesn't occur on most versions of windows.  I don't have all the code in
>> front of me, but it's possible that it's simply a side effect of windows not
>> aligning the memory, and an implicit assumption in the speex code that it
>> will have proper alignment.
> I actually doubt it's the OS (as much as I hate Windows). After all,
> it's the compiler that manages the stack and needs to ensure alignment.

On Win32, the OS aligns the stack only to the native pointer size, which
means 4 bytes on most machines.

MS and Intel's compilers explicitly align the stack by applying
"and $-16" to the stack pointer in every function that uses SSE. GCC uses
a smarter scheme and has the caller align the stack, so it's aligned to
16 bytes on every function entry. This is better, as it allows the
hardware to pipeline moves to the stack without having to wait for the
and instruction to retire. However, this scheme breaks if the stack
wasn't aligned to start with.
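
To make the "and $-16" arithmetic concrete, here is a minimal sketch in C
of what that instruction does to a stack-pointer value. The function names
are hypothetical and only illustrate the bit manipulation; real compilers
emit this as a single instruction in the function prologue.

```c
#include <stdint.h>

/* Round a stack-pointer value down to a 16-byte boundary.
   -16 is ...11110000 in two's complement, so the AND clears the low
   four bits. Rounding down is safe because the x86 stack grows
   towards lower addresses. */
uintptr_t align_down_16(uintptr_t sp)
{
    return sp & (uintptr_t)-16;
}

/* Returns 1 when the value is already a multiple of 16, else 0. */
int is_aligned_16(uintptr_t sp)
{
    return (sp & 15u) == 0;
}
```

A stack pointer that the OS aligned only to 4 bytes, such as 0x1004,
fails is_aligned_16() until it has been passed through align_down_16().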

The MinGW runtime aligns the stack before entering main(), so all 
functions called from the main() thread are aligned. If you create 
threads on your own, you'll have to align the stack yourself before you 
call your thread entry function.
Come to think of it, that last paragraph should probably be in the Speex 
Win32 FAQ if there is one.
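
One possible way to do that alignment with GCC on x86 is the
force_align_arg_pointer function attribute, which asks the compiler to
emit a prologue that realigns the stack on entry. This is a sketch under
assumptions, not something taken from the Speex sources: the attribute is
not mentioned above, and REALIGN_STACK, worker_body and thread_entry are
hypothetical names.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* x86 only: GCC realigns the stack in this function's prologue. */
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK /* expands to nothing on other targets */
#endif

/* Hypothetical worker standing in for code that keeps SSE data on its
   stack. Returns 1 if its 16-byte-aligned local really is aligned. */
int worker_body(void *arg)
{
    float buf[4] __attribute__((aligned(16)));
    (void)arg;
    buf[0] = 0.0f;
    return ((uintptr_t)buf & 15u) == 0;
}

/* Hypothetical thread entry point: the only function that has to cope
   with a stack that may be aligned to just 4 bytes. */
REALIGN_STACK
int thread_entry(void *arg)
{
    return worker_body(arg);
}
```

The entry point's signature would need to be adapted to whatever the
thread creation call in use expects. Only the entry function needs the
attribute, since everything it calls inherits the corrected alignment.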

Anyway. IF you have the "OS misaligned stack and you didn't correct it"
problem, the thread will crash on the very first aligned SSE load it does
from the stack, as the address is unaligned. Since GCC maintains 16-byte
alignment relative to the starting SP, either ALL stack loads are aligned
or they are all unaligned.

The case here was that GCC ignored the 16-byte alignment requirement of
__m128 when it was inside a union. I found a simple fix though:

Instead of

union {
   float bla[4];
   __m128 blah;
};

use

union {
   __m128 blah;
   float bla[4];
};

In other words, move the __m128 so it's the first entry in the union.
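
As a fuller illustration, the following sketch puts the vector member
first in a union and checks the result at run time. It assumes an x86
target with SSE enabled and a GCC-compatible compiler. The names vec4,
vec4_is_aligned and sum_doubled are hypothetical, and the explicit
aligned(16) attribute is an extra safeguard that is not part of the fix
described above.

```c
#include <stdint.h>
#include <xmmintrin.h>

/* Vector member first, as in the workaround. The aligned attribute is an
   added belt-and-braces assumption, not something the fix requires. */
union vec4 {
    __m128 v;
    float  f[4];
} __attribute__((aligned(16)));

/* Returns 1 if the union object sits on a 16-byte boundary, else 0. */
int vec4_is_aligned(const union vec4 *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Doubles the vector (1, 2, 3, 4) with SSE and sums the lanes through
   the float view. Returns -1 if the stack object was misaligned. */
int sum_doubled(void)
{
    union vec4 a;
    a.v = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f); /* f[0]=1 ... f[3]=4 */
    a.v = _mm_add_ps(a.v, a.v);
    if (!vec4_is_aligned(&a))
        return -1;
    return (int)(a.f[0] + a.f[1] + a.f[2] + a.f[3]);
}
```

Calling vec4_is_aligned() on a stack instance is a cheap way to confirm
whether a given compiler build honours the alignment before any aligned
load gets the chance to fault.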

I've been unable to replicate this behaviour on GCC 4.1 or 4.2 running 
on Ubuntu, so it seems to be specific to the mingw build I'm using. 
Which means that one of the many configure options for GCC is causing 
this. I don't really have time to figure out which, and as long as the 
above fix works, I'm happy enough.
