[Speex-dev] SSE bug on Win32 with GCC 4.2.1
speex at natvig.com
Wed Aug 22 18:16:30 PDT 2007
Jean-Marc Valin wrote:
> Duane Storey a écrit :
>> Actually, it might just be an OS "feature".. On most linux and mac
>> platforms, the memory managers align memory on proper boundaries -- this
>> doesn't occur on most versions of windows. I don't have all the code in
>> front of me, but it's possible that it's simply a side effect of windows not
>> aligning the memory, and an implicit assumption in the speex code that it
>> will have proper alignment.
> I actually doubt it's the OS (as much as I hate Windows). After all,
> it's the compiler that manages the stack and needs to ensure alignment.
On Win32, the OS aligns the stack to the native pointer size, meaning 4
bytes on most machines.
MS and Intel's compilers explicitly align the stack by applying
"and $-16" to the stack pointer in every function that uses SSE. GCC
uses a smarter scheme and has the
caller align the stack so it's aligned to 16 bytes on every function
entry. This is better as it allows the hardware to pipeline moves to the
stack without having to wait for the and instruction to retire. However,
this scheme breaks if the stack wasn't aligned to start with.
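
To make the "and $-16" step concrete: masking with -16 clears the low
four bits of the stack pointer, which rounds it down to a 16-byte
boundary. A minimal sketch of that arithmetic in C follows. The helper
names (align_down_16, is_aligned_16, demo_align) and the sample address
are made up for illustration and are not from any compiler or from Speex.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as "and $-16, %esp": -16 is ...11110000 in two's
 * complement, so the mask clears the low four bits and rounds the
 * address DOWN (the stack grows down) to a multiple of 16. */
uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}

/* Nonzero if addr sits on a 16-byte boundary. */
int is_aligned_16(uintptr_t addr)
{
    return (addr & 15) == 0;
}

/* Walks one example: a stack pointer that is only 4-byte aligned. */
void demo_align(void)
{
    uintptr_t sp = 0x0022ff7c;               /* hypothetical SP value */
    assert(!is_aligned_16(sp));              /* 0x7c is not a multiple of 16 */
    assert(align_down_16(sp) == 0x0022ff70); /* rounded down by 12 bytes */
}
```

With the caller-aligns scheme no such mask is executed on entry, so a
function that starts at 0x0022ff7c simply stays off by 12 bytes.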
The MinGW runtime aligns the stack before entering main(), so all
functions called from the main() thread are aligned. If you create
threads on your own, you'll have to align the stack yourself before you
call your thread entry function.
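
One way to do that alignment, sketched under the assumption that you
build with GCC 4.2 or later on x86, is the force_align_arg_pointer
function attribute, which makes GCC emit a stack-realigning prologue for
that one function. The names below (REALIGN_STACK, worker_uses_sse,
thread_entry) are hypothetical, and the entry-point signature is
simplified; a real Win32 thread function would use the signature that
CreateThread or _beginthreadex expects.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: GCC (or a compatible compiler) targeting x86. On other
 * targets the macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

/* Hypothetical worker standing in for code that does aligned SSE
 * accesses to a stack buffer. */
float worker_uses_sse(void)
{
    float buf[4] __attribute__((aligned(16))) = {1.0f, 2.0f, 3.0f, 4.0f};
    /* Fail loudly here instead of faulting on an aligned SSE access. */
    assert(((uintptr_t)buf & 15) == 0);
    return buf[0] + buf[1] + buf[2] + buf[3];
}

/* Hypothetical thread entry point. The attribute realigns the stack
 * once, here, so everything called from it starts out aligned. */
REALIGN_STACK
int thread_entry(void *arg)
{
    (void)arg;
    return worker_uses_sse() == 10.0f ? 0 : 1;
}
```

Only the outermost function of each thread needs the attribute. GCC
also documents a -mstackrealign option that applies the same treatment
to every function, which costs a little on each call.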
Come to think of it, that last paragraph should probably be in the Speex
Win32 FAQ if there is one.
Anyway. IF you have the "OS misaligned stack and you didn't correct it"
problem, the thread will crash on the very first SSE load it does, as it
is unaligned. Since GCC maintains 16-byte alignment relative to the
starting SP, either ALL stack loads are aligned or they are all unaligned.
The case here was that GCC ignored the 16-byte alignment requirement of
__m128 when it was inside a union. I found a simple fix though:
In other words, move the __m128 so it's the first entry in the union.
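
As an illustration of that reordering only (this is not the Speex
declaration; the union name vec4 and the function vec4_sum are
hypothetical), a union with the __m128 member first might look like the
sketch below. It assumes an x86 target with SSE enabled (-msse on 32-bit
builds).

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>

/* Hypothetical union: the SSE member comes first, per the workaround. */
union vec4 {
    __m128 v;     /* needs 16-byte alignment */
    float  f[4];  /* scalar view of the same 16 bytes */
};

/* Sums four floats by loading them into the union and reading the
 * lanes back through the scalar view. */
float vec4_sum(const float *src)
{
    union vec4 u;
    /* If the compiler placed u on a misaligned slot, stop here rather
     * than fault on the aligned store below. */
    assert(((uintptr_t)&u & 15) == 0);
    u.v = _mm_loadu_ps(src);  /* unaligned load from src, aligned store to u */
    return u.f[0] + u.f[1] + u.f[2] + u.f[3];
}
```

Checking __alignof__(union vec4) on the affected build is a quick way to
see whether the compiler is honouring the 16-byte requirement.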
I've been unable to replicate this behaviour on GCC 4.1 or 4.2 running
on Ubuntu, so it seems to be specific to the mingw build I'm using.
Which means that one of the many configure options for GCC is causing
this. I don't really have time to figure out which, and as long as the
above fix works, I'm happy enough.