[Speex-dev] SSE bug on Win32 with GCC 4.2.1
Alexander Chemeris
ipse.speex at gmail.com
Thu Aug 23 01:09:09 PDT 2007
On 8/23/07, Thorvald Natvig <speex at natvig.com> wrote:
> Jean-Marc Valin wrote:
> > Duane Storey a écrit :
> >> Actually, it might just be an OS "feature". On most Linux and Mac
> >> platforms, the memory managers align memory on proper boundaries -- this
> >> doesn't occur on most versions of Windows. I don't have all the code in
> >> front of me, but it's possible that it's simply a side effect of Windows not
> >> aligning the memory, and an implicit assumption in the Speex code that it
> >> will have proper alignment.
> >
> > I actually doubt it's the OS (as much as I hate Windows). After all,
> > it's the compiler that manages the stack and needs to ensure alignment.
> >
>
> On Win32, the OS aligns the stack to the native pointer size. Meaning 4
> bytes for most machines.
>
> MS and Intel's compilers explicitly realign the stack with an "and $-16"
> on the stack pointer in every function that uses SSE. GCC uses a smarter
> scheme and has the caller align the stack, so it's aligned to 16 bytes on
> every function entry. This is better as it allows the hardware to pipeline
> moves to the stack without having to wait for the "and" instruction to
> retire. However, this scheme breaks if the stack wasn't aligned to start with.
>
> The MinGW runtime aligns the stack before entering main(), so all
> functions called from the main() thread are aligned. If you create
> threads on your own, you'll have to align the stack yourself before you
> call your thread entry function.
> Come to think of it, that last paragraph should probably be in the Speex
> Win32 FAQ if there is one.
>
> Anyway. IF you have the "OS misaligned stack and you didn't correct it"
> problem, the thread will crash on the very first SSE load it does, as it
> is unaligned. Since GCC maintains 16-byte alignment relative to the
> starting SP, either ALL stack loads are aligned or they are all unaligned.
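
The "and $-16" masking and the all-or-nothing behaviour described above can
be sketched in C. This is only an illustration under simple assumptions: the
helper names (align_down_16, frame_is_aligned, alignment_demo) and the
addresses are made up, and the frame model is a simplification, not what any
particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the effect of "and $-16" on a stack pointer value.
 * -16 has its low four bits clear, so the mask rounds the address down
 * to the previous 16-byte boundary. */
static uintptr_t align_down_16(uintptr_t sp)
{
    return sp & ~(uintptr_t)15;
}

/* Hypothetical model of the caller-aligned scheme: every frame moves the
 * stack pointer down by a multiple of 16 bytes, so (sp % 16) at any call
 * depth equals (start_sp % 16). Returns 1 if the frame is 16-byte aligned. */
static int frame_is_aligned(uintptr_t start_sp, unsigned depth,
                            unsigned frame_units_of_16)
{
    uintptr_t sp = start_sp - (uintptr_t)depth * 16u * frame_units_of_16;
    return (sp & 15u) == 0;
}

/* Usage: a start address that is only 4-byte aligned stays misaligned at
 * every depth, while an explicit mask repairs it in one step. */
static void alignment_demo(void)
{
    uintptr_t bad_start = 0x10004;               /* 4-byte aligned only */
    assert(!frame_is_aligned(bad_start, 0, 1));
    assert(!frame_is_aligned(bad_start, 5, 3));  /* still misaligned */
    assert(frame_is_aligned(align_down_16(bad_start), 5, 3));
}
```
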
FFmpeg had problems with misaligned SSE stack variables under
Windows too. Now that GCC 4.2 is available on Windows, they solved it
by setting __attribute__((force_align_arg_pointer)) on publicly visible
functions. You can find their patch and a short description in this mail:
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2007-August/034010.html
Speex could solve the problem in a similar way.
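
A minimal sketch of how the attribute might be applied to an exported
function. It is not taken from the FFmpeg patch or from Speex: REALIGN_STACK,
BLOCK and scale_block are hypothetical names, and the macro guard assumes a
GCC-compatible compiler targeting x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* force_align_arg_pointer is an x86-specific GCC attribute; on other
 * targets or compilers this hypothetical macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#define REALIGN_STACK
#endif

#define BLOCK 4 /* floats that fit in one 16-byte SSE register */

/* Hypothetical public entry point. The attribute asks the compiler to
 * realign the stack in the prologue, so the 16-byte aligned local below is
 * safe even when the caller (for example a thread started by the OS)
 * arrives with a stack that is only 4-byte aligned.
 * Returns 1 if the scratch buffer ended up on a 16-byte boundary. */
REALIGN_STACK
int scale_block(const float *in, float *out, float gain)
{
    _Alignas(16) float scratch[BLOCK]; /* C11 alignment specifier */
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < BLOCK; i++)
        scratch[i] = in[i] * gain;
    for (i = 0; i < BLOCK; i++)
        out[i] = scratch[i];

    return ((uintptr_t)scratch & 15u) == 0;
}
```

Only the functions reachable from outside the library need the attribute;
internal functions can keep relying on the caller-aligned convention once the
entry point has fixed the stack.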
--
Regards,
Alexander Chemeris.
SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000