Something seems to be wrong with the CPU detection code in cpu.c. When I compile with MSVC, all instruction sets except FMA are detected as missing, even though my CPU supports them. That of course results in awful performance.
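
To cross-check what the CPU itself reports, independently of cpu.c, a small standalone sketch like the one below may help. It is not the project's detection code: the helper names (cpuid_query, has_bit, print_cpu_features) are made up for illustration, and it only reads the raw CPUID feature bits. It does not perform the OS-support check (OSXSAVE/XGETBV) that a complete AVX detection routine needs.

```c
#include <stdio.h>

/* Pick the compiler-specific way to execute CPUID.
   MSVC provides __cpuidex in <intrin.h>; GCC and Clang provide the
   __cpuid_count macro in <cpuid.h>. On non-x86 targets we report nothing. */
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
static void cpuid_query(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    int r[4];
    int i;
    __cpuidex(r, (int)leaf, (int)subleaf);
    for (i = 0; i < 4; i++)
        regs[i] = (unsigned)r[i];
}
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
static void cpuid_query(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    __cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
}
#else
static void cpuid_query(unsigned leaf, unsigned subleaf, unsigned regs[4])
{
    (void)leaf;
    (void)subleaf;
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
}
#endif

/* Register order returned by cpuid_query: EAX, EBX, ECX, EDX. */
enum { REG_EAX = 0, REG_EBX = 1, REG_ECX = 2, REG_EDX = 3 };

/* Returns 1 if the given bit (0..31) is set in reg, otherwise 0. */
static int has_bit(unsigned reg, int bit)
{
    return (int)((reg >> bit) & 1u);
}

struct feature {
    const char *name;
    unsigned leaf; /* CPUID leaf (subleaf is always 0 here) */
    int reg;       /* which output register holds the flag */
    int bit;       /* bit position inside that register */
};

/* Feature bits as documented for CPUID leaf 1 and leaf 7, subleaf 0. */
static const struct feature features[] = {
    { "SSE2",   1, REG_EDX, 26 },
    { "SSE3",   1, REG_ECX,  0 },
    { "SSSE3",  1, REG_ECX,  9 },
    { "FMA",    1, REG_ECX, 12 },
    { "SSE4.1", 1, REG_ECX, 19 },
    { "SSE4.2", 1, REG_ECX, 20 },
    { "AVX",    1, REG_ECX, 28 },
    { "AVX2",   7, REG_EBX,  5 },
};

/* Prints one line per feature and returns how many were reported present. */
static int print_cpu_features(void)
{
    unsigned regs[4];
    unsigned max_leaf;
    int found = 0;
    size_t i;

    cpuid_query(0, 0, regs);
    max_leaf = regs[REG_EAX]; /* highest supported basic leaf */

    for (i = 0; i < sizeof features / sizeof features[0]; i++) {
        int present = 0;
        if (features[i].leaf <= max_leaf) {
            cpuid_query(features[i].leaf, 0, regs);
            present = has_bit(regs[features[i].reg], features[i].bit);
        }
        printf("%-7s %s\n", features[i].name, present ? "yes" : "no");
        found += present;
    }
    return found;
}
```

Calling print_cpu_features() from main in builds made with both MSVC and another compiler, and comparing the output with what cpu.c reports, should show whether the problem is in the CPUID query itself or in how the result is interpreted.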