[theora-dev] Proposal for replacing asm code with intrinsics

Ralph Giles giles at xiph.org
Tue Oct 13 17:58:21 PDT 2009


On Tue, Oct 13, 2009 at 6:14 AM, Sukhomlinov, Vadim
<vadim.sukhomlinov at intel.com> wrote:

> I'm new to Theora and would like to propose several performance optimizations using advanced instructions in x86 CPUs (SSE2-SSE4.2).

Welcome. Several others have commented on various points. My rough
memory of this:

1) There are no good options for cross-platform assembly.
2) Last time we tried, intrinsics didn't seem worthwhile, but I'd be
happy to see that that's changed.
3) Even if intrinsics unified the gcc and msvc assembly for x86, we
still need to add things like NEON for arm, so I expect we will have
multiple asm versions that must be sync'd anyway.
4) Past benchmarking showed SSE2 wasn't much better than MMX. That's
why there's only one SSE routine in the 1.1 code.
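
To make points 2 and 3 concrete, here is a rough sketch of what an
intrinsics version of a small kernel could look like. It is not taken
from libtheora: the names sad16_c, sad16_sse2, sad16 and
SKETCH_HAVE_SSE2 are made up for illustration, and the kernel (sum of
absolute differences over one 16-byte row) is just a convenient
example. The same source should build with both gcc and msvc, with a
plain C fallback where SSE2 isn't available.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature check: gcc/clang define __SSE2__; msvc defines
   _M_X64, or _M_IX86_FP >= 2 when built with /arch:SSE2. */
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

/* Portable reference: sum of absolute differences of 16 bytes. */
static unsigned sad16_c(const uint8_t *a, const uint8_t *b) {
    unsigned sum = 0;
    size_t i;
    for (i = 0; i < 16; i++)
        sum += (unsigned)(a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]);
    return sum;
}

#ifdef SKETCH_HAVE_SSE2
/* SSE2 intrinsics version of the same kernel. */
static unsigned sad16_sse2(const uint8_t *a, const uint8_t *b) {
    /* Unaligned loads, so callers need no alignment guarantee. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* psadbw yields two partial sums, one in the low 16 bits of
       each 64-bit half of the result. */
    __m128i s = _mm_sad_epu8(va, vb);
    return (unsigned)_mm_cvtsi128_si32(s) +
           (unsigned)_mm_extract_epi16(s, 4);
}
#endif

/* Compile-time dispatch; a real codec would more likely pick the
   routine at run time from CPU feature flags. */
static unsigned sad16(const uint8_t *a, const uint8_t *b) {
#ifdef SKETCH_HAVE_SSE2
    return sad16_sse2(a, b);
#else
    return sad16_c(a, b);
#endif
}
```

Calling sad16(a, b) on two 16-byte buffers should return the same
value as sad16_c(a, b) on any platform. Note that a NEON build would
still need its own variant next to sad16_sse2, which is the
maintenance cost point 3 refers to.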

So, if you can make it go faster, that would be interesting,
regardless of which representation we end up trying to maintain.

 -r

