[Theora-dev] MMX loop filter for theora-exp
ds at schleef.org
Wed Aug 24 15:55:03 PDT 2005
On Wed, Aug 24, 2005 at 02:53:09PM -0700, Ralph Giles wrote:
> On Thu, Aug 18, 2005 at 12:01:53PM -0600, Mike Melanson wrote:
> > Any chance of writing the ASM in Intel syntax and then compiling it
> > with GPL'd NASM under Linux?
> Note that this also works the other direction: we can continue to use
> gcc inline asm but cross/mingw compile win32/64 .obj files for the
> accelerated routines and ship that as part of the windows source.
> Has the advantage of being able to use the nicer inline syntax, as well
> as the vector extensions and SIMD intrinsics. On the other hand,
> installing a windows-target gcc is a lot harder than 'nasm -f win32'.
> Anybody know what kind of intrinsic support MSVC has? Is it possible to
> write a bridge layer in CPP?

Intrinsics are described by Intel and AMD about as well as the
instructions themselves, i.e., usable but barely adequate.
Nevertheless, there _is_ a spec that all compilers are supposed
to follow, and where there is confusion, IMO, what Intel's compiler
(ICC) does should be considered the standard. Microsoft's
information about MSVC is here:
MSVC likely has as good if not better support than GCC. Naturally,
one can use icc as well, although not for AMD extensions. :) (ICC
compiles alternate slow codepaths that only get run on AMD CPUs. No

For liboil, I'm trying to use intrinsics because they're more portable,
but GCC often generates really bad code for them, so I tend to end up
writing inline assembly anyway. I'd divide the blame evenly between myself
and GCC for the poor resulting code: I haven't taken the time to figure
out GCC's weirdisms, but then GCC a) has weirdisms and b) has some really
stupid ones, too. For liboil, the fastest code always wins, so there's no
loss in having both intrinsic and inline-asm code on Linux, and only
intrinsic (but faster) code on Windows.
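
To make the intrinsics approach concrete, here is a minimal sketch of one routine written once with intrinsics and once in plain C, selected at compile time. It is not taken from liboil or theora-exp: the function name add_saturate_u8 and the HAVE_SSE2_INTRINSICS macro are made up for illustration, and SSE2 is used instead of MMX purely as an assumption to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

/* Compile-time check: GCC and ICC define __SSE2__; MSVC defines _M_X64
 * or _M_IX86_FP >= 2. HAVE_SSE2_INTRINSICS is a hypothetical macro. */
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h> /* SSE2 intrinsics */
#define HAVE_SSE2_INTRINSICS 1
#endif

/* Hypothetical helper: dst[i] = min(255, a[i] + b[i]) for n bytes. */
static void add_saturate_u8(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    size_t i = 0;
#ifdef HAVE_SSE2_INTRINSICS
    /* 16 bytes per iteration; unaligned loads keep the sketch simple. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#endif
    /* Leftover bytes (or all of them, without SSE2) are handled in C. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

The same source should build with any compiler that ships emmintrin.h, and the scalar loop doubles as the fallback and as the handler for the tail of the buffer.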

Writing raw assembly is the least portable option across compilers, since
any number of things can differ, including calling conventions, PIC flags,
symbol names, instruction argument order, and assembler directives. That's
a mess I have chosen not to get involved in. Some of it could likely be
hidden by clever use of the C preprocessor, but at some point one loses
less hair by switching to Perl as a preprocessor. At that point, you can
do literally whatever you want with the instructions. But that adds Perl
as a build requirement, which is only slightly less annoying than
requiring GCC. (Also, I'm a big fan of mixing C code and assembly, since
corner cases are easier to handle in C.) (Also, IMO, raw assembly code is
the most likely to be crappily written, and thus eventually unmaintained.)

Converting between AT&T syntax and Intel syntax isn't too complicated
for a regexp master.
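
As a rough illustration of the simplest case of that conversion, the sketch below rewrites a two-register AT&T instruction such as "paddusb %mm1, %mm0" into Intel operand order. It uses sscanf rather than a regular expression, the function name att_to_intel is hypothetical, and it deliberately ignores everything that makes the real job harder (memory operands, immediates, size suffixes, directives).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: turn "op %src, %dst" (AT&T) into "op dst, src"
 * (Intel). Handles register-to-register forms only.
 * Returns 0 on success, -1 if the line does not match that form. */
static int att_to_intel(const char *att, char *out, size_t outsz)
{
    char op[16], src[16], dst[16];

    /* "%%" matches the literal '%' register prefix, which is dropped. */
    if (sscanf(att, " %15s %%%15[a-z0-9] , %%%15[a-z0-9]",
               op, src, dst) != 3)
        return -1;

    /* Intel syntax puts the destination first. */
    snprintf(out, outsz, "%s %s, %s", op, dst, src);
    return 0;
}
```

Called on "paddusb %mm1, %mm0" this produces "paddusb mm0, mm1"; anything that is not a plain two-register instruction is rejected rather than guessed at.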
Big Kitten LLC (http://www.bigkitten.com/) -- data acquisition on Linux