[vorbis-dev] optimizing float to int conversions

Monty xiphmont at xiph.org
Wed May 5 13:22:47 PDT 2004



On Wed, May 05, 2004 at 09:53:59PM +0200, Tal wrote:
> 1. I don't think that converting int to float is a problem.

Well it is, but less of one.

> 2. For converting float to int, do we want to depend on
> compiler optimizations?

"That is what they are there for."  If we are not to rely on
compilers, we should be writing everything in assembly.  Actually,
Alan Turing argued even assembly was too high level... he advocated
*only* using binary...

Still, it's unfortunate that compilers are providing poor optimization
results.  

...and don't forget... going through contortions in the C for one
compiler often makes things worse for other compilers (or what
improves things for GCC on one platform has a negative effect on
another; FPU opts are often like this).

>     Why do we need a genius compiler when there is a simple macro that
> does the trick?

Because the genius compiler is often the only way to manage
optimizations over a hundred platforms.  "Others have done this work
so we don't have to."

Do you really want to be hand scheduling an eight-pipeline, seventeen
stage processor with several hundred bypass conditions?  It's probably
necessary in some places, but hand-tuned code that runs in 10 cycles
on a Xeon may take a hundred on an Athlon (and vice-versa).

>     Moreover, I think that using the FPU for conversions even when you
> are not "blind" is slower than this simple integer algorithm.

Depends entirely on the processor/architecture.  It's not even
strictly true or false across all x86-compatible processors.  But I
will agree that the compiler alone is producing suboptimal results.
That I've witnessed first hand in the assembly from older GCCs.

Have you analyzed the assembly yet to see what the compiler is doing?
Saying 'fistp is slow!' misses the point when the real slowdown is the
previous ten non-vectorizable FPU setup statements.
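
For anyone who wants to check this themselves, a small stand-alone
kernel is easier to read in assembly than a whole codec.  The sketch
below is only an illustration: convert_block and its scale/clamp
choices are hypothetical and not taken from libvorbis.  Compiling it
with something like 'gcc -O2 -S' and reading the generated .s file
shows how much setup the compiler emits around the actual conversion.

```c
#include <stddef.h>

/* Hypothetical kernel, not taken from libvorbis: scale, clamp and
   round float samples to 16-bit integers using only portable C. */
void convert_block(const float *in, short *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float s = in[i] * 32768.0f;

        /* Clamp to the 16-bit range before converting. */
        if (s > 32767.0f)  s = 32767.0f;
        if (s < -32768.0f) s = -32768.0f;

        /* Round half away from zero; the cast itself truncates. */
        out[i] = (short)(s >= 0.0f ? s + 0.5f : s - 0.5f);
    }
}
```

Comparing the output across compilers, versions and -march settings
is the quickest way to see whether the conversion instruction or the
code around it dominates.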

Doing things by manipulating bitpatterns in assembly is often faster,
but not if the processor has to fetch a value off the FPU stack,
manipulate it, then put it right back for the next float operation.
That will also often result in lousy pipeline utilization as the whole
processor stalls.
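
For reference, one well-known bit-pattern technique looks roughly
like the sketch below.  It is not necessarily the macro proposed
earlier in this thread; the name float_to_int_magic is hypothetical.
It assumes IEEE 754 single precision, the default round-to-nearest
mode, and inputs in the range [-2^22, 2^22).  Adding 1.5 * 2^23
pushes the value into a range where one unit in the last place is
exactly 1, so the rounded integer ends up in the low mantissa bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float to int via the "magic number" trick.
   Assumes IEEE 754 floats, round-to-nearest, and |x| < 2^22. */
static int32_t float_to_int_magic(float x)
{
    const float magic = 12582912.0f;   /* 1.5 * 2^23 */
    float biased = x + magic;          /* FPU rounds x to an integer here */
    int32_t bits;

    /* Reinterpret the float's bit pattern without aliasing problems. */
    memcpy(&bits, &biased, sizeof bits);

    /* 0x4B400000 is the bit pattern of the magic constant itself. */
    return bits - 0x4B400000;
}
```

Note that ties round to even under these assumptions, so 2.5f gives 2
and 3.5f gives 4.  Note also that the value still has to leave the
FPU and come back as an integer, which is exactly the round trip
described above.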

Monty