[vorbis-dev] Another good optimization (for PPC only, though)

Timothy J. Wood tjw at omnigroup.com
Sun Nov 26 19:42:47 PST 2000



  Using the PPC frsqrte and fres instructions, I got the percentage time take in the smoothing code in _vp_compute_mask() down from 13.64% to 1.88% of the running time.

  In my local copy of Vorbis I have a fast_math.[hc] in vorbis/lib and have a _fast_sqrt() inline in fast_math.h.  If anyone else wants to try it out, it follows.

  I can currently encode my test file (the first 15 seconds of Supervixen by Garbage) in about 11.1 seconds.  Faster than real time, at least :)

-tim

#if defined(__ppc__)
// Vanilla PPC code, but since PPC has a reciprocal square root estimate instruction,
// runs *much* faster than calling sqrt().  We'll use two Newton-Raphson
// refinement steps to get bunch more precision in the 1/sqrt() value for very little cost.
// We'll then invert using the PPC reciprocal estimate instruction and also refine that
// (although we should only need a single refinement step there).
// This is about 8.8 times faster than sqrt() and according to my testing (not exhaustive)
// it returns fairly accurate results (error below 1.0e-5 up to 100000.0 in 0.1 increments).
// In practice in Vorbis, it seems to do even better (the biggest error I saw was down in
// the 1e-10 range).
static inline float _fast_sqrt(float x)
{
    const float half = 0.5;
    const float one  = 1.0;
    float B, y0, y1;
    
    // This'll NaN if it hits frsqrte
    if (x == 0.0)
        return x;
        
    B = x;

#ifdef __GNUC__            
    asm("frsqrte %0,%1" : "=f" (y0) : "f" (B));
#else
    y0 = __frsqrte(B);
#endif

    /* First refinement step */
    y1 = y0 + half*y0*(one - B*y0*y0);

    /* Second refinement step */
    y0 = y1;
    y1 = y0 + half*y0*(one - B*y0*y0);

    /* Now that we have a good estimate of 1/sqrt(x) we need to invert it.  We will also approximate this. */
    B = y1;
    
#ifdef __GNUC__            
    asm("fres %0,%1" : "=f" (y0) : "f" (B));
#else
    y0 = __fres(B);
#endif

    /* Do a Newton-Rhapson refinement step on the reciprocal estimate */
    y1 = y0 + y0*(1.0 - B*y0);

    return y1;
}
#else
static inline double _fast_sqrt(double x)
{
    return sqrt(x);
}
#endif

-tim

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.



More information about the Vorbis-dev mailing list