[vorbis-dev] Altivec-enabled libvorbis...

Ryan C. Gordon icculus at clutteredmind.org
Sun Oct 12 20:22:24 PDT 2003



Hey guys,

I just released my new MacOSX-based OpenAL implementation...part of it 
is an Ogg Vorbis decoder based on the 1.0 reference libraries. I spent 
some time optimizing them and found that many of the hotspots in 
libvorbis are perfect candidates for vectorization, so I wrote Altivec 
versions of them.

The end result? Decoding a .ogg file is between 30% and 50% faster than 
the stock reference libraries on a Mac with an Altivec unit...which 
doesn't suck. Decoding is still a little faster even without Altivec due 
to some other optimizations that don't involve vectorization.

I'm not putting together a patch (because, honestly, my changes aren't 
pretty), but if it's worth anything to anyone, the optimized libraries 
are here:

   http://cvs.icculus.org/horde/chora/cvs.php/osx/AL_EXT_vorbis?rt=al_osx

(or, to check it out from CVS:
   cvs -z9 -d:pserver:anonymous@cvs.icculus.org:/cvs/cvsroot co al_osx
    ...password is "anonymous").

libvorbis was a great candidate for Altivec because it does a ton of 
math on floating point numbers that almost always seem to align to 
16-byte offsets. Similar results are probably possible on x86 chips with 
the SSE instruction set (MMX, 3DNow, etc. too?).
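
To illustrate why the 16-byte alignment matters: a vector unit like Altivec handles four floats (16 bytes) at once, so a vectorized loop usually has a four-at-a-time body plus a scalar loop for the leftovers. Below is a rough, portable C sketch of that shape. The names is_aligned16() and mul_floats() are made up for illustration and are not from libvorbis or al_osx, and the "vector" body is plain C standing in for real Altivec/SSE instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p sits on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical example: dst[i] = a[i] * b[i].
   Handles four floats per step when every pointer is 16-byte aligned,
   then finishes whatever is left one element at a time. */
static void mul_floats(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    if (is_aligned16(dst) && is_aligned16(a) && is_aligned16(b)) {
        for (; i + 4 <= n; i += 4) {
            /* A real Altivec or SSE version would do these four
               multiplies with a single vector instruction. */
            dst[i + 0] = a[i + 0] * b[i + 0];
            dst[i + 1] = a[i + 1] * b[i + 1];
            dst[i + 2] = a[i + 2] * b[i + 2];
            dst[i + 3] = a[i + 3] * b[i + 3];
        }
    }

    /* Scalar tail, or the whole array if anything was unaligned. */
    for (; i < n; i++)
        dst[i] = a[i] * b[i];
}
```

When the buffers are almost always aligned, as they seem to be in libvorbis, the scalar path is rarely taken for more than a handful of trailing elements.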

There are one or two good optimization wins that resulted from code 
changes that have nothing to do with vectorization, too (moving branches 
and invariant code out of loops, forcing things into registers, etc).
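
As a generic illustration of that kind of change (this is not code from libvorbis; scale_naive() and scale_hoisted() are hypothetical names), here is the same loop before and after pulling a branch and an invariant multiply out of it:

```c
#include <stddef.h>

/* Before: the branch and the gain * 0.5f product are evaluated on
   every iteration, even though neither changes inside the loop. */
static void scale_naive(float *out, const float *in, size_t n,
                        float gain, int halve)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (halve)
            out[i] = in[i] * (gain * 0.5f);
        else
            out[i] = in[i] * gain;
    }
}

/* After: decide once and compute the factor once, keeping it in a
   local so the compiler can easily hold it in a register. */
static void scale_hoisted(float *out, const float *in, size_t n,
                          float gain, int halve)
{
    const float factor = halve ? gain * 0.5f : gain;
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = in[i] * factor;
}
```

A good optimizer will often do this on its own, but doing it by hand removes the guesswork and leaves a loop body that is simpler to vectorize.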

Overall, some .ogg files seem to spend a lot of time in 
vorbis_lsp_to_curve()...I assume these .oggs are from an older version of 
the vorbis spec, and such files tend to eat more CPU. There are some wins 
here from inlining the lookup table functions and using the frsqrte 
opcode instead of the invsqrt lookup table (memory access is a huge 
bottleneck on the Mac, so recalculating things is frequently faster than 
using a lookup table), but these files still lean towards the 30% speedup 
side of the field. Other .oggs (newer version?) seem to skip this 
function altogether and spend a lot of time in mdct.c, where most of 
the vectorization occurs...these files lean towards a 50% speedup.
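
For anyone unfamiliar with the idea of computing instead of looking up: frsqrte returns only a low-precision estimate of 1/sqrt(x), and a common way to get more accuracy from such an estimate is a Newton-Raphson step or two. The sketch below shows that refinement in portable C. It is an illustration of the general technique, not necessarily what the al_osx code does, and since frsqrte is PowerPC-only, the estimate is simply passed in by the caller. The function names are hypothetical.

```c
/* One Newton-Raphson step for y ~= 1/sqrt(x):
   y' = y * (1.5 - 0.5 * x * y * y) */
static float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Hypothetical wrapper: 'estimate' stands in for the value that an
   instruction like frsqrte would produce; 'steps' is how many
   refinement passes to run on it. */
static float rsqrt_from_estimate(float x, float estimate, int steps)
{
    float y = estimate;
    while (steps-- > 0)
        y = rsqrt_refine(x, y);
    return y;
}
```

Each step roughly squares the relative error, so a rough starting guess converges in a few multiplies with no memory access at all.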

Anyhow, if someone wanted to get these changes into a mainline 
libvorbis, they should:

- diff libogg and libvorbis from that CVS.
- Make sure the build system #defines MACOSX=1 and gcc is invoked with 
the -faltivec command line option (-O3, -ffast-math, and -falign-loops=16 
are huge helps, too).
- Change the _al_has_vector_unit() define in misc.h to point to a static 
variable in libvorbis and set that variable in a convenient place during 
initialization. Currently this variable exists in my AL implementation 
and not libvorbis. The code to detect an Altivec unit in MacOSX looks 
like this:

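      #include <CoreServices/CoreServices.h>

      /* Ask Gestalt for the PowerPC feature bits; on error the flag is
         left untouched. */
      long cpufeature = 0;
      OSErr err = Gestalt(gestaltPowerPCProcessorFeatures, &cpufeature);
      if (err == noErr) {
          /* gestaltPowerPCHasVectorInstructions is a bit index, not a mask. */
          if ((1 << gestaltPowerPCHasVectorInstructions) & cpufeature)
              VectorUnitDetected = 1;
      }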

  Using this code will need "-framework Carbon" on the gcc and ld 
command lines.

Non-Mac platforms should already have _al_has_vector_unit() #defined to 
be (0), and the *_vectorized functions are inlined stubs, so branches 
and functions should be optimized out...but ideally, they should get 
filled in with SSE/whatever code.
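
The arrangement described above can be sketched roughly like this. Only _al_has_vector_unit() and the MACOSX define come from the actual code; the flag name vector_unit_detected and the scale* functions are hypothetical stand-ins for the real *_vectorized routines, and the static flag is the suggested change rather than how al_osx currently does it.

```c
#include <stddef.h>

#if defined(MACOSX) && MACOSX
/* Assumption: a flag owned by the library, set once during init
   (for instance from the Gestalt check). */
static int vector_unit_detected = 0;
#define _al_has_vector_unit() (vector_unit_detected)
/* The real Altivec implementation would live in another file. */
void scale_vectorized(float *data, size_t n, float gain);
#else
/* Everywhere else the check is a compile-time constant 0... */
#define _al_has_vector_unit() (0)
/* ...and the vectorized entry point is an empty inlined stub, so the
   compiler can drop both the branch and the call. */
static inline void scale_vectorized(float *data, size_t n, float gain)
{
    (void)data; (void)n; (void)gain;
}
#endif

/* Plain C fallback, always available. */
static void scale_scalar(float *data, size_t n, float gain)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] *= gain;
}

/* Callers use this and never need to know which path ran. */
void scale(float *data, size_t n, float gain)
{
    if (_al_has_vector_unit())
        scale_vectorized(data, n, gain);
    else
        scale_scalar(data, n, gain);
}
```

Filling in an SSE version would then mean adding another branch of the #if with its own detection flag and a real scale_vectorized() body.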

Theoretically, besides the OSX-specific API for detecting the vector 
unit, the Altivec code should work on PowerPC Linux and other PPC-based 
OSes that use gcc (well, Apple's Altivec extensions...IBM's compiler and 
maybe CodeWarrior handle them, too).

If this is useful to anyone, feel free to grab it from my CVS...I don't 
plan to touch the code anymore unless something is really broken, but 
I'll answer any questions people have about it.

--ryan.
