[Flac-dev] Altivec Optimizations

Josh Coalson xflac at yahoo.com
Sun Apr 27 01:48:05 PDT 2003

--- Chris Csanady <cc at 137.org> wrote:
> Hi,
> I have been playing with Altivec, and I rewrote a couple of the routines
> in assembly.  Looking at the archives, I noticed that there may already
> be some effort on this.  Anyways...
> Right now, I have two routines working.  They need to be cleaned up, made
> relocatable, and documented; otherwise, they seem to work fairly well.  I
> see an overall ~27% speed improvement when encoding with the default
> settings, and greater at -8.
> The ones I have done are:
>  FLAC__lpc_compute_residual_from_qlp_coefficients_16_bit()
>  FLAC__lpc_compute_autocorrelation()
> I did make a change in stream_encoder.c to better align the data passed to
> FLAC__lpc_compute_residual_from_qlp_coefficients(), I hope this is ok.  Most
> occurrences of residual are replaced with residual+order as in:
>  FLAC__fixed_compute_residual(signal+order, residual_samples, order, 
>         residual+order);
>  ...
>  subframe->data.fixed.residual = residual+order;
> The vectors in Altivec must be 16 byte aligned, and it complicates things
> if signal[] and residual[] are not aligned.

Cool, I would appreciate any contributions you and/or Brady
come up with.

As for alignment, there are routines in memory.c to allocate
memory aligned to 32-byte boundaries.  This is turned on with
#define FLAC__ALIGN_MALLOC_DATA.  Currently it is only turned
on in configure.in for x86 CPUs but you can easily do it for
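
For anyone unfamiliar with the technique, here is a minimal sketch of
one common way to get aligned memory out of plain malloc(): allocate a
little extra and round the pointer up.  This is not the code in
memory.c; alloc_aligned() and is_aligned() are hypothetical names made
up for this example, and the sketch assumes the alignment is a power
of two.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, NOT the memory.c API.
 * Returns a pointer aligned to `alignment` bytes (must be a power of two).
 * The raw malloc() pointer is stored in *base_out; pass that one to free(). */
static void *alloc_aligned(size_t bytes, size_t alignment, void **base_out)
{
    void *base;
    uintptr_t addr;

    *base_out = NULL;
    if (bytes > SIZE_MAX - (alignment - 1))
        return NULL;                      /* size computation would overflow */

    /* Over-allocate so an aligned address always fits inside the block. */
    base = malloc(bytes + alignment - 1);
    if (base == NULL)
        return NULL;
    *base_out = base;

    /* Round the address up to the next multiple of `alignment`. */
    addr = ((uintptr_t)base + (alignment - 1)) & ~(uintptr_t)(alignment - 1);
    return (void *)addr;
}

/* Returns nonzero if p is a multiple of `alignment` (a power of two). */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

A 32-byte aligned buffer is also 16-byte aligned, so a buffer obtained
this way would satisfy the Altivec requirement mentioned above, as long
as any offset added to it later is itself a multiple of 16 bytes.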

An asm version of FLAC__lpc_restore_signal_asm should also give
a pretty good bang for the buck on the decoding side.
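
As a reference point for such a routine, here is a scalar sketch of
what the residual and restore loops compute, as I understand the
plain C versions.  The function names lpc_residual() and lpc_restore()
are made up for this example and the argument order is an assumption,
not libFLAC's actual signatures.  Both loops expect `order` samples of
history to sit just before the first sample they process.

```c
#include <stdint.h>

/* Illustrative names, NOT the libFLAC functions.
 * data[-order .. -1] must be valid: they are the warm-up samples.
 * Note: >> on a negative value is implementation-defined in C; it is an
 * arithmetic shift on the usual compilers, and both loops use it the
 * same way, so the round trip stays exact either way. */
static void lpc_residual(const int32_t *data, unsigned data_len,
                         const int32_t *qlp_coeff, unsigned order,
                         int shift, int32_t *residual)
{
    unsigned i, j;
    for (i = 0; i < data_len; i++) {
        const int32_t *history = data + i;  /* history[-1] is previous sample */
        int32_t sum = 0;
        for (j = 0; j < order; j++)
            sum += qlp_coeff[j] * history[-(int)j - 1];
        residual[i] = data[i] - (sum >> shift);  /* prediction error */
    }
}

/* Inverse of lpc_residual(): rebuilds data[] from residual[].
 * data[-order .. -1] must already hold the warm-up samples. */
static void lpc_restore(const int32_t *residual, unsigned data_len,
                        const int32_t *qlp_coeff, unsigned order,
                        int shift, int32_t *data)
{
    unsigned i, j;
    for (i = 0; i < data_len; i++) {
        const int32_t *history = data + i;
        int32_t sum = 0;
        for (j = 0; j < order; j++)
            sum += qlp_coeff[j] * history[-(int)j - 1];
        data[i] = residual[i] + (sum >> shift);  /* prediction + error */
    }
}
```

The restore loop depends on its own previous outputs, so each sample
needs the `order` samples before it to be finished first.  That serial
dependency is not present in the residual loop, where all the inputs
are known up front.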

