[Flac-dev] Altivec Optimizations
Josh Coalson
xflac at yahoo.com
Sun Apr 27 01:48:05 PDT 2003
--- Chris Csanady <cc at 137.org> wrote:
> Hi,
>
> I have been playing with Altivec, and I rewrote a couple of the routines
> in assembly. Looking at the archives, I noticed that there may already
> be some effort on this. Anyways...
>
> Right now, I have two routines working. They need to be cleaned up, made
> relocatable, and documented; otherwise, they seem to work fairly well. I
> see an overall ~27% speed improvement when encoding with the default
> settings, and greater at -8.
>
> The ones I have done are:
>
> FLAC__lpc_compute_residual_from_qlp_coefficients_16_bit()
> FLAC__lpc_compute_autocorrelation()
>
> I did make a change in stream_encoder.c to better align the data passed to
> FLAC__lpc_compute_residual_from_qlp_coefficients(), I hope this is ok. Most
> occurrences of residual are replaced with residual+order as in:
>
> FLAC__fixed_compute_residual(signal+order, residual_samples, order, residual+order);
> ...
> subframe->data.fixed.residual = residual+order;
>
> The vectors in Altivec must be 16 byte aligned, and it complicates things
> if signal[] and residual[] are not aligned.
Cool, I would appreciate any contributions you and/or Brady
come up with.

As for alignment, there are routines in memory.c that allocate
memory aligned to 32-byte boundaries. They are enabled by defining
FLAC__ALIGN_MALLOC_DATA. Currently this is only defined in
configure.in for x86 CPUs, but you can easily do the same for
PowerPC.
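
To illustrate the general idea of aligned allocation, here is a minimal sketch. It is not the code in memory.c: the helper name sketch_alloc_aligned_int32 and the SKETCH_ALIGNMENT constant are hypothetical, made up for this example. It over-allocates with malloc() and rounds the address up to the next 32-byte boundary, which also satisfies Altivec's 16-byte requirement because 32 is a multiple of 16.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constant for this sketch; must be a power of two. */
#define SKETCH_ALIGNMENT 32u

/*
 * Hypothetical helper (not FLAC's API): returns a pointer to `elements`
 * int32_t values aligned to SKETCH_ALIGNMENT bytes, or NULL on failure.
 * The pointer that must later be passed to free() is stored in *raw_out.
 */
static int32_t *sketch_alloc_aligned_int32(size_t elements, void **raw_out)
{
    unsigned char *raw;
    uintptr_t addr;

    /* Guard against overflow in the size computation below. */
    if (elements > (SIZE_MAX - (SKETCH_ALIGNMENT - 1)) / sizeof(int32_t))
        return NULL;

    /* Over-allocate so an aligned address always fits inside the block. */
    raw = malloc(elements * sizeof(int32_t) + (SKETCH_ALIGNMENT - 1));
    if (raw == NULL)
        return NULL;

    /* Round the address up to the next multiple of SKETCH_ALIGNMENT. */
    addr = ((uintptr_t)raw + (SKETCH_ALIGNMENT - 1))
           & ~(uintptr_t)(SKETCH_ALIGNMENT - 1);

    *raw_out = raw;
    return (int32_t *)addr;
}
```

The caller keeps both pointers: the aligned one for the sample data and the raw one for free(). Passing the aligned pointer to free() would be an error.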
An asm version of FLAC__lpc_restore_signal_asm should also give
a pretty good bang for the buck on the decoding side.
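
For reference, the scalar computation that such a decoder-side routine would speed up can be sketched roughly as below. This is a sketch of the standard linear-prediction reconstruction, not the library source: the function name sketch_lpc_restore_signal and the buffer layout (warm-up samples stored at the front of signal[]) are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch (not the libFLAC source).
 * signal[] holds `order` warm-up samples followed by room for `data_len`
 * reconstructed samples. Each new sample is the residual plus the
 * quantized linear prediction from the previous `order` samples.
 * Note: right-shifting a negative value is implementation-defined in C;
 * this sketch assumes an arithmetic shift, as common compilers provide.
 */
static void sketch_lpc_restore_signal(const int32_t residual[], size_t data_len,
                                      const int32_t qlp_coeff[], size_t order,
                                      int lp_quantization, int32_t signal[])
{
    size_t i, j;

    for (i = 0; i < data_len; i++) {
        int64_t sum = 0; /* 64-bit accumulator to avoid signed overflow */

        /* qlp_coeff[0] multiplies the most recent sample. */
        for (j = 0; j < order; j++)
            sum += (int64_t)qlp_coeff[j] * signal[order + i - j - 1];

        signal[order + i] = residual[i] + (int32_t)(sum >> lp_quantization);
    }
}
```

Each output sample depends on the ones just written, so the outer loop is inherently serial; a vectorized version would most likely parallelize the inner multiply-accumulate over the coefficients.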
Josh