[vorbis-dev] mdct_backward with fused muladd?

David Etherton etherton at rockstarsandiego.com
Tue May 20 15:12:16 PDT 2003



Can anybody point me at any resources that would explain how to optimize
mdct_backward for a CPU with a fused multiply-accumulate unit?

From what I understand from responses to my older postings, Tremor's
mdct_backward could be rewritten to take advantage of a muladd.

My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64)
integer muladd or eight-wide 16x16 + Accum(32) -> Accum(32) integer muladd
or four-wide single-precision floating-point muladd.
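
To make the 32x32 + Accum(64) case concrete, here is a rough sketch of how
a single rotation (two products summed per output, as in an MDCT butterfly)
maps onto one multiply followed by one multiply-accumulate, with a single
shift at the end. This is only an illustration in portable C: the name
rotate_q31 and the Q31 scaling are assumptions, not Tremor's actual code,
and it assumes the compiler uses an arithmetic right shift on signed values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate (a, b) by the twiddle pair (t, v), all Q31.
 *   x = a*t + b*v
 *   y = b*t - a*v
 * Each output sums two 32x32 products in a 64-bit accumulator and shifts
 * once, which is the pattern a 32x32 + Accum(64) unit executes directly.
 * The caller is assumed to leave enough headroom in a and b that the
 * shifted result still fits in 32 bits. */
static void rotate_q31(int32_t a, int32_t b, int32_t t, int32_t v,
                       int32_t *x, int32_t *y)
{
    int64_t acc;

    /* Keeps the sum of two products inside the 64-bit accumulator. */
    assert(t != INT32_MIN && v != INT32_MIN);

    acc  = (int64_t)a * t;      /* multiply            */
    acc += (int64_t)b * v;      /* multiply-accumulate */
    *x = (int32_t)(acc >> 31);

    acc  = (int64_t)b * t;      /* multiply            */
    acc -= (int64_t)a * v;      /* multiply-subtract   */
    *y = (int32_t)(acc >> 31);
}
```

On the two-wide unit, the x and y accumulators could presumably be computed
side by side, but that depends on the instruction set and is not shown here.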

The Tremor code seems much cleaner and more portable to consoles than the
stock version (no double-precision math routines, compiles more or less
out of the box with a C++ compiler), but I can afford an int-to-float
conversion if necessary.

What values of 'n' does mdct_backward typically get called with?  Should it
be pretty simple to guarantee proper alignment of the input buffers to a
16-byte boundary?  Can I get away with 16x16 multiplies without too much
audio degradation?

I would also be better off without a big sincos LUT, as Segher Boessenkool
pointed out back in March.

Thanks again.  Just to show that I'm not a total leech, here's a slightly
faster (at least on the PS2) version of bitrev12 that doesn't use any luts
(thanks to http://aggregate.org/MAGIC/)

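STIN int bitrev12(int x){
  /* swap adjacent bits */
  x = ((x & 0xaaa) >> 1) | ((x & 0x555) << 1);
  /* swap adjacent bit pairs; each nibble is now reversed */
  x = ((x & 0xccc) >> 2) | ((x & 0x333) << 2);
  /* swap the outer nibbles; the middle nibble stays put */
  x = ((x & 0xf00) >> 8) | (x & 0x0f0) | ((x & 0x00f) << 8);
  return x;
}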
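
One way to sanity-check the routine is to compare it against a naive
bit-at-a-time loop over all 4096 possible inputs. The harness below is a
sketch: bitrev12_ref and bitrev12_mismatches are hypothetical names, and
STIN is replaced by plain `static` on the assumption that it is a
static-inline macro, so that the file compiles outside Tremor.

```c
#include <assert.h>

/* Copy of the routine above, with STIN replaced by `static`. */
static int bitrev12(int x)
{
    x = ((x & 0xaaa) >> 1) | ((x & 0x555) << 1);
    x = ((x & 0xccc) >> 2) | ((x & 0x333) << 2);
    x = ((x & 0xf00) >> 8) | (x & 0x0f0) | ((x & 0x00f) << 8);
    return x;
}

/* Hypothetical reference: reverse the low 12 bits one bit at a time. */
static int bitrev12_ref(int x)
{
    int r = 0, i;

    assert(x >= 0 && x < 4096);     /* only 12-bit inputs are meaningful */
    for (i = 0; i < 12; i++) {
        r = (r << 1) | (x & 1);     /* move the lowest bit of x into r */
        x >>= 1;
    }
    return r;
}

/* Returns the number of 12-bit inputs on which the two versions disagree. */
static int bitrev12_mismatches(void)
{
    int x, bad = 0;

    for (x = 0; x < 4096; x++)
        if (bitrev12(x) != bitrev12_ref(x))
            bad++;
    return bad;
}
```

A return value of 0 from bitrev12_mismatches() means the two versions agree
on every 12-bit input.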

-Dave



