[vorbis-dev] mdct_backward with fused muladd?

Michael Smith msmith at xiph.org
Tue May 20 16:34:16 PDT 2003



On Wednesday 21 May 2003 08:12, David Etherton wrote:
> Can anybody point me at any resources that would explain how to optimize
> mdct_backward for a cpu with a fused multiply-accumulate unit?

MDCT optimisation is not my area of expertise, but I'll give some other advice 
anyway...

>
> From what I understand from responses to my older postings, Tremor's
> mdct_backward could be rewritten to take advantage of a muladd.
>
> My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64)
> integer muladd or eight-wide 16x16 + Accum(32) -> Accum(32) integer muladd
> or four-wide single-precision floating-point muladd.
>
> The tremor code seems to be much cleaner and more portable than the stock
> version for consoles (no double-precision math routines, compiles more or
> less out-of-the-box on a C++ compiler) but I can afford an int-to-float if
> necessary.

Well... it's _only_ a decoder, whereas the stock version includes the encoder. 
This naturally makes it a lot simpler - most of the complexities are in the 
encoder (for example, no double precision floats are needed for the decoder).

>
> What values of 'n' does mdct_backward typically get called with?  Should it
> be pretty simple to guarantee proper alignment of the input buffers to a
> 16-byte boundary?  Can I get away with 16x16 multiplies without too much
> audio degradation?

I think powers of 2 from 64 to 8192 are allowed, and the most common will 
be 128 (or 256) and 2048 (or 4096 at very low bitrates). I'd have to check 
those, though. Alignment should be simple enough to guarantee. 16x16 
multiplies probably won't give acceptable audio quality.

>
> I also would be better off without a big sincos lut as pointed out by
> Segher Boessenkool back in March.

If this is because of the memory usage of the luts, you may be interested in 
looking at the Tremor 'lowmem-branch' in CVS. I'm told (I haven't tried it 
myself) that it uses about an order of magnitude less memory (heap+stack). 
That comes at a cost of marginally higher cpu usage (10-15%?), but that 
might be a worthwhile tradeoff on a console.

Mike

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
