[vorbis-dev] faster mdct's

John Ripley jripley at rioaudio.com
Tue Jun 3 03:23:32 PDT 2003



> From: Michael Smith [mailto:msmith at xiph.org]
> To: vorbis-dev at xiph.org
> Subject: Re: [vorbis-dev] faster mdct's
> 
> On Sunday 01 June 2003 13:43, Steven G. Johnson wrote:
> > Hello Vorbis folks,
> >
> > I'm one of the FFTW authors (www.fftw.org), and a few days ago I was
> > playing with our codelet generator for fun and modified it to spit out
> > hard-coded MDCTs of small sizes.  The code (at
> > jdj.mit.edu/~stevenj/mdct_128nr.c) for 256 samples (128 outputs) seems
> > to be almost twice as fast as the Vorbis MDCT code for that size on my
> > 2.2GHz P-IV (gcc 3.2.2 and flags "-O1 -mcpu=pentium4
> > -fomit-frame-pointer -fstrict-aliasing -malign-double"), in single
> > precision.
....
> > I'm not sure how much you care about MDCT performance (what 
> > fraction of CPU time is it?), but I thought you might find
> > this interesting anyway.

> Steven,
> 
> Since nobody else has answered you, I thought I should say something.
> 
> MDCT performance is insignificant on encode, but takes a substantial
> (not sure what percentage, but it's non-trivial) amount of time on
> decode. However, decode is not really a performance problem on
> 'desktop-class' CPUs - it only really matters a lot for embedded use
> (either things like consoles - some people are quite interested in
> increasing performance on the PS2, for example - or portable players,
> though in the latter case floating point hardware is an unheard-of
> luxury, so this isn't directly relevant here).

Still, it does have an application if the code generator could be modified
to use integer-only instructions on other architectures (e.g. ARM, or a DSP
like the 56k). Saving 50% in the MDCT would translate to something like
(ballpark) 20% more battery life on portables, for example. My only concern
there is the size of the generated code - portables tend to have little
memory available for both code and data.
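
To make "integer-only" concrete, here is a minimal sketch of the kind of
fixed-point arithmetic a generated MDCT would need in place of float
multiplies. The names mul_q31 and rotate_q31 and the Q31 coefficient format
are assumptions chosen for illustration; this is not taken from the FFTW
generator or from any existing Vorbis decoder.

```c
#include <stdint.h>

/* Hypothetical helper: multiply a sample by a Q31 coefficient, i.e. a
 * value in [-1, 1) stored as an integer scaled by 2^31.  The 64-bit
 * product is shifted back down so the result has the scale of x.
 * Assumes >> on a negative value is an arithmetic shift, which holds
 * on common compilers but is implementation-defined in C. */
static int32_t mul_q31(int32_t x, int32_t coeff_q31)
{
    return (int32_t)(((int64_t)x * coeff_q31) >> 31);
}

/* Hypothetical helper: one plane rotation, the basic step of an MDCT
 * butterfly: (re, im) <- (re*c - im*s, re*s + im*c), with c and s
 * given in Q31.  Four integer multiplies, no floating point. */
static void rotate_q31(int32_t *re, int32_t *im, int32_t c, int32_t s)
{
    int32_t r = mul_q31(*re, c) - mul_q31(*im, s);
    int32_t i = mul_q31(*re, s) + mul_q31(*im, c);
    *re = r;
    *im = i;
}
```

A generator targeting ARM or a DSP would presumably emit straight-line
sequences of such multiplies with the coefficients baked in as integer
constants; rounding and overflow headroom are ignored in this sketch.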

Block sizes 256 and 2048 are the most common but not the only ones
encountered, so technically you would need an MDCT for all sizes from 64 to
8192. It would be interesting if you could avoid this by generating the code
on the fly, but then the code generator would have to be smaller than the
total of all MDCTs :)
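
As a rough illustration of the code-size trade-off: Vorbis block sizes are
powers of two, so covering 64 to 8192 means eight hard-coded transforms. A
decoder could keep a small table and fill in only the sizes it can afford,
falling back to a generic MDCT for the rest. The names mdct_fn, mdct_slot,
mdct_table and mdct_lookup below are hypothetical, not part of libvorbis.

```c
#include <stddef.h>

#define MDCT_MIN_LOG2  6                 /* block size 64   */
#define MDCT_MAX_LOG2  13                /* block size 8192 */
#define MDCT_NUM_SIZES (MDCT_MAX_LOG2 - MDCT_MIN_LOG2 + 1)

/* Hypothetical signature for one hard-coded transform. */
typedef void (*mdct_fn)(const float *in, float *out);

/* Map a block size to a table slot 0..7, or -1 if the size is not a
 * power of two in the range 64..8192. */
static int mdct_slot(unsigned n)
{
    int log2n = 0;
    if (n == 0 || (n & (n - 1)) != 0)
        return -1;
    while ((n >> log2n) > 1)
        log2n++;
    if (log2n < MDCT_MIN_LOG2 || log2n > MDCT_MAX_LOG2)
        return -1;
    return log2n - MDCT_MIN_LOG2;
}

/* One entry per size; a NULL entry means "no generated code for this
 * size, use the generic MDCT instead". */
static mdct_fn mdct_table[MDCT_NUM_SIZES];

static mdct_fn mdct_lookup(unsigned n)
{
    int slot = mdct_slot(n);
    return slot < 0 ? NULL : mdct_table[slot];
}
```

Populating only the slots for 256 and 2048 would cover the common case
while keeping the generated code small.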

So, how hard would it be to get your code generator integer-only and
non-x86? It would certainly save a hell of a lot of hand optimisation.

- John Ripley