[vorbis-dev] faster mdct's
Steven G. Johnson
stevenj at ab-initio.mit.edu
Sat May 31 20:43:06 PDT 2003
Hello Vorbis folks,
I'm one of the FFTW authors (www.fftw.org), and a few days ago I was
playing with our codelet generator for fun and modified it to spit out
hard-coded MDCTs of small sizes. The code (at
jdj.mit.edu/~stevenj/mdct_128nr.c) for 256 samples (128 outputs) seems to
be almost twice as fast as the Vorbis MDCT code for that size on my 2.2GHz
P-IV (gcc 3.2.2 and flags "-O1 -mcpu=pentium4 -fomit-frame-pointer
-fstrict-aliasing -malign-double"), in single precision.
I guess the other size that is important to you is 2048 samples; for that
size you definitely don't want a hard-coded routine (the linked 256-sample
codelet is just inside the crossover-point for hard-coding, in our
experience). There, you probably want a recursive/looping algorithm,
albeit probably only two stages (e.g. one radix-32 step). FFTW 3 contains
code for a DCT-IV of arbitrary sizes (it works by pre/post-processing a
real-input FFT of half the size) that can be fairly trivially adapted to
the MDCT (which is just a DCT-IV with some input aliasing). I tried that
this evening (see attached mdct.[ch]), but it's only about 30% faster than
the Vorbis MDCT for 2048 samples, although the advantage increases for
larger sizes (e.g. 60% faster for 128k samples). It could be made
substantially more efficient by generating special purpose codelets to
avoid separate pre/post-processing passes...we know our DCT-IV code is not
optimal. (It also doesn't use SIMD.)
The above two codes compute an unwindowed MDCT, and give the same results
as the one in your mdct.c, but I can also easily make one with a window
function built in (to avoid the separate pass/loads).
I'm not sure how much you care about MDCT performance (what fraction of
CPU time is it?), but I thought you might find this interesting anyway.
Regards,
Steven G. Johnson
PS. Hi Monty! You probably don't remember me, but I knew you from ESG at
MIT when we were undergrads (I think you were a year ahead of me?), around
'92. I remember you talking back then about founding a company around
some audio processing software you were working on (some kind of
programmable filtering tool? I forget).
PPS. I wrote up a few notes on the MDCT for Wikipedia
(http://www.wikipedia.org/wiki/Modified_discrete_cosine_transform), with
some help from Frank Klemm of LAME regarding their use in various codecs.
Maybe someone here will find the page useful (or have corrections).
PPPS. I'll be out of town, probably with no email, until Wednesday.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdct.c
Type: text/x-csrc
Size: 7280 bytes
Desc: mdct.c
Url : http://lists.xiph.org/pipermail/vorbis-dev/attachments/20030531/cf90b139/mdct-0001.c
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mdct.h
Type: text/x-chdr
Size: 1276 bytes
Desc: mdct.h
Url : http://lists.xiph.org/pipermail/vorbis-dev/attachments/20030531/cf90b139/mdct-0001.h
More information about the Vorbis-dev
mailing list