[vorbis-dev] _LOW_ACCURACY_ good enough?
David Etherton
etherton at rockstarsandiego.com
Fri May 23 08:13:04 PDT 2003
I spent a fair amount of time optimizing tremor for the PS2, mostly by using dual-pipe multiplies in the X[N]PRODnn and window-apply code. Then, just for kicks, I re-enabled _LOW_ACCURACY_, and lo and behold, it was still substantially faster. I also got some gains out of tremor by changing the longs in cookbook and sharedbook to ogg_int32_t, as I did for vorbis.
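To make the trade-off concrete, here is a minimal sketch of a full-precision fixed-point multiply next to a reduced-precision one. The function names and the 9/23-bit split are illustrative assumptions, not a copy of tremor's source: the idea is only that both operands are pre-shifted so the product fits in a single 32-bit multiply.

```c
#include <stdint.h>

/* Full-precision path: 32x32 -> 64-bit product, keep the high 32 bits. */
static int32_t mult32_full(int32_t x, int32_t y)
{
    return (int32_t)(((int64_t)x * (int64_t)y) >> 32);
}

/* Hypothetical table-build step: keep only the top 9 bits of a coefficient,
 * so it can be stored (and cached) in far fewer bits.
 * Note: right-shifting a negative value is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift. */
static int32_t preshift_coeff(int32_t y)
{
    return y >> 23;
}

/* Reduced-precision path: drop the low 9 bits of the sample, then do one
 * 32-bit multiply. The two shifts total 32 bits (9 + 23), so the result
 * has the same scale as mult32_full(), with a small truncation error. */
static int32_t mult32_low(int32_t x, int32_t y_pre)
{
    return (x >> 9) * y_pre;
}
```

With x = 0x12345678 and y = 0x40000000, the two paths differ by a few tens of units at 32-bit scale, which is well below one lsb once the result is reduced to sixteen bits.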
I think _LOW_ACCURACY_ is a win mostly because I'm entirely cache-bound in mdct_backward. But my question is: is it good enough for production work? I diffed a few output files, and the largest differences in the 16-bit data were no more than one or two LSBs. Are there any situations where it fails spectacularly that I should watch out for?
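For anyone who wants to repeat that comparison, a minimal sketch of the check follows. It assumes both decodes have already been loaded as equal-length arrays of 16-bit samples; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return the largest absolute per-sample difference between two
 * 16-bit PCM buffers of n samples each. The subtraction is done in
 * int so that extremes like -32768 vs 32767 cannot overflow. */
static int max_abs_diff16(const int16_t *a, const int16_t *b, size_t n)
{
    int worst = 0;
    for (size_t i = 0; i < n; i++) {
        int d = abs((int)a[i] - (int)b[i]);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A result of 1 or 2 corresponds to the "one or two lsb's" case; a much larger value on some test file would point at the kind of spectacular failure being asked about.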
Looking at the lowmem-branch code of tremor, it looks like it downsamples to 16 bits as the last step of mdct_backward (which makes a lot of sense), but some of the CVS comments lead me to believe that _LOW_ACCURACY_ doesn't work there yet. I'm hoping lowmem-branch will be an even bigger win for me, because I can afford CPU time more easily than cache misses (or memory, for that matter).
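As a rough illustration of what such a final downsampling step involves, here is a sketch that rounds a fixed-point sample to 16 bits and saturates it. The helper name and the frac_bits parameter are assumptions for illustration; this is not the lowmem-branch code.

```c
#include <stdint.h>

/* Hypothetical helper: convert a fixed-point sample carrying frac_bits
 * fractional bits (frac_bits >= 1) to 16-bit PCM, rounding to nearest
 * and clipping to the int16_t range. The intermediate is 64-bit so the
 * rounding offset cannot overflow. Assumes arithmetic right shift for
 * negative values, which is implementation-defined in C. */
static int16_t to_pcm16(int32_t sample, int frac_bits)
{
    int64_t rounded =
        ((int64_t)sample + ((int64_t)1 << (frac_bits - 1))) >> frac_bits;
    if (rounded > 32767)
        return 32767;
    if (rounded < -32768)
        return -32768;
    return (int16_t)rounded;
}
```

Doing this inside the transform's last pass means the output buffer is half the size of a 32-bit one, which matters when the decoder is bound by cache misses rather than by arithmetic.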
(For what it's worth, my current test case runs in about 630M cycles in tremor with _LOW_ACCURACY_; without _LOW_ACCURACY_ but with the other PS2-specific optimizations, it runs in 710M cycles; vorbis with some minor floating-point-muladd optimizations and the longs changed to ogg_int32_t runs in about 800M cycles -- I suspect the huge trig lookups in mdct are killing me there.)
-Dave