[Theora-dev] FPGA implementation/ players speed?
Timothy B. Terriberry
tterribe at vt.edu
Wed Mar 9 12:07:49 PST 2005
Some more data points, on a 3.06 GHz P4.
Pure C experimental decoder
With Rudolf's recent MMX patches (http://ssh.cz/~ruik/patch_theora):
That's commensurate with the 11% speedup numbers he reported, and is
getting pretty close to real-time. Unlike the VP3HoSwiYo patches, it
does not include an MMX iDCT, so there is room yet for improvement. I'm
reasonably confident that we should be able to get to real-time decoding
at that resolution on this hardware.
The mainline decoder on the same machine:
So pure C optimizations give you about 9%, and the current MMX
optimizations another 11%. Note that all tests were with
CFLAGS=-O2 -fforce-addr -fomit-frame-pointer -finline-functions
With default CFLAGS (just -O2), the mainline takes 11.447 seconds for
decoding alone. So the compiler flags alone can make almost a 50% difference.
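For anyone comparing their own timings against these percentages, here is a minimal sketch of the two usual ways to turn a pair of wall-clock times into a percentage. The function names are made up for illustration, and I am not claiming either one is the convention behind the figures quoted above; the two diverge quickly as the gap grows, so it is worth saying which one you mean.

```c
#include <assert.h>

/* Hypothetical helpers, not part of libtheora or any decoder. */

/* Percentage of run time saved when a decode that took old_seconds
 * now takes new_seconds, measured relative to the old time. */
double percent_time_saved(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0 && new_seconds > 0.0);
    return 100.0 * (old_seconds - new_seconds) / old_seconds;
}

/* Percentage increase in throughput (e.g. frames per second) for the
 * same clip, measured relative to the new time. */
double percent_faster(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0 && new_seconds > 0.0);
    return 100.0 * (old_seconds - new_seconds) / new_seconds;
}
```

With made-up timings of 10.0 seconds before and 8.0 seconds after, the first function gives 20 (20% less time) and the second gives 25 (25% more frames per second).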