[theora-dev] [OT] Just saying hi!
chl at math.uni-bonn.de
Thu Feb 27 09:56:24 PST 2003
On Thu, 27 Feb 2003, Mike Melanson wrote:
> As for IDCT stuff, would it be reasonable to prefetch coeff. data
> for block n+1 before transforming block n? For the DC's SH-4:
> get address of next block
> prefetch addr+0 ; prefetching gets 32-byte blocks
> prefetch addr+32 ; there are 8x8 coeffs * 2 bytes/coeff = 128 bytes
> prefetch addr+64
> prefetch addr+96
> idct(current block)
Well, sure, this is sufficient as long as the IDCT takes more cycles than
the memory prefetch (it should). If the data is already in cache, the
prefetches are just a few wasted instructions. It might be better to
distribute the prefetches over the routine, so that they can be paired
with other instructions and execute in parallel. I don't know, I'm not
familiar with the architecture of these modern CPUs...
But for Dreamcast there should be documentation about how and where to
prefetch, since it's essential for game programming.
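
A rough sketch in C of the "prefetch block n+1, then transform block n"
idea from the quoted message. Everything here is an assumption for
illustration only: __builtin_prefetch is a GCC/Clang builtin (whether it
becomes a real prefetch instruction on SH-4 depends on compiler and
target), CACHE_LINE is the 32-byte figure quoted above, and idct_stub is
a hypothetical stand-in that merely doubles each coefficient, not a real
IDCT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_COEFFS 64   /* 8x8 coefficients per block */
#define CACHE_LINE   32   /* assumed line size, taken from the quoted message */

/* Hypothetical stand-in for the real IDCT: doubles every coefficient. */
static void idct_stub(int16_t *blk)
{
    for (size_t i = 0; i < BLOCK_COEFFS; i++)
        blk[i] = (int16_t)(blk[i] * 2);
}

/* Transform nblocks contiguous blocks, prefetching block n+1
 * before working on block n. */
static void idct_all_prefetched(int16_t *coeffs, size_t nblocks)
{
    assert(coeffs != NULL || nblocks == 0);

    for (size_t n = 0; n < nblocks; n++) {
        int16_t *cur = coeffs + n * BLOCK_COEFFS;

        if (n + 1 < nblocks) {
            const char *next = (const char *)(cur + BLOCK_COEFFS);
            /* 64 coeffs * 2 bytes = 128 bytes = 4 lines of 32 bytes */
            for (size_t off = 0; off < BLOCK_COEFFS * sizeof(int16_t);
                 off += CACHE_LINE)
                __builtin_prefetch(next + off, 0, 3);
        }

        idct_stub(cur);
    }
}
```

This issues all four prefetches up front, as in the quoted pseudocode.
Spreading them through the transform itself would need hand-scheduled
code and is not shown here.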
Btw., on my Athlon XP, our fastest 8x8 DCT uses extended 3DNow!; it
takes about 256 cycles (yes, we do have some ASM gurus ;-) when operating
in L1 cache. So this is indeed about the time of a cache miss, but on
another platform it might be different.
An interesting tool for measuring memory latency is "Calibrator". It might
be good to check the numbers for your architecture before planning data
structures.
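
If you just want a quick number yourself, a dependent-load
("pointer-chasing") loop gives a rough latency estimate. The sketch below
is not Calibrator's code and I am not claiming it matches its method; the
function names are made up, clock() is coarse, and rand() is a poor
random source, so treat the result as a ballpark figure only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Fill next[] with one random cycle through all n slots
 * (Sattolo's algorithm), so every load depends on the previous one. */
static void build_cycle(size_t *next, size_t n)
{
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;          /* j in [0, i-1] */
        size_t t = next[i];
        next[i] = next[j];
        next[j] = t;
    }
}

/* Follow the chain for the given number of steps, starting at slot 0. */
static size_t chase(const size_t *next, size_t steps)
{
    size_t p = 0;
    for (size_t i = 0; i < steps; i++)
        p = next[p];
    return p;
}

/* Approximate nanoseconds per dependent load for a working set of
 * n slots; returns a negative value if allocation fails. */
static double ns_per_load(size_t n, size_t steps)
{
    assert(n > 0 && steps > 0);

    size_t *next = malloc(n * sizeof *next);
    if (next == NULL)
        return -1.0;
    build_cycle(next, n);

    clock_t t0 = clock();
    volatile size_t sink = chase(next, steps);  /* keep the loop alive */
    clock_t t1 = clock();
    (void)sink;

    free(next);
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / (double)steps;
}
```

Call ns_per_load() once with a working set that fits in L1 and once with
one much larger than the last cache level; the difference between the two
results approximates the cost of a miss.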