[theora-dev] [OT] Just saying hi!

Mike Melanson melanson at pcisys.net
Thu Feb 27 09:00:34 PST 2003

On Thu, 27 Feb 2003, Christoph Lampert wrote:

> I rather meant that prefetch instructions have to be executed at the right
> time in order to have the memory available at the right time. If executed
> too late, memory latency isn't lowered enough. If the cache is small and
> prefetch executed too early, data is already overwritten before it can be
> used. Thus the right position depends very much on memory and CPU
> architecture. But a console with fixed architecture is of course a
> great benefit for these questions. :) 

        Just for a quick brainstorm (it's fun to think about
optimizations; they're like little puzzles), I thought about maintaining a
separate array for DC coefficients for prediction, per your suggestion.
But now I seriously think that they could be predicted as they are
decoded. The coefficients necessary for prediction have already been
unpacked when it is time to unpack a given DC coeff. These 2 tasks
(unpacking DC, then predicting) could probably be effectively combined
(not done in the original VP3 source).
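A minimal sketch of that combined pass, in C. It assumes the DC terms arrive
in raster order and uses a deliberately simplified predictor (left neighbor,
else the one above, else 0), which is NOT the actual VP3 predictor. The names
dc_reader, read_dc_diff, predict_dc and unpack_and_predict_dc are hypothetical
and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the bitstream reader: hands out DC
 * differentials from an array that has already been entropy-decoded. */
typedef struct {
    const int16_t *tokens;
    size_t pos;
} dc_reader;

static int16_t read_dc_diff(dc_reader *r)
{
    return r->tokens[r->pos++];
}

/* Simplified predictor (an assumption, not the VP3 rule): use the left
 * neighbor if there is one, else the block above, else 0. Both neighbors
 * were reconstructed earlier in the same pass. */
static int16_t predict_dc(const int16_t *dc, int x, int y, int w)
{
    if (x > 0)
        return dc[y * w + x - 1];
    if (y > 0)
        return dc[(y - 1) * w + x];
    return 0;
}

/* Unpack each DC differential and apply prediction immediately, so no
 * second pass over a separate DC array is needed. */
static void unpack_and_predict_dc(dc_reader *r, int16_t *dc, int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int16_t diff = read_dc_diff(r);
            dc[y * w + x] = (int16_t)(predict_dc(dc, x, y, w) + diff);
        }
    }
}
```

The point is only the loop structure: each DC value is final as soon as it is
written, so later blocks can predict from it without a separate pass.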

        As for IDCT stuff, would it be reasonable to prefetch coeff. data
for block n+1 before transforming block n? For the Dreamcast's SH-4:

  get address of next block
  prefetch addr+0   ; prefetching gets 32-byte blocks
  prefetch addr+32  ; there are 8x8 coeffs * 2 bytes/coeff = 128 bytes
  prefetch addr+64
  prefetch addr+96
  idct(current block)

This also assumes that the address is aligned on a 32-byte boundary,
another little detail.
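The same idea as a C sketch, assuming a GCC-compatible compiler, whose
__builtin_prefetch is only a hint (whether it becomes a real prefetch
instruction depends on the target). CACHE_LINE, prefetch_block,
idct_placeholder and transform_blocks are hypothetical names, and the
placeholder is not a real IDCT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE   32   /* bytes fetched per prefetch, per the text above */
#define BLOCK_COEFFS 64   /* 8x8 coefficients, 2 bytes each = 128 bytes */

/* Issue one prefetch hint per cache line of a coefficient block.
 * The block is assumed to start on a 32-byte boundary. */
static void prefetch_block(const int16_t *block)
{
    const char *p = (const char *)block;
    assert(((uintptr_t)p % CACHE_LINE) == 0);
    for (size_t off = 0; off < BLOCK_COEFFS * sizeof(int16_t); off += CACHE_LINE)
        __builtin_prefetch(p + off, 0, 3);   /* read access, keep in cache */
}

/* Placeholder for the real transform: spreads a scaled DC term over the
 * block so the loop below has something observable to do. */
static void idct_placeholder(int16_t *block)
{
    int16_t v = (int16_t)(block[0] >> 3);
    for (int i = 0; i < BLOCK_COEFFS; i++)
        block[i] = v;
}

/* Prefetch block n+1, then transform block n. */
static void transform_blocks(int16_t (*blocks)[BLOCK_COEFFS], int count)
{
    for (int n = 0; n < count; n++) {
        if (n + 1 < count)
            prefetch_block(blocks[n + 1]);
        idct_placeholder(blocks[n]);
    }
}
```

Whether four hints issued right before the transform arrive in time, or
evict data that is still needed, has to be measured on the actual hardware.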


	-Mike Melanson
