Here <a href="http://lists.xiph.org/pipermail/theora/2004-September/000619.html">http://lists.xiph.org/pipermail/theora/2004-September/000619.html</a> Andrey describes the encoder structure as follows:<div><span class="Apple-style-span" style="font-family: Arial; font-size: medium; "><pre>
"I see the following structure of the compressor implemented in the FPGA
(Xilinx Spartan 3 1000K gates):
1. Data from the external frame buffer (FB) memory goes to the
Bayer-to-YCbCr (4:2:0) converter in overlapping 20x20 tiles that produce 6
8x8 blocks (one macroblock) on the output.
2. Corresponding 6 blocks from the previous frame are fetched from the
same FB in parallel, subtracted from the new frame (if it is not a golden)
and processed by the DCT and quantizator.
3. After the quantizator data in one branch goes through dequantizator,
IDCT and back to FB to be fetched with the next frame.
4. In parallel to (3) 64 coefficients are RLL encoded and saved to the FB.
At least at first - no EOB runs covering several blocks as the blocks will
be processed in a single-pass macroblock order, not plane order.
5. Separate process will fetch tokens (or just their fixed-length
RLL-encoded equivalents) from FB in the index order, the bitstream will be
built and transfered to the system (separate from the FB) memory using DMA
channel. CPU will run software to add all the required headers,
encapsulate the stream and send it out." Andrey (c) <a href="http://elphel.com">elphel.com</a></pre><pre><br></pre><pre>But I think he made a mistake in paragraph 4. Maybe he meant RLE, not RLL?</pre></span></div>
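<div>If paragraph 4 does mean run-length encoding (RLE), the idea can be illustrated with a minimal Python sketch. It is only an assumption about what such a stage might look like, not Andrey's FPGA design and not the Theora token set: the names <code>rle_encode_block</code> and <code>rle_decode_block</code> and the (zero_run, value) pair format are hypothetical, chosen to mimic "fixed-length RLL-encoded equivalents" of tokens.</div>

```python
from typing import List, Tuple

BLOCK_SIZE = 64  # coefficients in one 8x8 block


def rle_encode_block(coeffs: List[int]) -> List[Tuple[int, int]]:
    """Encode one block of quantized coefficients as (zero_run, value) pairs.

    Hypothetical format: each pair means "zero_run zeros, then value".
    Trailing zeros become a single (zero_run, 0) pair that ends the block,
    so a run never continues into the next block.
    """
    if len(coeffs) != BLOCK_SIZE:
        raise ValueError("expected exactly 64 coefficients")
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    if run:
        pairs.append((run, 0))  # per-block terminator for the trailing zeros
    return pairs


def rle_decode_block(pairs: List[Tuple[int, int]]) -> List[int]:
    """Invert rle_encode_block and check that the block is complete."""
    coeffs = []
    for run, value in pairs:
        coeffs.extend([0] * run)
        if value != 0:
            coeffs.append(value)
    if len(coeffs) != BLOCK_SIZE:
        raise ValueError("pairs do not describe a full 64-coefficient block")
    return coeffs
```

<div>Because every block is closed by its own terminator pair, this matches the constraint in paragraph 4 that there are no EOB runs covering several blocks: each block can be encoded as soon as its 64 coefficients leave the quantizer, in macroblock order.</div>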