[Theora-dev] FPGA implementation

Mon Jan 17 12:35:47 PST 2005

> There's a simple tool filec in
> http://svn.xiph.org/experimental/derf/theora-exp/tools/ that can be used
> to compare the output of two dump_video runs and pinpoint the
> location(s) where they differ. The only more complex debugging tools
> I've been using are "sprinkle the code with printf()s" and diff, but I
> guess that's a little bit harder on an FPGA.

I need a little guidance to test the FPGA code that generates Theora data.
So far I have not worked on the software part and have only reached the
state where data from the hardware matches that from simulation. To run
the FPGA I first fill in some tables, including quantizer, dequantizer,
zero bin and Huffman. (For now I can use 30 of the 80 tables concurrently
because of the internal block memory size. Just two types of frames are
supported - intra and inter (nomv) - and there is no loop filter yet.
There is also a way to provide a bit-map of which 32x32 tiles will use
one of two alternative quantization sets and which will be coded. So
without changing tables the FPGA can generate 3 types of frames - intra,
inter with all blocks coded and inter with only selected blocks coded.)
The output is the actual frame data only, without the headers/tables
(including which blocks are coded), which can be built by software in
advance. To simplify merging the frame tables with the FPGA output
(delivered by a DMA channel to system memory) the FPGA receives the bit
shift (relative to the nearest 32-bit word) at which to start the
bitstream, so no software shifting of video data is required.
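
The merge of software-built headers with pre-shifted FPGA output could
look roughly like the Python sketch below. It is only an illustration
under assumptions: bits are packed MSB-first (as in the Theora bitstream),
the shift is the header length modulo 32, and the names start_offset and
merge_header_and_data are hypothetical, not part of any existing driver.

```python
WORD_BITS = 32

def start_offset(header_bits):
    """Split a header length in bits into (whole words, leftover bits)."""
    return divmod(header_bits, WORD_BITS)

def merge_header_and_data(header_words, header_bits, data_words):
    """Join a header with data that was pre-shifted by header_bits % 32.

    header_words: 32-bit ints, MSB-first, last word zero-padded (assumed).
    data_words:   32-bit ints whose first word has its top
                  (header_bits % 32) bits left as zero (assumed).
    """
    full_words, shift = divmod(header_bits, WORD_BITS)
    if shift == 0:
        # Header ends on a word boundary: plain concatenation.
        return header_words[:full_words] + data_words
    # The partial header word and the first data word occupy disjoint
    # bits, so one OR merges them; every other word is copied untouched.
    boundary = header_words[full_words] | data_words[0]
    return header_words[:full_words] + [boundary] + data_words[1:]
```

With a 40-bit header, for example, only the second word is touched; the
rest of the data is used exactly as the DMA delivered it.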

Currently the frame size is a multiple of 128x64 pixels (it can be made
smaller in the future, but for 2048x1536 sensors it is already not too
coarse).
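
To make the granularity concrete, here is a small Python sketch that
rounds a requested size up to the 128x64 step and counts the 32x32 tiles
the bit-map would have to cover. Rounding up (rather than cropping) is an
assumption, and frame_geometry is a hypothetical helper name.

```python
FRAME_W_STEP, FRAME_H_STEP = 128, 64   # frame size granularity
TILE = 32                              # tile size used by the bit-map

def round_up(value, step):
    """Round value up to the next multiple of step."""
    return -(-value // step) * step

def frame_geometry(width, height):
    """Return (padded width, padded height, number of 32x32 tiles)."""
    w = round_up(width, FRAME_W_STEP)
    h = round_up(height, FRAME_H_STEP)
    return w, h, (w // TILE) * (h // TILE)
```

A 2048x1536 sensor needs no padding and gives 64x48 = 3072 tiles, while a
640x480 frame would be padded to 640x512.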

The input data can be loaded into the frame buffer, but only before the
Bayer->YCbCr 4:2:0 conversion, as that is done between the frame buffer
and the compressor. It is also possible to monitor the reference frame in
the frame buffer (the only part that is easy to test, as after
DCT->quantizer->dequantizer->IDCT it should be close to the input frame,
and it is :-) ).
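
"Close to the input frame" can be put into a number with PSNR. The
sketch below is an assumption-laden illustration: it expects both frames
as equally sized sequences of 8-bit samples of the same plane (for
example luma after the Bayer->YCbCr conversion), which is not how the
frame buffer necessarily exposes them.

```python
import math

def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length planes."""
    if len(original) != len(reconstructed):
        raise ValueError("planes must have the same number of samples")
    if len(original) == 0:
        raise ValueError("planes must not be empty")
    # Sum of squared differences over all samples.
    sse = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    if sse == 0:
        return math.inf          # identical planes
    mse = sse / len(original)
    return 10 * math.log10(peak * peak / mse)
```

A sudden drop of this value for one frame or one region would point at a
problem in the quantizer/dequantizer path rather than in the entropy
coding.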

It is also possible to get at the intermediate frame data - fixed-width
(12-bit) "pretokens" that are generated in macroblock-scan order (after
DCT->quantizer->DC prediction and preliminary coding) and later retrieved
by a separate process (lagging one full frame) and processed (Huffman
encoded) in coded order.

I also "sprinkled the code with printf" (and implemented some "#ifdef
DEBUG" functionality) that I may need if the FPGA misbehaves (generates
data different from the simulator) - there could still be some hardware
issues, as there are many asynchronous events and processes, so it is
impossible to simulate them exactly. The current minimal frame (128x64)
simulates at some 15 minutes/frame, so it is possible to get to the
algorithmic problems.
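
For checking hardware output against the simulator, a first-mismatch
search is often enough to localize a problem. The Python sketch below is
a hypothetical helper written for illustration (it is not the filec tool
mentioned above) and assumes both outputs have been dumped as raw bytes.

```python
def first_difference(a, b):
    """Return the byte offset of the first mismatch, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One dump is a strict prefix of the other.
        return min(len(a), len(b))
    return None
```

Dividing the returned offset by the size of one dumped record gives the
index of the first item that differs.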

So after 4 months of really hard work I'm very impatient to see if this
data makes any sense to a Theora decoder. What are the minimal steps to
build something viewable? Maybe starting from just a simple 128x64 intra
frame? Is there any way to view a standalone encoded frame?

I'm trying to take some shortcuts - for now I can build all the headers
manually if needed - just to see that the hardware works so I can release
cameras into production.