[theora-dev] [OT] Just saying hi!

Mike Melanson melanson at pcisys.net
Wed Feb 26 07:15:49 PST 2003



On Wed, 26 Feb 2003, Christoph Lampert wrote:

> I guess there is no documentation of the current status of Theora 
> and its basic (technical) concepts, is there? Because reading sources is
> okay, but sometimes a few explanations make it much easier. 

        I'm working on a description for the VP3 decode system as I
develop a fresh decoder implementation for the ffmpeg project
(ffmpeg.sf.net). Here is what I can tell you about the decode process
(from which you can make educated guesses about the encode process):

* decoding a VP3 frame:
  * decode frame header (keyframe, quantization level, version #)
  * unpack superblock/macroblock/fragment encoding data
  * unpack encoding mode information for encoded blocks
  * unpack motion vectors
  * unpack DC coefficients for all coded fragments
  * unpack 1st AC coefficients for all coded fragments
  * unpack 2nd AC coefficients for all coded fragments
  * ...
  * unpack 63rd AC coefficients for all coded fragments
  * reconstruct frame, which entails:
    * prediction for the DC coefficient
    * calculating relevant motion vector, if applicable
    * IDCT to obtain fragment or fragment diff
    * apply diff to motion block, if applicable

That's basically how decoding works. Note that I still haven't gotten
heavy into the interframe coding process, so I might have some details
confused on the motion prediction stuff.
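
As a rough illustration of the last reconstruction steps listed above
(locate the motion block, then apply the decoded diff), here is a minimal
Python sketch. The function name, the list-of-rows frame layout, and the
whole-pixel motion vector are assumptions made for illustration only; the
sketch ignores fractional-pixel motion and frame-edge handling and is not
taken from the VP3 source.

```python
def reconstruct_inter_fragment(reference, diff, x, y, mv_x, mv_y):
    """Return the 8x8 fragment at (x, y): motion block plus decoded diff.

    reference: 2-D list of pixel rows from a previously decoded frame.
    diff:      8x8 list of signed residuals (the IDCT output).
    mv_x/mv_y: hypothetical whole-pixel motion vector.
    """
    out = []
    for row in range(8):
        out_row = []
        for col in range(8):
            # Pixel of the motion block in the reference frame.
            predicted = reference[y + mv_y + row][x + mv_x + col]
            # Clamp so the sum stays a valid 8-bit sample.
            out_row.append(max(0, min(255, predicted + diff[row][col])))
        out.append(out_row)
    return out
```

A fragment coded without motion would presumably skip the motion block
and use the IDCT output directly, which is the "fragment or fragment diff"
distinction in the list.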

> I'm particularly interested in how you manage to keep everything free of
> patents, in particular since I know that there are patents (claimed) on
> such trivial stuff like motion vector prediction and decision when to
> encode and when to skip a block. 

        I often wonder the same. So here are some observations on things
that probably set this algorithm apart:

* You probably noticed that the algorithm encodes all the DC coeffs, then
all the 1st AC coeffs, etc. MPEG certainly doesn't do that and I don't
know of any other coding methods that do.
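
To make the ordering concrete, here is a small Python sketch that only
enumerates the iteration order; it does not model the actual token or
run-length coding. Both function names are made up for this example, and
the second one shows the more conventional block-by-block order purely for
comparison.

```python
def coefficient_major_order(num_fragments, num_coeffs=64):
    """Yield (coeff_index, fragment_index) pairs: all DC, then all 1st AC, ..."""
    for coeff in range(num_coeffs):        # 0 = DC, 1..63 = AC
        for frag in range(num_fragments):
            yield coeff, frag


def fragment_major_order(num_fragments, num_coeffs=64):
    """Yield the same pairs one block at a time, for comparison."""
    for frag in range(num_fragments):
        for coeff in range(num_coeffs):
            yield coeff, frag
```

With 3 fragments and 2 coefficients each, `coefficient_major_order(3, 2)`
visits (0,0), (0,1), (0,2), (1,0), (1,1), (1,2): every fragment's DC comes
before any fragment's first AC coefficient.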

* Block coding: In addition to 8x8 blocks (fragments) and 16x16
macroblocks, VP3 also uses superblocks which are 32x32, encapsulating 4
macroblocks. The order in which the fragments are unpacked is also unusual.
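
The nesting of the three block sizes can be sketched in a few lines of
Python. This only covers the geometry described above; the helper names
are hypothetical, rounding partial blocks up at the frame edges is an
assumption, and the fragment unpacking order is deliberately not modelled.

```python
FRAGMENT = 8      # 8x8 pixels
MACROBLOCK = 16   # 2x2 fragments
SUPERBLOCK = 32   # 2x2 macroblocks, i.e. 4x4 fragments


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


def block_counts(width, height):
    """Count the blocks of each size needed to cover a width x height plane."""
    return {
        name: ceil_div(width, size) * ceil_div(height, size)
        for name, size in (
            ("fragments", FRAGMENT),
            ("macroblocks", MACROBLOCK),
            ("superblocks", SUPERBLOCK),
        )
    }


def containing_blocks(frag_col, frag_row):
    """Return ((mb_col, mb_row), (sb_col, sb_row)) for a fragment position."""
    return (frag_col // 2, frag_row // 2), (frag_col // 4, frag_row // 4)
```

For a 64x48 plane this gives 48 fragments, 12 macroblocks and 4
superblocks, the bottom row of superblocks being only half covered.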

* DC prediction: On2 used to be Duck and if you are familiar with their
algorithms (Duck Truemotion variants and the DK ADPCM codecs), you know
that they are huge DPCM fanatics. To that end, there seems to be a lot
that goes on for DC prediction. I know that MPEG uses some DC prediction,
but I don't think it's anywhere near this level. A fragment can predict DC
using DC elements from the left, up-left, up, and up-right fragments, if
present and coded in the current frame. Additionally, there is some
effectively disabled code that searches through previous fragments above
and to the left of the current fragment, looking for *some* block coded in
this frame from which DC can be predicted. An aggressive strategy.

* Golden frame: VP3 calls its keyframes "golden frames". The codec needs
to maintain the last golden frame in addition to the previous decoded
frame since blocks can be motion-predicted from either.
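
The bookkeeping this implies is small. Below is a minimal Python sketch
with made-up names; the `use_golden` flag is a hypothetical stand-in for
whatever per-block coding mode selects the reference frame.

```python
class ReferenceFrames:
    """Hold the two frames a decoder must keep around for prediction."""

    def __init__(self):
        self.golden = None     # last keyframe ("golden frame")
        self.previous = None   # most recently decoded frame

    def update(self, decoded_frame, is_keyframe):
        """Call once per decoded frame, after reconstruction."""
        if is_keyframe:
            self.golden = decoded_frame
        self.previous = decoded_frame

    def select(self, use_golden):
        """Return the frame a block should be predicted from."""
        return self.golden if use_golden else self.previous
```

After decoding a keyframe K0 followed by interframes P1 and P2,
`select(True)` still returns K0 while `select(False)` returns P2.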

        If I have gotten anything wrong about anything stated above,
please yell very loudly right now. But I hope this helps.

        Thanks...

--
	-Mike Melanson
