[theora-dev] Theora Fast Quality Reduction Transcoding

Timothy B. Terriberry tterribe at email.unc.edu
Wed Nov 18 19:02:36 PST 2009

Chris Cooksey wrote:
> The rehuff tool works by simply duplicating the entire frame header and then
> emitting a recoded huff stream. This won't work if I need to change the
> coded block list, which ideally I would like to if possible. It is here that

The biggest caveats are:
1) When you mark a block uncoded, its motion vector is assumed to be
(0,0). It may still be worth doing, depending on how many bits you're
trying to throw away, but realize the corresponding distortion has to be
measured against the block from the previous frame, and not whatever
predictor was being used before. You can still leave a block coded with
all of its coefficients set to zero, though in practice the final coded
DC coefficient may be non-zero due to a residual from the DC prediction.
This is reasonably cheap, though not as cheap as completely skipping the
block.
2) When all four luma blocks in a macroblock are uncoded, the macroblock
is _forced_ into INTER_NOMV mode. No actual MB mode or MV is coded in
the bitstream for that macroblock, and any coded chroma blocks will use
the previous reference frame with a (0,0) MV.
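Caveat 1 in code terms: the distortion of uncoding a block is just the error against the co-located block in the previous reference frame, since the MV becomes an implicit (0,0). A minimal sketch, using plain SAD over an 8x8 block (the helper name and layout here are invented for illustration; real libtheora has its own fragment buffers):

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch: the cost of marking a block uncoded is measured against the
 * co-located 8x8 block in the previous reference frame (implied (0,0) MV),
 * NOT against whatever predictor the block was using before. */
static int uncoded_block_sad(const unsigned char *cur,
                             const unsigned char *prev,
                             int stride){
  int x, y, sad = 0;
  for (y = 0; y < 8; y++)
    for (x = 0; x < 8; x++)
      sad += abs(cur[y*stride + x] - prev[y*stride + x]);
  return sad;
}
```

Comparing this SAD to the bits saved is what drives the decision of whether uncoding the block is worth it.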

The best place to start is probably oc_enc_frame_pack() in encode.c,
which writes a complete encoded packet. Most of the required information
comes out of oc_enc_ctx.state, i.e., it should already be available from
the decoder, from oc_dec_ctx.state. Except for the tokenizer, which
makes a large number of assumptions about how data is arranged, these
routines are fairly small and self-contained.

There are a few exceptions. The MB mode alphabet is not stored in the
decoder, because it is only needed once, when the modes are decoded. It
may be worth re-computing the best alphabet, as the number of MBs with
each mode may change (due to caveat 2 above). The decoder also never
constructs a list of coded MBs; it simply iterates over all of them and
only processes the ones that have a coded luma block (it might be faster
to actually construct the list; I've never tested it). I'd suggest changing
the encoder routines to iterate like the decoder does, instead of using
the list, so you don't have to update it when you uncode an entire MB.
The MV coding scheme is also not stored. The optimal one may change for
the same reason the optimal MB mode alphabet would; this is unlikely,
but it's simple enough to re-calculate.
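The decoder-style iteration suggested above can be sketched as follows. The struct layout is a simplified stand-in invented for illustration (the real encoder state lives in oc_enc_ctx.state); the point is that no coded-MB list needs to be maintained or updated:

```c
#include <assert.h>

/* Simplified mock of the encoder state; layouts are invented. */
#define NMBS 4

typedef struct { int coded; } frag_t;

typedef struct {
  frag_t frags[NMBS * 4]; /* 4 luma fragments per macroblock */
} mock_state_t;

/* Iterate over every MB the way the decoder does, processing only those
 * with at least one coded luma fragment; uncoding an entire MB then
 * requires no list maintenance. */
static int count_coded_mbs(const mock_state_t *st){
  int mbi, n = 0;
  for (mbi = 0; mbi < NMBS; mbi++) {
    int bi, any_coded = 0;
    for (bi = 0; bi < 4; bi++) any_coded |= st->frags[mbi * 4 + bi].coded;
    if (any_coded) n++; /* a real encoder would pack this MB's mode/MV here */
  }
  return n;
}
```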

So, to summarize, to change a block from coded to uncoded (assuming
you've already taken care of not emitting any tokens for that fragment),
you need to:
a) set state.frags[fragi].coded to 0
b) remove it from the state.coded_fragis list, and update both
state.ncoded_fragis for the appropriate color plane and
state.ntotal_coded_fragis. This list can be re-built in-place as you
iterate through the blocks.
c) update state.sb_flags[sbi].coded_fully and coded_partially for the
corresponding superblock. It's probably easiest to just go back and
re-compute these flags from scratch for all superblocks using the
frags[] array, but if you do it while retokenizing, you can estimate the
change in the number of bits needed for these flags, which may affect
your decision.
d) If this leaves all four luma blocks in a MB uncoded, then you need to
update the chroma fragments. This requires setting frags[fragj].mb_mode
to OC_MODE_INTER_NOMV and frags[fragj].mv[] to {0,0} for each coded
chroma block, which may also require re-quantizing the corresponding DCT
coefficients relative to the new predictor this produces.
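Steps (a) and (b) can be sketched as below. The field names follow the ones listed above, but the struct layouts are simplified stand-ins (in particular, the real state keeps per-plane ncoded_fragis counts, which are collapsed into a single count here):

```c
#include <assert.h>

/* Simplified mock of the shared coding state; layouts are invented. */
#define NFRAGS 8

typedef struct { int coded; } frag_t;

typedef struct {
  frag_t frags[NFRAGS];
  int coded_fragis[NFRAGS]; /* indices of coded fragments */
  int ncoded_fragis;        /* per-plane counts omitted for brevity */
} state_t;

/* (a) Mark the fragment uncoded, then (b) rebuild the coded list
 * in place, as suggested above. */
static void uncode_frag(state_t *st, int fragi){
  int i, n = 0;
  st->frags[fragi].coded = 0;
  for (i = 0; i < st->ncoded_fragis; i++) {
    int fi = st->coded_fragis[i];
    if (st->frags[fi].coded) st->coded_fragis[n++] = fi;
  }
  st->ncoded_fragis = n;
}
```

In a real retokenizing pass you would fold this rebuild into the same loop that re-emits tokens, and recompute the sb_flags from the frags[] array afterward, per step (c).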

If there are parts of the spec that are unclear, please let us know.
Feedback is welcome.
