[theora-dev] Possible inefficiency in encode.c

Wed Oct 7 20:37:18 PDT 2009

Chris Cooksey wrote:
> Because the tokens are now stored in separate per plane arrays instead of
> all strung together in one big array like they used to be. I presume the
> point of doing that was to eliminate the need for dct_token_offs altogether.

The actual point was so that the token lists could be filled in a
different order than the one in which they will appear in the bitstream.
However, one of the consequences of this is that EOB runs cannot span
lists, even though the bitstream allows it.

This limitation is fixed up after tokenization, before packing the tokens
into the packet, in oc_enc_tokenize_finish(). What this means is that sometimes
the first token in the list must be skipped, because it was an EOB run
that has actually been merged with the last token in a different list.
dct_token_offs[][] marks which lists need to skip such a token (i.e.,
it's always either 0 or 1).
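
As a rough illustration of that fix-up, here is a minimal sketch. It is
not the libtheora code: the token and token_list types, merge_eob_runs(),
and the MAX_EOB_RUN cap are all hypothetical names and values invented
for this example, and the lists are assumed to already be in bitstream
order. It only shows the idea of folding a leading EOB run into the
trailing EOB run of an earlier list and recording a 0 or 1 skip offset.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative cap on a merged run; an assumption of this sketch. */
#define MAX_EOB_RUN 4095

typedef struct {
  int is_eob; /* nonzero if this token is an EOB run */
  int run;    /* run length; only meaningful for EOB tokens */
} token;

typedef struct {
  token *toks;
  size_t ntoks;
} token_list;

/* Merge EOB runs across list boundaries. On return, offs[i] is 1 if the
   first token of list i was absorbed into an earlier list (so the writer
   must skip it), and 0 otherwise. */
static void merge_eob_runs(token_list *lists, size_t nlists,
                           unsigned char *offs) {
  token *last = NULL; /* last token that will actually be written */
  size_t i;
  assert(lists != NULL && offs != NULL);
  for (i = 0; i < nlists; i++) {
    token_list *l = &lists[i];
    offs[i] = 0;
    if (l->ntoks == 0) continue;
    if (last != NULL && last->is_eob && l->toks[0].is_eob &&
        last->run + l->toks[0].run <= MAX_EOB_RUN) {
      last->run += l->toks[0].run; /* extend the earlier run */
      offs[i] = 1;                 /* and mark this token as skipped */
    }
    /* If the whole list was absorbed, the earlier token stays "last". */
    if (l->ntoks > (size_t)offs[i]) last = &l->toks[l->ntoks - 1];
  }
}
```

A packer built on this sketch would then walk list i starting at index
offs[i] rather than at 0.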

It would probably be faster to keep things in a single
contiguous array, with offsets to the individual lists, just because it
would remove an extra indirection that C compilers generally do a poor
job of optimizing. We did this in the decoder, and it did provide a
small speed-up. I just never got around to doing it in the encoder.
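
To make the two layouts concrete, here is a small sketch. The struct and
function names (split_tokens, flat_tokens, sum_split, sum_flat, flat_get)
and NLISTS are hypothetical and are not taken from the encoder or the
decoder. The split layout has to load a per-list pointer before it can
index a token; the flat layout indexes one base pointer using a start
offset. Whether that is measurably faster will depend on the compiler and
the target, so treat this as a picture of the data structures, not a
benchmark.

```c
#include <assert.h>
#include <stddef.h>

#define NLISTS 4 /* arbitrary list count for this sketch */

/* Layout A: one separately stored array per list. */
typedef struct {
  unsigned char *toks[NLISTS];
  size_t ntoks[NLISTS];
} split_tokens;

/* Layout B: one contiguous buffer. List i occupies
   toks[offs[i]] .. toks[offs[i + 1] - 1]; offs[NLISTS] is the total. */
typedef struct {
  unsigned char *toks;
  size_t offs[NLISTS + 1];
} flat_tokens;

/* Fetch token j of list i from the flat layout. */
static unsigned char flat_get(const flat_tokens *f, size_t i, size_t j) {
  assert(i < NLISTS && f->offs[i] + j < f->offs[i + 1]);
  return f->toks[f->offs[i] + j];
}

static unsigned long sum_split(const split_tokens *s) {
  unsigned long sum = 0;
  size_t i, j;
  for (i = 0; i < NLISTS; i++) {
    const unsigned char *p = s->toks[i]; /* extra pointer load per list */
    for (j = 0; j < s->ntoks[i]; j++) sum += p[j];
  }
  return sum;
}

static unsigned long sum_flat(const flat_tokens *f) {
  unsigned long sum = 0;
  size_t j;
  /* All lists are adjacent, so a single pass covers every token. */
  for (j = f->offs[0]; j < f->offs[NLISTS]; j++) sum += f->toks[j];
  return sum;
}
```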