[theora-dev] Multi-Thread Theora Encoder

Unga unga888 at yahoo.com
Thu Oct 11 01:29:30 PDT 2007

--- Felipe Portavales Goldstein <portavales at gmail.com>

> On 10/7/07, Maik Merten <maikmerten at gmx.net> wrote:
> > Nice work. Too bad I'm still on a single-core system (but now I have a
> > nice excuse to mothball this system and go ahead assembling a new one).
> >
> > Two things I noticed:
> >
> >  - output bitrate seems to vary slightly depending on how many threads
> > are used (no visible difference, though). If your goal for the
> > optimization is to have it produce exactly the same output and you're
> > thinking right now "wait, this shouldn't happen" then there may perhaps
> > be a problem in the new code.
> When you use only one thread, the output generated is exactly the same
> as that of the original (non-multi-threaded) Theora encoder.
> When you use 2 threads, for example, the motion vector search is
> executed on each half of the screen independently (in parallel), one
> half per thread.
> Without threads, the first fragment of the second half of the screen
> could use the last motion vector from the previous fragment in the
> first half of the screen. This LAST_MV mode is used to save the bits
> of a new motion vector added to the stream.
> But when using threads, if we want to run the motion vector search in
> parallel, we must avoid data dependencies between threads, and
> therefore we lose a little compression.
> Since the number of threads is small compared to the number of rows in
> the entire screen (height of the movie), this loss is quite small.
> To summarize, what I am trying to explain is this:
> Each additional thread may force one new motion vector per frame into
> the stream, compared to the non-threaded version.
> If we have 2 threads, we can have up to 1 MV per frame more than the
> non-threaded version.
> If we have 4 threads, we can have up to 3 MVs per frame more than the
> non-threaded version.

"Since the number of threads is small compared to the
number of rows in the entire screen (height of the
movie), this loss is quite small."

May not be the case next year :)

Tilera (www.tilera.com) already ships a 64-core
processor and plans to release a 128-core processor
soon. What if somebody used a quad 128-core Tilera
machine, for 512 cores in total? Is the loss in
compression then still small?
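
To put rough numbers on that question, here is a minimal sketch of the worst
case described in the quoted explanation: at most one extra motion vector per
boundary between threads. It assumes (this is my assumption, not something
stated above) that each thread gets a horizontal band of at least one whole
16x16 macroblock row, so the frame height caps the number of useful threads.
The function names are made up for illustration.

```python
import math

MACROBLOCK_SIZE = 16  # Theora macroblocks cover 16x16 luma pixels


def macroblock_rows(frame_height: int) -> int:
    """Number of macroblock rows in a frame of the given pixel height."""
    return math.ceil(frame_height / MACROBLOCK_SIZE)


def usable_threads(requested_threads: int, frame_height: int) -> int:
    """Assumption: every thread needs at least one whole macroblock row."""
    return max(1, min(requested_threads, macroblock_rows(frame_height)))


def max_extra_mvs_per_frame(requested_threads: int, frame_height: int) -> int:
    """Worst case from the quoted explanation: one extra MV per band boundary."""
    return usable_threads(requested_threads, frame_height) - 1


# Upper bound on extra MVs per frame for a 480-pixel-high video.
for threads in (1, 2, 4, 512):
    print(threads, max_extra_mvs_per_frame(threads, 480))
```

Under that assumption, a 480-pixel-high frame has only 30 macroblock rows, so
512 cores could not all be kept busy by a row-based split, and the bound on
extra motion vectors stops growing at 29 per frame.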

Just out of curiosity, is it possible to compress
frames in parallel? That is, you allocate one thread
per frame and process as many frames in parallel as
the number of cores you have. It would consume more
RAM for sure, but RAM nowadays is not considered
expensive.
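
A rough sketch of what that could look like, with one caveat: Theora inter
frames are predicted from earlier frames, so a strict one-thread-per-frame
split would not give independent work items. The sketch below therefore
assumes the video is cut into chunks that each start with a keyframe, which is
my assumption and would itself cost some compression. `encode_chunk` is a
hypothetical placeholder, not a libtheora call.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(frame_count: int, chunk_length: int) -> list[range]:
    """Cut frame indices into runs; each run would start with a keyframe."""
    return [
        range(start, min(start + chunk_length, frame_count))
        for start in range(0, frame_count, chunk_length)
    ]


def encode_chunk(frames: range) -> str:
    """Hypothetical stand-in for a real encoder call; returns a label only."""
    return f"chunk[{frames.start}:{frames.stop}]"


def encode_in_parallel(frame_count: int, chunk_length: int, workers: int) -> list[str]:
    """Encode independent chunks concurrently and return them in stream order."""
    chunks = split_into_chunks(frame_count, chunk_length)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so chunks can be joined as-is.
        return list(pool.map(encode_chunk, chunks))


print(encode_in_parallel(frame_count=10, chunk_length=4, workers=3))
```

Because each chunk depends only on its own frames, the same split would in
principle let the chunks be handed to separate processes or machines.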

If this is really possible, we no longer have to
restrict ourselves to one machine; we can distribute
frames to other machines on the network for


