[CELT-dev] Mixing of compressed streams

Mon Nov 28 16:45:10 PST 2011

On Mon, Nov 28, 2011 at 6:18 PM, Benjamin M. Schwartz
<bmschwar at fas.harvard.edu> wrote:
> Of course it's always possible: just decompress, mix, and recompress!
>
> The usual question is: how much CPU does this cost, and can we save some
> by not fully decoding the streams?
> The answer is:
> 1.  It doesn't cost a lot of CPU.  Opus is designed to be CPU-efficient
> for both encode and decode.
> 2.  The best way to save CPU is probably to optimize the encoder and
> decoder, which presently have very little in the way of performance
> optimizations.
> 2a.  Once the encoder and decoder are highly optimized, then you can
> probably save an additional 10-20% of CPU time by implementing a
> transform-domain mixer, provided that the two streams are in the same mode.
>
> The problem with 2a is to convince yourself that the CPU time is really
> more expensive than the engineers' salaries.

Great observation that the best way to speed up mixing is to speed up
the codec. This is very true, especially now, given how little effort
has gone into platform-specific performance optimization, though I
wouldn't have thought of it.
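
For reference, the "mix" step in the decompress/mix/recompress baseline
is just a saturating sample-wise sum. A minimal sketch, assuming both
streams have already been decoded to equal-length frames of 16-bit PCM
(the decode and re-encode calls are left out, and mix_pcm16 is a
hypothetical name, not part of any codec API):

```python
from typing import List, Sequence

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Sum two equal-length frames of 16-bit PCM, saturating on overflow."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    # Clamp each sum to the int16 range instead of letting it wrap around.
    return [max(INT16_MIN, min(INT16_MAX, x + y)) for x, y in zip(a, b)]
```

This step is cheap compared to the decode and encode on either side of
it, which is why speeding up the codec matters more than speeding up
the sum.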

One possibility you did not mention is the kind of 'mixing' where you
hard-switch based on activity, e.g. automatic half-duplex cut-through
mixing: whoever is talking is 100% of the output (presumably for
everyone but them). You could do this cheaply without actually mixing
(though with some possibility of glitching)... if we provided a
function to give you an activity level from a stream without doing a
full decode.
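
A rough sketch of what that switching logic might look like, assuming
the per-stream activity levels are already available as numbers in
[0, 1]. No such activity function exists today, so the levels are
plain inputs here; pick_talker, route, threshold, and margin are all
hypothetical, and the hysteresis margin is my own addition to cut down
on rapid flip-flopping:

```python
from typing import Dict, Optional


def pick_talker(activity: Dict[str, float], current: Optional[str],
                threshold: float = 0.1, margin: float = 0.05) -> Optional[str]:
    """Choose whose compressed stream to forward for this frame.

    activity: hypothetical per-participant activity levels in [0, 1]
    current:  the participant forwarded on the previous frame, if any
    """
    if not activity:
        return None
    loudest = max(activity, key=activity.get)
    if activity[loudest] < threshold:
        return None  # nobody is talking
    # Hysteresis (assumption): keep the current talker unless someone
    # else is louder by at least `margin`.
    if (current in activity and activity[current] >= threshold
            and activity[loudest] < activity[current] + margin):
        return current
    return loudest


def route(packets: Dict[str, bytes],
          talker: Optional[str]) -> Dict[str, Optional[bytes]]:
    """Forward the talker's packet, untouched, to everyone but the talker."""
    if talker is None or talker not in packets:
        return {p: None for p in packets}
    return {p: (None if p == talker else packets[talker]) for p in packets}
```

The packets are never decoded or re-encoded, only forwarded. Each
receiver's decoder does see a different stream after every switch,
which is presumably where the glitching would come from.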

I think this would be less work than many other things; it's mostly
copying and pasting to make a really abbreviated decoder, though it's
not suitable for mixing except in the most constrained environments.