On 2022-03-24, Andrew Sonzogni wrote:

> Then it’s not possible to mix those packets before decoding them ?

*Theoretically* you can mix without decoding to *some* degree. But the 
algorithm to do so would be truly nasty, so just don't go there. You 
need to decode, mix, and re-encode.
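
To make the "mix" step concrete, here is a minimal sketch in Python. It 
assumes each packet has already been decoded to 16-bit signed PCM by 
whatever codec library you use, and that all frames have the same length, 
sample rate, and channel layout. The name mix_pcm is made up for 
illustration; it is not part of any codec API. Plain summing with clipping 
is only one possible choice, and a real mixer might scale or limit 
instead.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm(frames):
    """Mix equal-length frames of decoded 16-bit PCM by summing and clipping.

    `frames` is a list of array('h') objects, one per incoming stream.
    This is a hypothetical helper: decoding before it and re-encoding
    after it are assumed to be handled by the codec library.
    """
    if not frames:
        return array("h")
    length = len(frames[0])
    if any(len(f) != length for f in frames):
        raise ValueError("all frames must have the same number of samples")
    # Sum the streams sample by sample, then clip to the int16 range so
    # loud passages saturate instead of wrapping around.
    return array(
        "h",
        (max(INT16_MIN, min(INT16_MAX, sum(column))) for column in zip(*frames)),
    )
```

The mixed frame then goes back through the encoder as a single stream.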

BTW, in many cases you'd want to keep the incoming streams separate 
and just pass them on. E.g. in conferencing applications, you'd 
typically not want to mix everything together, because 1) the mixer 
would be a single point of failure, 2) business-oriented conferencing 
is often an application where what people say is legally encumbered, or 
might prove so later on, and that would then extend to the mixer as 
well, and 3) you typically don't want more than one person to take the 
stage anyhow, because absent body language and other close interaction, 
it'd just be an auditory mess.
