[Speex-dev] Mixing Codebook?

Ashhar Farhan farhan at phonestack.com
Tue Mar 25 22:33:04 PDT 2008

I have written conferencing servers with speex. You can view or download
the code here:

you can just scroll down to the ltpSoundInput and see how the audio
streams are all being mixed and also separated out for each
participant (each participant should get everybody's voice added in
except his own). I would warn you of a number of issues here:

1. speex, or for that matter any other codebook-based codec, will not be
able to do a good job with multiple voices. As a result, when more
than one person is speaking at once, you can make out the clutter of
voices but the intelligibility goes down. I have verified this: I had two
simultaneous speakers read standard texts and tried to write them
down while listening to the mixed _and_ speex-encoded sound vs. the
mixed-only sound. speex degrades on multi-voice audio input.

2. It might be better if you stored each voice as a separate speex
channel within the envelope, decoded them separately and then mixed the
PCM audio. This means twice the bandwidth and twice the CPU power, but
without any degradation. You can even spatially place each speaker in a
stereo output by building delays and attenuation between the left and
right channels.
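
To make point 2 concrete, here is a minimal sketch in Python (not the
server code mentioned above) of mixing already-decoded PCM so that each
participant hears everyone but himself, plus a crude stereo placement
using a delay and an attenuation. The names mix_minus and place_stereo,
the dict-of-lists frame layout, and the delay and gain values are all
illustrative assumptions. The speex decode step is left out, and a real
mixer would have to carry the delay line across frames.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip16(x):
    """Saturate an integer to the signed 16-bit PCM range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def mix_minus(frames):
    """Mix decoded PCM frames so nobody hears his own voice.

    frames: dict mapping participant id -> list of int16 samples,
            all lists the same length (one decoded frame each).
    Returns a dict mapping participant id -> mix of everyone else.
    """
    if not frames:
        return {}
    n = len(next(iter(frames.values())))
    # Sum all voices once, without clipping, in plain Python ints.
    total = [0] * n
    for pcm in frames.values():
        for i, s in enumerate(pcm):
            total[i] += s
    # Subtract each participant's own samples, then saturate.
    return {
        pid: [clip16(total[i] - pcm[i]) for i in range(n)]
        for pid, pcm in frames.items()
    }


def place_stereo(pcm, delay, far_gain, right_is_near=True):
    """Place one voice in a stereo field within a single frame.

    The near channel gets the signal unchanged; the far channel gets it
    delayed by `delay` samples and scaled by `far_gain` (0.0 to 1.0).
    Returns (left, right).
    """
    near = list(pcm)
    delayed = ([0] * delay + list(pcm))[:len(pcm)]
    far = [clip16(int(s * far_gain)) for s in delayed]
    return (far, near) if right_is_near else (near, far)
```

With three participants a, b and c, mix_minus returns a's mix as b + c,
b's mix as a + c, and so on. Summing once and subtracting keeps the cost
linear in the number of participants instead of building every mix from
scratch.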

- farhan

On Wed, Mar 26, 2008 at 7:59 AM, Ricardo Andere de Mello
<quilombodigital at gmail.com> wrote:
> heh...:)
> maybe I'll just answer "the number is 42". ;)
> well... some guys did mixing using G.722 encoded frames... maybe we could
> ask them: 8P
> http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/4054516/4054517/04054870.pdf?isnumber=4054517&prod=CNF&arnumber=4054870&arSt=2045&ared=2048&arAuthor=G.+Agnello%3B+R.+M.+Dansereau
> and a little off-topic, but fun for audio gurus like you, a very complicated
> way to mix two mp3 streams:
> http://www.freshpatents.com/Processing-of-encoded-signals-dt20060302ptan20060047523.php?type=description
> []s, gandhi
> On Tue, Mar 25, 2008 at 10:01 PM, Jean-Marc Valin
> <jean-marc.valin at usherbrooke.ca> wrote:
> >
> > > ok, let's get to the point. :)
> > > I would need to mix some audio streams in real time. After googling a
> > > little I found only two answers:
> > > 1 - decode,mix,encode
> > > 2 - lower quality, send streams at the same time and mix at the client.
> > >
> > > I didn't like either solution.
> > >
> > > I'm not an audio expert, so I may just be saying something silly. I read
> > > that speex uses a fixed "codebook". So I think that in a simplistic way
> > > this means that the encoded data are indexes in this codebook and some
> > > transformation information.
> > > Is it possible to create a "mixing codebook"? the idea is to generate
> > > all the mix combinations of each entry in the codebook, so two encoded
> > > frames could be merged and point to this different codebook, maybe
> > > selected in the rtp header.
> >
> > Well, you *might* be able to do that at the frame level. Considering
> > that 15 kbps corresponds to 300 bits per 20 ms frame, in the case of
> > mixing two streams you'd need a 600-bit codebook (2^600 entries). There's just a
> > tiny implementation detail when it comes to storage. What you need to do
> > to solve it is:
> > 1) Find a way to encode a (very very) large number of terabytes in
> > every electron (or other fundamental particle)
> > 2) Manage to use every particle in the universe to store your codebook
> > 3) Prevent the whole thing from turning into a black hole.
> >
> > You do what you want, but I suggest you go with one of the two solutions
> > you listed above (decode,mix,encode or multiple streams).
> >
> >        Jean-Marc
> >
> _______________________________________________
>  Speex-dev mailing list
>  Speex-dev at xiph.org
>  http://lists.xiph.org/mailman/listinfo/speex-dev
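
For reference, the codebook-size arithmetic in Jean-Marc's quoted reply
can be checked in a few lines of Python. This is only a sanity check,
assuming the usual 20 ms speex frame; the variable names are
illustrative.

```python
bitrate_bps = 15000   # 15 kbps, as in the quoted reply
frame_ms = 20         # assumed speex frame duration

# Bits in one encoded frame at this bitrate.
bits_per_frame = bitrate_bps * frame_ms // 1000

# A "mixing codebook" indexed by two full frames needs one entry
# for every possible pair of frames.
index_bits = 2 * bits_per_frame
entries = 2 ** index_bits

print(bits_per_frame)        # bits per frame
print(index_bits)            # bits needed to index the mixed codebook
print(len(str(entries)))     # number of decimal digits in 2**index_bits
```

The entry count has far more digits than common estimates of the number
of particles in the observable universe (around 80 digits), which is the
point of the joke.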
