[Speex-dev] transferring decoder state

Thu Nov 16 21:27:34 PST 2006

Hi Jean Marc and everyone else :-)

I am a bit of an alien here, my expertise is not at all in DSP: I am in 
network games and recently I started to work in a team dedicated to 
audio for games.

Our preferred codec is Speex, mainly for its quality/bandwidth ratio and 
of course because it is open source.

You DSP guys would probably curse us if you saw what kind of ugly things 
we do to compressed audio flows. One of them is violently switching 
a flow from one machine to another. I guess it is not a nice thing to do, 
and it usually ends up with weird audio artefacts that sound a bit like a 
submarine ping. (By the way, we have really good, genuine reasons 
to want to do that :-)

To be honest, I do not yet have much of a clue about how Speex works, so 
please stop me right away if I say something stupid. The encoder and 
decoder are based on linear prediction, and the information sent is 
somehow related to the errors in these predictions. More importantly, 
the decoder is a state machine, which has a state and is "in tune" with 
its encoder. Forwarding the Speex flow to a decoder with an 
uninitialised (or not "in tune") state creates the audio artefacts I was 
talking about.
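
To make the "in tune" idea concrete, here is a minimal sketch of a toy 
first-order predictive codec. It is NOT Speex and uses no Speex code: 
the coefficient A, the encode() function and the ToyDecoder class are 
all hypothetical names invented for illustration, and the whole decoder 
state is a single number. It only shows how a decoder that joins the 
flow with a fresh state drifts away from one that followed it from the 
start.

```python
# Toy first-order predictive codec (NOT Speex): the encoder sends only the
# prediction error, so the decoder must carry the same memory as the encoder.
A = 0.9  # hypothetical predictor coefficient, chosen for illustration


def encode(samples):
    """Return the prediction errors for a list of samples."""
    prev = 0.0
    residuals = []
    for x in samples:
        residuals.append(x - A * prev)  # error of the prediction A * prev
        prev = x  # no quantisation here, so the reconstruction equals x
    return residuals


class ToyDecoder:
    def __init__(self):
        self.prev = 0.0  # the entire decoder state in this toy model

    def decode(self, residual):
        self.prev = A * self.prev + residual
        return self.prev


samples = [100.0, 120.0, 90.0, 60.0, 80.0, 110.0, 95.0, 70.0]
stream = encode(samples)

# Decoder that has followed the flow from the start: it stays in tune.
in_tune = ToyDecoder()
good = [in_tune.decode(r) for r in stream]

# Fresh decoder on the "other machine", taking over at packet 4
# with an uninitialised state.
switch = 4
fresh = ToyDecoder()
bad = [fresh.decode(r) for r in stream[switch:]]

# Gap between what the fresh decoder outputs and what it should output.
errors = [abs(b - g) for b, g in zip(bad, good[switch:])]
print(errors)
```

In this toy model the gap shrinks by a factor of A with every packet, 
which is one way to picture an artefact that is audible at the switch 
and then fades. Whether and how fast a real Speex decoder recovers is a 
separate question that this sketch does not answer.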

If what I say above is roughly correct, copying the state of the first 
decoder into the new decoder would solve the problem. So maybe I could 
push a decoder state to another machine across the network.
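
As a rough sketch of that idea, the toy model below (again not Speex; 
ToyDecoder, export_state() and from_state() are hypothetical names, and 
no such functions are claimed to exist in the Speex API) packs the 
decoder state into bytes, rebuilds a decoder from them, and checks that 
the hand-over is inaudible in the sense that the output matches a 
decoder that never switched. The assumption is that the state is 
captured exactly between two packets.

```python
import struct

A = 0.9  # hypothetical predictor coefficient, chosen for illustration


def encode(samples):
    """Toy encoder: send only the error of the prediction A * prev."""
    prev = 0.0
    residuals = []
    for x in samples:
        residuals.append(x - A * prev)
        prev = x
    return residuals


class ToyDecoder:
    def __init__(self, prev=0.0):
        self.prev = prev  # the entire decoder state in this toy model

    def decode(self, residual):
        self.prev = A * self.prev + residual
        return self.prev

    def export_state(self):
        """Pack the state into bytes that could be sent over the network."""
        return struct.pack("<d", self.prev)

    @classmethod
    def from_state(cls, blob):
        """Rebuild a decoder from bytes produced by export_state()."""
        (prev,) = struct.unpack("<d", blob)
        return cls(prev)


samples = [100.0, 120.0, 90.0, 60.0, 80.0, 110.0, 95.0, 70.0]
stream = encode(samples)
switch = 4

# Reference: one decoder handles the whole flow.
ref = ToyDecoder()
reference = [ref.decode(r) for r in stream]

# Machine 1 decodes packets 0..3, then exports its state.
old = ToyDecoder()
head = [old.decode(r) for r in stream[:switch]]
blob = old.export_state()

# Machine 2 restores the state before decoding packet 4 onwards.
new = ToyDecoder.from_state(blob)
tail = [new.decode(r) for r in stream[switch:]]

print(len(blob), head + tail == reference)
```

Here the state is a single 8-byte double; a real codec state would be 
larger, so the size and timing concerns still apply.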

If I can do that, I will also have to consider the size of the state to 
be sent. It may be too big to be sent in a short burst, and anyway 
chances are that we will not be able to reconstruct the state N (the 
effective state on the original decoder after packet n and before packet 
n+1) on the new decoder precisely before packet n+1 arrives on that 
machine. And even if we can, this would still slightly interrupt the 
audio.

But hey, it already works (kind of) without sending the state across at 
all, so maybe the whole state is not needed and only a part of it could 
be sent, at the cost of some audio artefacts. The matter here is to 
balance the annoyance caused by the interruption of the audio flow, if I 
send the whole state, against the annoyance of the audio artefacts, if I 
don't synchronise any state.

I am ready to dive into the Speex source code to do that, but I am sure 
I could use some of your thoughts on the problem. Also, I would 
appreciate it if somebody could point me towards the structure(s) in the 
source code in which the decoder state(s) are stored, as it would save me 
quite a bit of time.

Thank you for reading this far :-)

Jeremy Brun
PhD Candidate
University of Wollongong, Australia

PS: I would have searched the archive of the mailing list a bit if only 
there were a tool for it. If you remember a related topic already 
discussed there, tell me so I can have a read without asking stupid 
questions.