[Speex-dev] Who is using the jitter buffer?

Mon Mar 20 16:24:08 PST 2006

Jean-Marc Valin <Jean-Marc.Valin at USherbrooke.ca> wrote:
> 
> > I would think you might also do better if you interleaved packets when 
> > you did this: instead of sending packets like this: [0,1] [2,3] [4,5] 
> > [6,7], send them like this: [0,2] [1,3] [4,6] [5,7] In this way, if one 
> > packet is dropped you don't lose two consecutive voice frames.
> 
> No, but you make the latency even worse. I think 01, 23, 45, ... is
> still probably best.

How about a compromise then: use 01, 23, 45, ... most of the time, but 
switch to 02, 13, 46, 57, ... when single packets are getting dropped on 
a regular basis.  This would only pay off, though, if:

1) Isolated single-packet drops happen frequently enough
2) [0,2] [lost] [4,6] really sounds much better than [0,1] [lost] [4,5]
3) You're putting a minimal number of frames into each packet (like 2)
4) You have enough control of your protocol to implement such a system
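
To make the comparison in (2) concrete, here is a rough sketch of the two 
packing orders and the longest run of consecutive frames lost when one 
packet is dropped.  The function names (pack_sequential, pack_interleaved, 
longest_gap) are made up for illustration and are not part of Speex or of 
any existing protocol; frames are represented only by their index.

```python
def pack_sequential(frames, per_packet=2):
    """Pack frames in order: [0,1] [2,3] [4,5] ..."""
    return [frames[i:i + per_packet] for i in range(0, len(frames), per_packet)]


def pack_interleaved(frames, per_packet=2):
    """Interleave within blocks of per_packet * per_packet frames.

    With per_packet=2 this gives [0,2] [1,3] [4,6] [5,7] ...
    """
    block = per_packet * per_packet
    packets = []
    for start in range(0, len(frames), block):
        chunk = frames[start:start + block]
        for offset in range(per_packet):
            packet = chunk[offset::per_packet]
            if packet:  # skip empty packets at a short final block
                packets.append(packet)
    return packets


def longest_gap(packets, lost_index):
    """Longest run of consecutive frame indices lost with one dropped packet."""
    longest = run = 0
    previous = None
    for frame in sorted(packets[lost_index]):
        if previous is not None and frame == previous + 1:
            run += 1
        else:
            run = 1
        longest = max(longest, run)
        previous = frame
    return longest


frames = list(range(8))
sequential = pack_sequential(frames)    # [[0, 1], [2, 3], [4, 5], [6, 7]]
interleaved = pack_interleaved(frames)  # [[0, 2], [1, 3], [4, 6], [5, 7]]

# Drop the second packet in each scheme and compare the gaps.
print(longest_gap(sequential, 1))   # frames 2 and 3 lost: one 2-frame gap
print(longest_gap(interleaved, 1))  # frames 1 and 3 lost: two 1-frame gaps
```

The same number of frames is lost either way; interleaving only splits 
one two-frame gap into two one-frame gaps.  The cost is that the receiver 
cannot play frame 1 until the second packet arrives, which is the extra 
latency Jean-Marc mentions.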

Still, it sounds like an interesting idea.  I like the idea of a system 
that adapts to network conditions in a smart way.  (This is the reason 
we use a jitter buffer, right?)  What I'm worried about is the code 
complexity of such a system becoming difficult to manage as more 
features like this get combined.
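
For what it's worth, the switching decision itself could stay fairly 
small.  Below is a minimal sketch, assuming the sender somehow learns 
which recent packets arrived (the feedback mechanism is not shown and 
would depend on the protocol).  The PackingSelector class, the window 
size, and the threshold are all hypothetical values picked for 
illustration, not tuned numbers.

```python
from collections import deque


class PackingSelector:
    """Pick a packing order from recent isolated single-packet losses."""

    def __init__(self, window=50, threshold=0.05):
        # window and threshold are illustrative assumptions, not tuned values.
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def record(self, received):
        """Record whether the next packet in sequence arrived."""
        self.history.append(bool(received))

    def isolated_loss_rate(self):
        """Fraction of recent packets lost with both neighbours received."""
        h = list(self.history)
        if len(h) < 3:
            return 0.0
        isolated = sum(
            1
            for i in range(1, len(h) - 1)
            if not h[i] and h[i - 1] and h[i + 1]
        )
        return isolated / len(h)

    def mode(self):
        """Return 'interleaved' when isolated losses are common enough."""
        if self.isolated_loss_rate() >= self.threshold:
            return "interleaved"
        return "sequential"


selector = PackingSelector(window=20, threshold=0.1)
for i in range(20):
    selector.record(i % 5 != 2)  # every fifth packet lost, never two in a row
print(selector.mode())
```

Burst losses are deliberately ignored here, since interleaving pairs of 
frames does nothing for a run of several dropped packets.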

I almost wanted to say that even perfect correction of 40 ms gaps (two 
20 ms frames) isn't that useful, since humans can deal with these gaps 
on their own (even without interpolation) and not miss what was said.  
But it's a matter of smoothing over an annoyance, which could very well 
be worthwhile.  And maybe it could help a little if the gap happened to 
be adjacent to a larger packet loss event that caused bigger problems.

Tom