[Speex-dev] How does the jitter buffer "catch up"?

Jean-Marc Valin Jean-Marc.Valin at USherbrooke.ca
Sun Sep 18 19:17:25 PDT 2005

> > Most of it is right. Actually, would you mind if I use part of your
> > email for documenting the jitter buffer in the manual?
> It would be my pleasure :)

Thanks. Whenever I have some time to update the manual I'll put that in.

> >> early_ratio_XX is the sum of all the positive bins.
> >> late_ratio_XX is the sum of all the negative bins.
> >
> > Right. And only the packets that are "just in time" don't get counted in
> > any ratio.
> Well.. they're counted in the ontime_ratio_long and _short, right?

Right. It's there so I know how many late packets I'll have if I drop a
frame.

> One thing that might be worth mentioning: the sum of all the margins will 
> never be higher than 1.0, so a test for early_ratio_short > 0.7 means 
> (roughly) that 70% or more of the packets in the last short-term time 
> period were early.

Note that the sum can be <1 if the buffer had a reset recently.
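
As an illustration of the ratios discussed above, here is a minimal sketch in
Python. It is not the libspeex code: the dictionary layout (margin in frames
mapped to the fraction of recent packets), the function name ratios() and the
sample numbers are all assumptions made for the example. Positive margins are
treated as early, negative as late, and margin 0 as "just in time".

```python
def ratios(margins):
    """Return (early, late, ontime) from a histogram of arrival margins.

    margins: dict mapping margin in frames -> fraction of recent packets.
    This layout is hypothetical and chosen only for illustration.
    """
    early = sum(v for m, v in margins.items() if m > 0)   # positive bins
    late = sum(v for m, v in margins.items() if m < 0)    # negative bins
    ontime = margins.get(0, 0.0)                          # "just in time" bin
    return early, late, ontime


# Made-up sample: 75% early, 5% late, 20% just in time.
margins = {-1: 0.05, 0: 0.20, 1: 0.25, 2: 0.50}
early, late, ontime = ratios(margins)

# A test such as "early ratio > 0.7" then reads as "more than roughly 70%
# of the packets in the window were early".
print(early > 0.7, early, late, ontime)
```

In this sample the three ratios add up to 1.0; after a reset the bins would
hold less than that, so the sum would fall below 1.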

> > I'm not sure here what will happen. Normally, you'd want to make the
> > buffer larger than what you expect to have in it. In that case, the
> > jitter buffer would likely drop frames until it catches up.
> There's a problem with increasing the buffer size, btw: you need to change 
> the header, which means you need to recompile both speex and your 
> application. So changing the maximum number of buffered packets means you 
> can't share libspeex.dll/.so with other applications.

I agree, which is why making the buffer dynamic is on the TODO list.

> Yes, but assume that after a long steady period, your network latency 
> suddenly drops by 100ms. (100ms is excessive, but I see 60ms quite 
> frequently from users on DSL/Cable connections who also do a bit of P2P 
> on the same line)
> What happens now is that the +100ms bin starts increasing steadily,
> and suddenly it's enough to skip a frame.
> A frame is skipped, and the histogram gets shifted.
> On the next call to _get(), it's now the +80ms bin that has that high 
> value, and the ratio is still more than high enough to skip a frame.
> A frame is skipped, and the histogram gets shifted.
> Repeat for +60, +40 and +20. In short, over the time it takes to decode 
> 5 frames, we're also skipping 5 frames, which means you have 100ms of 
> audio that sounds weird.

Yes. And the fix would simply be to wait for silence periods (e.g.
between words) before dropping frames. It's also on the TODO list.
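
The cascade described in the quoted scenario can be reproduced with a toy
model. This is a simplified sketch, not the actual jitter buffer: the
histogram layout, the helper names and the 0.7 threshold (borrowed from the
example test mentioned earlier) are assumptions, frames are taken to be 20ms
so that 100ms equals 5 frames, and the late ratio is ignored entirely.

```python
EARLY_THRESHOLD = 0.7  # assumed value, taken from the example in the thread


def early_ratio(hist):
    """Sum of all bins with a positive margin (margin counted in frames)."""
    return sum(v for m, v in hist.items() if m > 0)


def shift(hist):
    """After skipping one frame, every packet's margin shrinks by one frame."""
    return {m - 1: v for m, v in hist.items()}


def get(hist):
    """One simplified _get() call: skip at most one frame per call."""
    if early_ratio(hist) > EARLY_THRESHOLD:
        return shift(hist), True
    return hist, False


# Latency just dropped by about 100ms: most packets now arrive 5 frames early.
hist = {5: 0.9, 6: 0.1}
skips = []
for _ in range(8):
    hist, skipped = get(hist)
    skips.append(skipped)

print(skips)  # five consecutive skips, then none
print(hist)
```

In this model one frame is skipped on each of five consecutive calls, which
is the 100ms stretch of odd-sounding audio described above.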

> Come to think of it, it might actually be better if it just skipped 5 
> frames at once. Might be doable by shifting the histogram, and if it still 
> meets the criteria, keep skipping and shifting it until it doesn't meet 
> the criteria anymore. More work though, and less clear code.

I can probably do that after the drop during silence.
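
The "keep skipping and shifting until it doesn't meet the criteria" idea
could look roughly like the sketch below. As before, this is only an
illustration under assumed names and data layout (a dict of margin in frames
to packet fraction, a 0.7 early-ratio threshold), not a proposal for the real
code.

```python
def skip_all_at_once(hist, threshold=0.7):
    """Shift the histogram until the early ratio no longer exceeds threshold.

    Returns the shifted histogram and the number of frames to skip in one go.
    The loop always ends: each shift lowers every margin by one frame, so
    eventually no bin has a positive margin left.
    """
    skipped = 0
    while sum(v for m, v in hist.items() if m > 0) > threshold:
        hist = {m - 1: v for m, v in hist.items()}
        skipped += 1
    return hist, skipped


# Same made-up starting point as a sudden 100ms (5 frame) latency drop.
new_hist, frames_to_skip = skip_all_at_once({5: 0.9, 6: 0.1})
print(frames_to_skip, new_hist)
```

The total number of skipped frames is the same as in the one-per-call case;
the difference is that the gap happens in a single call instead of being
spread over several consecutive ones.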


Jean-Marc Valin <Jean-Marc.Valin at USherbrooke.ca>
Université de Sherbrooke
