[Speex-dev] Prebuffering best practices

David Barrett dbarrett at quinthar.com
Tue Jun 14 22:30:33 PDT 2005

Ok, this is a silly question, but what does the jitter buffer do?  I'm 
really new to audio, so please bear with me.

From what I gather (primarily from the list archive), the jitter buffer 
is a wrapper around the Speex decoder.  I give it the packets I receive, 
in whatever order I receive them, and then it gives me back a clean 
stream of audio samples.  But what I don't entirely understand is how 
this is different from just working with the decoder directly.

Right now, I dump my RTP packets direct into the Speex decoder, and then 
queue the output for playback.  This works reasonably well.

However, it doesn't accommodate dropped packets well.  If I drop samples 
10-20, I'll just queue 0-10 and then 20-30 immediately after, which 
isn't great.  I think I read that the jitter buffer will fabricate a fake 
replacement for the missing samples 10-20, thus improving the quality of 
playback.  Is this correct?
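
To make the dropped-packet case concrete, here is a minimal sketch of gap detection using RTP sequence numbers.  It is written in Python for brevity and is not the Speex jitter buffer itself; the function name and the max_gap threshold are hypothetical.  A lost frame is marked with None so the caller can ask the decoder to conceal it (in the Speex C API, as far as I understand it, that means calling speex_decode() with a NULL bits pointer) instead of queuing the next frame immediately.

```python
def frames_with_gaps(packets, max_gap=50):
    """Yield (seq, payload) in order; payload is None for a lost frame.

    `packets` is an iterable of (seq, payload) in arrival order, where seq
    is the 16-bit RTP sequence number.  `max_gap` is an assumed limit:
    anything further ahead than this is treated as late or duplicate.
    """
    expected = None
    for seq, payload in packets:
        if expected is not None:
            missing = (seq - expected) % 65536  # handles 16-bit wrap-around
            if missing >= max_gap:
                continue  # late or duplicate packet: drop it in this sketch
            for i in range(missing):
                yield ((expected + i) % 65536, None)  # placeholder for a lost frame
        yield (seq, payload)
        expected = (seq + 1) % 65536


if __name__ == "__main__":
    arrived = [(65534, b"a"), (65535, b"b"), (1, b"c")]
    for seq, payload in frames_with_gaps(arrived):
        print(seq, "LOST" if payload is None else payload)
```

Each None then stands in for one frame duration of concealed audio, so the playback timeline stays aligned with the sender's.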

But what else does it do?  I see mention of "clock skew", but I don't 
know what that means in this context.  What am I missing?  Most 
importantly, what does it have to do with jitter, and how can I use it 
to solve my problems?  Specifically:

1) Assuming lossless, in-order, but highly irregular delivery of packets 
(as I'm witnessing), what advantage does the jitter buffer offer over 
going straight to the Speex decoder?

2) Assuming samples arrive at an average rate of 22 kHz, but arrive in a 
highly irregular fashion, is there any way to ensure regular playback 
other than to just wait some "prebuffer" duration before beginning 
playback?  How do I pick the smallest prebuffer duration to accommodate a 
given connection's jitter?

3) Assuming I want to deliver samples at a rate of 22 kHz, what's the 
best granularity at which to encode and broadcast?  Granted, I need to 
stay beneath the MTU.  But should I be going for the largest granularity 
that fits under the MTU, or should I be going for the smallest 
granularity that my CPU can churn out?
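
For question 2, one way to quantify "jitter" is the interarrival jitter estimate defined for RTP in RFC 3550: a running average of how much the transit time changes from one packet to the next.  The sketch below (Python, for brevity) implements that estimator.  The prebuffer_ms() helper, its multiplier of 4 and its 20 ms floor are assumptions chosen for illustration, not something taken from Speex or from the RFC.

```python
class JitterEstimator:
    """Running interarrival jitter estimate in the style of RFC 3550."""

    def __init__(self):
        self.jitter = 0.0
        self._prev_transit = None

    def update(self, send_ms, recv_ms):
        """Feed one packet's send timestamp and local arrival time (ms)."""
        transit = recv_ms - send_ms  # includes an unknown but constant clock offset
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)  # offset cancels in the difference
            self.jitter += (d - self.jitter) / 16.0  # 1/16 gain, as in RFC 3550
        self._prev_transit = transit
        return self.jitter


def prebuffer_ms(jitter_ms, frame_ms=20.0, multiplier=4.0):
    """Hypothetical policy: a few multiples of jitter, at least one frame."""
    return max(frame_ms, multiplier * jitter_ms)


if __name__ == "__main__":
    est = JitterEstimator()
    for send, recv in [(0, 50), (20, 70), (40, 106), (60, 112)]:
        j = est.update(send, recv)
        print(f"jitter={j:.2f} ms  prebuffer={prebuffer_ms(j):.1f} ms")
```

Because only differences in transit time are used, the sender and receiver clocks do not need to be synchronized.  The multiplier trades latency against the chance of an underrun and would need tuning against real traffic.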



Jean-Marc Valin wrote:
> Have you looked at the Speex (adaptive) jitter buffer? See
> speex_jitter.h
> 	Jean-Marc
> On Tuesday, 14 June 2005 at 17:50 -0700, David Barrett wrote:
>>What is the best way to pick a prebuffering length for a streaming audio 
>>application using UDP transport?
>>I'm using Speex in a VoIP application with RTP transport, currently with 
>>a fixed 500ms prebuffer on the playback side.  However, I'd like 
>>something a bit more adaptive to accommodate high-jitter connections.
>>For example, in one test configuration there is a very low average 
>>round-trip latency (50ms), but it spikes all over the place (sometimes 
>>10ms, sometimes 500ms).  Thus I can't make my prebuffer duration 
>>proportional to latency; it needs to be proportional to "jitter" 
>>somehow.  But I'm not sure of the best way to quantify this, nor how to 
>>transform that into a reasonable prebuffer length.
>>Thus I'm curious what experience you've had in this area, and what you 
>>can recommend as a good way to adaptively compute a prebuffer duration. 
>>  Thanks!
