[Speex-dev] Prebuffering best practices

Wed Jun 22 16:14:47 PDT 2005

The Speex jitter buffer should do what you need automatically.
Unfortunately, I haven't had time to document it in the manual. It's not
hard to use, though.

	Jean-Marc
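
For readers wondering what an adaptive buffer has to estimate, here is a minimal sketch. It is not the Speex jitter buffer's actual algorithm. It keeps a running interarrival-jitter estimate in the style of RFC 3550 (section 6.4.1) and turns it into a prebuffer length. The names JitterEstimator and prebuffer_ms, the safety factor of 4, and the 40 ms / 500 ms limits are all assumptions chosen for illustration.

```python
import math


class JitterEstimator:
    """Running interarrival-jitter estimate, RFC 3550 style (hypothetical helper)."""

    def __init__(self, gain=1.0 / 16.0):
        self.gain = gain            # smoothing gain; RFC 3550 uses 1/16
        self.jitter_ms = 0.0
        self._prev_transit = None

    def update(self, send_ms, recv_ms):
        # Transit time mixes sender and receiver clocks, but only the
        # difference between consecutive transits is used, so a constant
        # clock offset cancels out.
        transit = recv_ms - send_ms
        if self._prev_transit is not None:
            d = abs(transit - self._prev_transit)
            self.jitter_ms += self.gain * (d - self.jitter_ms)
        self._prev_transit = transit
        return self.jitter_ms


def prebuffer_ms(jitter_ms, frame_ms=20.0, safety=4.0,
                 floor_ms=40.0, ceil_ms=500.0):
    """Map a jitter estimate to a prebuffer length (assumed heuristic)."""
    # Assumption: buffer a few multiples of the jitter estimate, rounded up
    # to whole frames, and clamp to a sane range.
    frames = math.ceil(safety * jitter_ms / frame_ms)
    return min(ceil_ms, max(floor_ms, frames * frame_ms))


est = JitterEstimator()
est.update(send_ms=0, recv_ms=10)     # first packet: no estimate yet
est.update(send_ms=20, recv_ms=130)   # transit jumped from 10 ms to 110 ms
print(est.jitter_ms, prebuffer_ms(est.jitter_ms))
```

Because the estimate tracks variation in delay rather than delay itself, a link with low average latency and large spikes still produces a large prebuffer.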

On Tuesday, 14 June 2005 at 23:34 -0700, David Barrett wrote:
> Ah, I'm sorry, I have read the manual and believe I have a reasonably 
> good grasp of how to use the Speex encoder and decoder together.  In 
> fact I've been using it with great success in my P2P SIP/RTP VoIP 
> application for almost a year now; it's been working wonderfully and I 
> can't thank you enough.
> 
> However, the manual makes no mention of the jitter buffer, nor does it 
> (so far as I can tell) address the questions I've raised.  The list 
> archive has been more helpful in this regard, but I still have holes in 
> my understanding.
> 
> Specifically, I'm trying to refine my working system to work even better 
> over high-jitter connections.  I'm eager and open to using the jitter 
> buffer as you suggest, and I see how it can improve playback quality in 
> high packet-loss situations considerably, but I haven't yet wrapped my 
> head around what benefit it offers in reliable, high-jitter environments.
> 
> So far as I can tell, the only solution to jittery transport is an 
> adequate prebuffer, and thus I'm looking for advice on how to determine 
> what "adequate" means.
> 
> Likewise, I can easily broadcast anywhere from 33ms to 500ms audio 
> packets (I currently use 50ms), but I'd like to hear your real-world 
> advice on the ideal packet size to use.
> 
> Thanks for all your help!
> 
> -david
> 
> Jean-Marc Valin wrote:
> > I strongly suggest you start by reading the Speex manual (you can skip
> > the technical parts about CELP). If you still ask questions, then post
> > them.
> > 
> > 	Jean-Marc
> > 
> > On Tuesday, 14 June 2005 at 22:30 -0700, David Barrett wrote:
> > 
> >>Ok, this is a silly question, but what does the jitter buffer do?  I'm 
> >>really new to audio, so please bear with me.
> >>
> >>From what I gather (primarily from the list archive), the jitter buffer 
> >>is a wrapper around the Speex decoder.  I give it the packets I receive, 
> >>in whatever order I receive them, and then it gives me back a clean 
> >>stream of audio samples.  But what I don't entirely understand is how 
> >>this is different from just working with the decoder directly.
> >>
> >>Right now, I dump my RTP packets direct into the Speex decoder, and then 
> >>queue the output for playback.  This works reasonably well.
> >>
> >>However, it doesn't accommodate dropped packets well.  If I drop samples 
> >>10-20, I'll just queue 0-10 and then 20-30 immediately after, which 
> >>isn't great.  I think I read that the jitter buffer will fabricate a 
> >>replacement for the missing samples 10-20, thus improving the quality of 
> >>playback.  Is this correct?
> >>
> >>But what else does it do?  I see mention of "clock skew", but I don't 
> >>know what that means in this context.  What am I missing?  Most 
> >>importantly, what does it have to do with jitter, and how can I use it 
> >>to solve my problems?  Specifically:
> >>
> >>1) Assuming lossless, in-order, but highly irregular delivery of packets 
> >>(as I'm witnessing), what advantage does the jitter buffer offer over 
> >>going straight to the Speex decoder?
> >>
> >>2) Assuming samples arrive at an average rate of 22 kHz, but arrive in a 
> >>highly irregular fashion, is there any way to ensure regular playback 
> >>other than to just wait some "prebuffer" duration before beginning 
> >>playback?  How do I pick the smallest prebuffer duration to accommodate a 
> >>given connection's jitter?
> >>
> >>3) Assuming I want to deliver samples at a rate of 22 kHz, what's the 
> >>best granularity at which to encode and broadcast?  Granted, I need to 
> >>stay beneath the MTU.  But should I be going for the largest granularity 
> >>that fits under the MTU, or should I be going for the smallest 
> >>granularity that my CPU can churn out?
> >>
> >>
> >>Thanks!
> >>
> >>-david
> >>
> >>
> >>Jean-Marc Valin wrote:
> >>
> >>>Have you looked at the Speex (adaptive) jitter buffer? See
> >>>speex_jitter.h
> >>>
> >>>	Jean-Marc
> >>> 
> >>>On Tuesday, 14 June 2005 at 17:50 -0700, David Barrett wrote:
> >>>
> >>>
> >>>>What is the best way to pick a prebuffering length for a streaming audio 
> >>>>application using UDP transport?
> >>>>
> >>>>I'm using Speex in a VoIP application with RTP transport, currently with 
> >>>>a fixed 500ms prebuffer on the playback side.  However, I'd like 
> >>>>something a bit more adaptive to accommodate high-jitter connections.
> >>>>
> >>>>For example, in one test configuration there is a very low average 
> >>>>round-trip latency (50ms), but it spikes all over the place (sometimes 
> >>>>10ms, sometimes 500ms).  Thus I can't make my prebuffer duration 
> >>>>proportional to latency; it has to be somehow proportional to 
> >>>>"jitter".  But I'm not sure of the best way to quantify this, nor how 
> >>>>to transform that into a reasonable prebuffer length.
> >>>>
> >>>>Thus I'm curious what experience you've had in this area, and what you 
> >>>>can recommend as a good way to adaptively compute a prebuffer duration.
> >>>>Thanks!
> >>>>
> >>>>-david
> >>>>_______________________________________________
> >>>>Speex-dev mailing list
> >>>>Speex-dev at xiph.org
> >>>>http://lists.xiph.org/mailman/listinfo/speex-dev
> >>>>
> >>
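
On the packet-size question raised in the thread, the trade-off can be made concrete with some arithmetic. The sketch below assumes plain IPv4 + UDP + RTP headers (20 + 8 + 12 = 40 bytes, no options or extensions) and a hypothetical 16 kbit/s codec bitrate; packet_stats is an illustrative helper, not part of Speex.

```python
# Fixed per-packet header cost: IPv4 (20) + UDP (8) + RTP (12) bytes,
# assuming no IP options, no RTP extensions, no CSRC entries.
HEADER_BYTES = 20 + 8 + 12


def packet_stats(packet_ms, codec_bps):
    """Bandwidth cost of sending packet_ms of audio per packet (illustrative)."""
    packets_per_s = 1000.0 / packet_ms
    payload_bytes = codec_bps / 8.0 * packet_ms / 1000.0
    overhead_bps = HEADER_BYTES * 8 * packets_per_s
    total_bps = codec_bps + overhead_bps
    return {
        "payload_bytes": payload_bytes,
        "overhead_bps": overhead_bps,
        "total_bps": total_bps,
        "overhead_fraction": overhead_bps / total_bps,
    }


# Compare a few packet durations at an assumed 16 kbit/s codec rate.
for ms in (20, 50, 100):
    print(ms, packet_stats(ms, codec_bps=16000))
```

Larger packets reduce header overhead, but each one adds that much latency before it can be sent, and each lost packet removes a longer stretch of audio.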