[Speex-dev] Prebuffering best practices
David Barrett
dbarrett at quinthar.com
Tue Jun 14 23:34:17 PDT 2005
Ah, I'm sorry, I have read the manual and believe I have a reasonably
good grasp on how to use the Speex encoder and decoder together. In
fact I've been using it with great success in my P2P SIP/RTP VoIP
application for almost a year now; it's been working wonderfully and I
can't thank you enough.
However, the manual makes no mention of the jitter buffer, nor does it
(so far as I can tell) address the questions I've raised. The list
archive has been more helpful in this regard, but I still have holes in
my understanding.
Specifically, I'm trying to refine my working system to work even better
over high-jitter connections. I'm eager and open to using the jitter
buffer as you suggest, and I see how it can improve playback quality in
high packet-loss situations considerably, but I haven't yet wrapped my
head around what benefit it offers in reliable, high-jitter environments.
So far as I can tell, the only solution to jittery transport is an
adequate prebuffer, and thus I'm looking for advice on how to determine
what "adequate" means.
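One common way to quantify "jitter" is the smoothed interarrival jitter estimate defined for RTP in RFC 3550. The sketch below, in Python for brevity, computes that estimate from sender timestamps and local arrival times and scales it into a prebuffer length. The function names, the 4x safety multiplier, and the one-packet floor are illustrative assumptions; none of them come from Speex or from this thread.

```python
def interarrival_jitter(send_ms, recv_ms):
    """Smoothed jitter estimate in milliseconds, in the style of RFC 3550."""
    jitter = 0.0
    for i in range(1, len(send_ms)):
        # Change in transit time between consecutive packets.
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        # Exponential smoothing with gain 1/16, as in RFC 3550.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


def suggest_prebuffer_ms(send_ms, recv_ms, packet_ms=50, safety=4.0):
    """Hypothetical heuristic: a multiple of the jitter estimate.

    ASSUMPTION: the 'safety' multiplier and the one-packet floor are
    illustrative tuning choices, not recommended values.
    """
    jitter = interarrival_jitter(send_ms, recv_ms)
    return max(float(packet_ms), safety * jitter)
```

With a perfectly steady stream the estimate stays at 0 and the one-packet floor applies. A single packet arriving 800ms late lifts the estimate to 50ms, so the suggested prebuffer becomes 200ms under these assumed settings.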
Likewise, I can easily send audio packets anywhere from 33ms to 500ms
long (I currently use 50ms), but I'd like to hear your real-world
advice on the ideal packet size to use.
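Part of the packet-size trade-off is plain arithmetic: every RTP packet carries fixed headers, so shorter packets spend more bandwidth on headers. The sketch below assumes IPv4 with no options and an RTP header with no extensions or CSRC entries (20 + 8 + 12 bytes); the function names and the codec bitrate passed in are hypothetical.

```python
# ASSUMPTION: IPv4 (20 bytes) + UDP (8 bytes) + RTP fixed header (12 bytes),
# with no IP options, no RTP header extension, and no CSRC list.
HEADER_BYTES = 20 + 8 + 12


def header_overhead_bps(packet_ms):
    """Bits per second spent on headers for a given packet duration."""
    packets_per_second = 1000.0 / packet_ms
    return HEADER_BYTES * 8 * packets_per_second


def overhead_fraction(packet_ms, codec_bps):
    """Share of the total stream bitrate that is header overhead."""
    overhead = header_overhead_bps(packet_ms)
    return overhead / (overhead + codec_bps)
```

At the 50ms packets currently in use, these headers cost 6400 bit/s; at 20ms they cost 16000 bit/s. Larger packets cut that overhead, but they add at least one packet's worth of capture delay and lose more audio for each dropped packet.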
Thanks for all your help!
-david
Jean-Marc Valin wrote:
> I strongly suggest you start by reading the Speex manual (you can skip
> the technical parts about CELP). If you still ask questions, then post
> them.
>
> Jean-Marc
>
> Le mardi 14 juin 2005 à 22:30 -0700, David Barrett a écrit :
>
>>Ok, this is a silly question, but what does the jitter buffer do? I'm
>>really new to audio, so please bear with me.
>>
>>From what I gather (primarily from the list archive), the jitter buffer
>>is a wrapper around the Speex decoder. I give it the packets I receive,
>>in whatever order I receive them, and then it gives me back a clean
>>stream of audio samples. But what I don't entirely understand is how
>>this is different from just working with the decoder directly.
>>
>>Right now, I dump my RTP packets direct into the Speex decoder, and then
>>queue the output for playback. This works reasonably well.
>>
>>However, it doesn't accommodate dropped packets well. If I drop samples
>>10-20, I'll just queue 0-10 and then 20-30 immediately after, which
>>isn't great. I think I read that the jitter buffer will fabricate a fake
>>replacement for the missing samples 10-20, thus improving the quality of
>>playback. Is this correct?
>>
>>But what else does it do? I see mention of "clock skew", but I don't
>>know what that means in this context. What am I missing? Most
>>importantly, what does it have to do with jitter, and how can I use it
>>to solve my problems? Specifically:
>>
>>1) Assuming lossless, in-order, but highly irregular delivery of packets
>>(as I'm witnessing), what advantage does the jitter buffer offer over
>>going straight to the Speex decoder?
>>
>>2) Assuming samples arrive at an average rate of 22KHz, but arrive in a
>>highly irregular fashion, is there any way to ensure regular playback
>>other than to just wait some "prebuffer" duration before beginning
>>playback? How do I pick the smallest prebuffer duration to accommodate a
>>given connection's jitter?
>>
>>3) Assuming I want to deliver samples at a rate of 22KHz, what's the
>>best granularity at which to encode and broadcast? Granted, I need to
>>stay beneath the MTU. But should I be going for the largest granularity
>>that fits under the MTU, or should I be going for the smallest
>>granularity that my CPU can churn out?
>>
>>
>>Thanks!
>>
>>-david
>>
>>
>>Jean-Marc Valin wrote:
>>
>>>Have you looked at the Speex (adaptive) jitter buffer? See
>>>speex_jitter.h
>>>
>>> Jean-Marc
>>>
>>>Le mardi 14 juin 2005 à 17:50 -0700, David Barrett a écrit :
>>>
>>>
>>>>What is the best way to pick a prebuffering length for a streaming audio
>>>>application using UDP transport?
>>>>
>>>>I'm using Speex in a VoIP application with RTP transport, currently with
>>>>a fixed 500ms prebuffer on the playback side. However, I'd like
>>>>something a bit more adaptive to accommodate high-jitter connections.
>>>>
>>>>For example, in one test configuration there is a very low average
>>>>round-trip latency (50ms), but it spikes all over the place (sometimes
>>>>10ms, sometimes 500ms). Thus I can't make my prebuffer duration
>>>>proportional to latency; it needs to be proportional to "jitter"
>>>>instead. But I'm not sure of the best way to quantify this, nor how to
>>>>transform that into a reasonable prebuffer length.
>>>>
>>>>Thus I'm curious what experience you've had in this area, and what you
>>>>can recommend as a good way to adaptively compute a prebuffer duration.
>>>> Thanks!
>>>>
>>>>-david
>>>>_______________________________________________
>>>>Speex-dev mailing list
>>>>Speex-dev at xiph.org
>>>>http://lists.xiph.org/mailman/listinfo/speex-dev
>>>>
>>
>>
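The "samples 10-20" scenario in the quoted message (queueing 0-10 and then 20-30 back to back) can be avoided by ordering packets by sequence number and marking the gaps explicitly, so that the playback side knows where a frame is missing. The sketch below shows only that reordering and gap-marking step; playout_order is a hypothetical helper, not part of the Speex API, and a real jitter buffer also has to handle timing, late packets, and clock skew.

```python
def playout_order(packets, first_seq, count):
    """Return (seq, payload) pairs in sequence order.

    'packets' is an iterable of (seq, payload) in arrival order.
    A payload of None marks a packet that never arrived, so the caller
    can ask the decoder to conceal the loss instead of skipping ahead.
    """
    by_seq = {seq: payload for seq, payload in packets}
    return [(seq, by_seq.get(seq)) for seq in range(first_seq, first_seq + count)]
```

For each None entry, the caller would request a concealment frame from the decoder; the Speex decoder documents passing NULL in place of the bit-stream for a lost packet. That way the output keeps its timing and missing audio is replaced instead of being cut out.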