[Speex-dev] Stream Synchronization for Echo Cancellation

Wed Nov 1 07:51:29 PST 2006

> In those cases, when you get let's say 1000 packets of 20ms from the mic
> you may have only 990 packets of 20ms from RTP incoming stream.
> 
> Thus, before sending the outgoing mic/RTP stream, you would wait for 1000
> incoming packets, where the last packet in fact arrives 10*20ms = 200ms
> after it was supposed to. In my experience, I have already seen 4s
> of clock deviation each minute between a USB headset and another
> sound card....
> 
> In this case, synchronisation is a nightmare. It seems to be a similar
> issue to the one described in your link, but the difference is really
> unpredictable and the resolution does not seem as simple...
> 
> Does anybody wish to share experience on this?

Actually, the jitter buffer in Speex tends to cope relatively well with
non-synchronised clocks. The only thing that really doesn't like it is
the echo canceller. Even a drift of one sample means that the echo
canceller needs to re-adapt. So as soon as the (local) clocks aren't
*perfectly* synchronised, the echo cancellation performance goes down to
a point where it's mostly unusable.
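
To put rough numbers on this, here is a minimal sketch of the arithmetic,
using the 1000 vs. 990 packets of 20ms from the example quoted above. The
8 kHz sampling rate and the 100 ppm figure are assumptions for illustration
only (neither comes from this thread), and the function names are made up
for the sketch; they are not part of the Speex API.

```python
FRAME_MS = 20        # packet duration, from the quoted example
SAMPLE_RATE = 8000   # ASSUMPTION: narrowband rate, not stated in the thread


def drift_ratio(count_a, count_b):
    """Relative clock difference implied by two packet counts
    observed over the same wall-clock interval."""
    return (count_a - count_b) / count_a


def lag_ms(count_a, count_b, frame_ms=FRAME_MS):
    """How far stream B has fallen behind stream A, in milliseconds."""
    return (count_a - count_b) * frame_ms


def seconds_until_one_sample_slip(ratio, sample_rate=SAMPLE_RATE):
    """Time for the two clocks to drift apart by exactly one sample.
    Returns infinity when the clocks are perfectly synchronised."""
    if ratio == 0:
        return float("inf")
    return 1.0 / (abs(ratio) * sample_rate)


if __name__ == "__main__":
    ratio = drift_ratio(1000, 990)
    print("lag after 1000 packets (ms):", lag_ms(1000, 990))
    print("relative drift:", ratio)
    print("seconds per one-sample slip:", seconds_until_one_sample_slip(ratio))
    # ASSUMPTION: 100 ppm as a much smaller, hypothetical mismatch.
    print("seconds per one-sample slip at 100 ppm:",
          seconds_until_one_sample_slip(100e-6))
```

Under these assumptions, even the small hypothetical mismatch slips a whole
sample roughly every second or so, which is why the echo canceller keeps
having to re-adapt while the jitter buffer barely notices.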

	Jean-Marc

> Tks,
> Aymeric MOIZARD / ANTISIP
> amsip - http://www.antisip.com
> osip2 - http://www.osip.org
> eXosip2 - http://savannah.nongnu.org/projects/exosip/
> 
> 
> On Wed, 1 Nov 2006, Tom Grandgent wrote:
> 
>> Isn't this the same problem described starting at the bottom of
>> this page?
>> http://www.embeddedstar.com/articles/2003/7/article20030720-11.html
>>
>> Jean-Marc Valin <jean-marc.valin at usherbrooke.ca> wrote:
>>>
>>>> As it says in 5.4.1 of the good book "Using a different soundcard to do
>>>> the capture and playback will *not* work, regardless of what you may
>>>> think. The only exception to that is if the two cards can be made to
>>>> have their sampling clock 'locked' on the same clock source."
>>>>
>>>> It seems to me that it should be possible to achieve synchronization
>>>> using some combination of cross-correlation, clock skew estimation, and
>>>> sample interpolation. But there are so many details to consider, I bet
>>>> it would take a long time to get right.
>>>
>>> When you get that to work, please let me know and we'll publish some
>>> papers about it. Until then, your best hope is in echo *suppression*
>>> (i.e. frequency-dependent gain), although even that could be a bit
>>> tricky.
>>>
>>>     Jean-Marc
>>> _______________________________________________
>>> Speex-dev mailing list
>>> Speex-dev at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/speex-dev