[Speex-dev] Stream Synchronization for Echo Cancellation

Coffey, Michael mcoffey at avistar.com
Fri Nov 10 19:04:47 PST 2006

Following up on the original topic of synchronization between the local
mic and local speaker streams:

We can separate this problem into two sub-problems: (1) compensating for
differences in sampling rates; and (2) compensating for delay between
the two streams.
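
As a rough illustration of sub-problem (2), here is a minimal sketch that assumes the delay is estimated by brute-force cross-correlation, the technique raised earlier in this thread. The function name estimate_delay and its parameters are hypothetical and are not part of Speex. It also assumes the mic signal is simply a delayed, attenuated copy of the speaker signal, which a real room will not give you.

```python
def estimate_delay(speaker, mic, max_lag):
    """Return the lag (in samples) at which mic best matches speaker.

    Hypothetical helper, not a Speex API. Tries every lag in
    0..max_lag and keeps the one with the largest raw
    (unnormalized) cross-correlation score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Only compare the samples where both signals overlap at this lag.
        n = min(len(speaker), len(mic) - lag)
        score = sum(speaker[i] * mic[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

This only addresses a fixed delay. If the two sampling clocks differ (sub-problem 1), the best lag will move over time, so the estimate would have to be repeated or tracked.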

For estimating the delay, what do you think of the idea of using


-----Original Message-----
From: Jean-Marc Valin [mailto:jean-marc.valin at usherbrooke.ca] 
Sent: Wednesday, November 01, 2006 7:51 AM
To: Aymeric Moizard
Cc: Tom Grandgent; Coffey, Michael; speex-dev at xiph.org
Subject: Re: [Speex-dev] Stream Synchronization for Echo Cancellation

> In those cases, when you get let's say 1000 packets of 20ms from the
> you may have only 990 packets of 20ms from RTP incoming stream.
> Thus, before sending outgoing mic/RTP stream, you would wait for 1000
> incoming packets, where the last packet in fact arrives 10*20ms = 200ms
> after it was supposed to. In my experience, I have already seen 4s
> of clock deviation each minute between a USB headset and another
> sound card....
> In this case, synchronisation is a nightmare. It seems to be a similar
> issue to the one described in your link, but the difference is
> unpredictable and the resolution does not seem as simple...
> Does anybody wish to share experience on this?

Actually, the jitter buffer in Speex tends to cope relatively well with
non-synchronised clocks. The only thing that really doesn't like it is
the echo canceller. Even a drift by one sample means that the echo
canceller needs to re-adapt. So as soon as the (local) clocks aren't
*perfectly* synchronised, the echo cancellation performance goes down to
a point where it's mainly unusable.
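
To put numbers on how quickly a one-sample drift appears, here is a small back-of-the-envelope sketch. It uses the 1000-versus-990 packet figures quoted above. The 8 kHz sampling rate is an assumption made only for illustration, and the function names are hypothetical.

```python
def skew_ratio(packets_a, packets_b):
    """Relative clock skew implied by two packet counts over the same interval."""
    return (packets_a - packets_b) / packets_a


def seconds_until_one_sample_drift(skew, sample_rate_hz):
    """Time for two clocks with the given relative skew to slip by one sample."""
    return 1.0 / (abs(skew) * sample_rate_hz)


# 1000 packets captured while only 990 arrive: a 1% skew.
skew = skew_ratio(1000, 990)

# Assumed 8 kHz sampling rate (illustrative, not stated in the thread).
print(seconds_until_one_sample_drift(skew, 8000))
```

With a 1% skew at the assumed 8 kHz, the streams slip by about 80 samples every second, so a one-sample drift occurs roughly every 12.5 ms.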


> Tks,
> amsip - http://www.antisip.com
> osip2 - http://www.osip.org
> eXosip2 - http://savannah.nongnu.org/projects/exosip/
> On Wed, 1 Nov 2006, Tom Grandgent wrote:
>> Isn't this the same problem described starting at the bottom of
>> this page?
>> http://www.embeddedstar.com/articles/2003/7/article20030720-11.html
>> Jean-Marc Valin <jean-marc.valin at usherbrooke.ca> wrote:
>>>> As it says in 5.4.1 of the good book "Using a different soundcard
>>>> to do the capture and playback will *not* work, regardless of what
>>>> you think. The only exception to that is if the two cards can be
>>>> made to have their sampling clock 'locked' on the same clock
>>>> source."
>>>> It seems to me that it should be possible to achieve
>>>> using some combination of cross-correlation, clock skew estimation,
>>>> sample interpolation. But there are so many details to consider, I
>>>> it would take a long time to get right.
>>> When you get that to work, please let me know and we'll publish some
>>> papers about it. Until then, your best hope is in echo *suppression*
>>> (i.e. frequency-dependent gain), although even that could be a bit
>>> tricky.
>>>     Jean-Marc
>>> _______________________________________________
>>> Speex-dev mailing list
>>> Speex-dev at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/speex-dev