[Speex-dev] Stream Synchronization for Echo Cancellation

Jean-Marc Valin jean-marc.valin at usherbrooke.ca
Wed Nov 1 14:59:31 PST 2006

>> Actually, the jitter buffer in Speex tends to cope relatively well with
>> non-synchronised clocks.
> Can you explain why?
> My problem is not at all related to local input/output non-synchronised
> clocks: my problem is really between non-synchronised clock between one
> PC and another...

What happens is that my jitter buffer is designed without any explicit
clock. The pace at which you get data from it is assumed to be the local
clock. The jitter buffer is designed to buffer just enough packets to
prevent most packets from arriving late. Now consider steady-state
conditions (jitter not changing). In that case:
1) If the packets arrive too fast (remote clock is faster), then it will
discard packets once in a while to maintain the optimal buffer size.
2) If the packets arrive too slowly (remote clock is slower),
interpolation will happen because the buffer becomes too small.

Note that there is no explicit buffer size: cases 1) and 2) above
are determined only from the histogram of packet arrival times. I
don't even attempt to know whether a) the network delay is changing or
b) the clocks are drifting. It would be impossible anyway without an
accurate clock.
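
To illustrate the behaviour described above, here is a minimal toy sketch. It is not the actual Speex jitter buffer code: the class `ToyJitterBuffer`, the helper `run`, the window of 50 packets, and the 20 ms local frame period are all assumptions made for this example. Instead of a full arrival-time histogram it uses a simplified stand-in, the minimum arrival margin over a recent window. The buffer has no clock of its own; it measures time only by counting calls to `get()`.

```python
from collections import deque


class ToyJitterBuffer:
    """Clockless toy buffer: time is counted only in get() calls (local ticks)."""

    def __init__(self, window=50):
        self.arrived = {}                    # seq -> local tick at which it arrived
        self.tick = 0                        # number of get() calls so far
        self.next_seq = 0                    # next packet to play
        self.margins = deque(maxlen=window)  # recent arrival margins, in ticks
        self.dropped = 0
        self.interpolated = 0

    def put(self, seq):
        if seq >= self.next_seq:             # ignore packets we already skipped
            self.arrived[seq] = self.tick

    def get(self):
        self.tick += 1
        if self.next_seq not in self.arrived:
            # Packet is late: conceal this frame and keep waiting for it,
            # which increases the effective delay by one frame.
            self.interpolated += 1
            self.margins.clear()
            return None
        seq = self.next_seq
        margin = self.tick - self.arrived.pop(seq)  # 1 means "just in time"
        self.margins.append(margin)
        self.next_seq += 1
        if len(self.margins) == self.margins.maxlen and min(self.margins) >= 2:
            # Every recent packet had a spare tick: skip one packet to cut delay.
            self.arrived.pop(self.next_seq, None)
            self.next_seq += 1
            self.dropped += 1
            self.margins.clear()
        return seq


def run(remote_period_us, local_period_us=20000, ticks=5000):
    """Feed packets sent every remote_period_us; play one per local tick."""
    jb = ToyJitterBuffer()
    sent = 0
    for k in range(ticks):
        now = k * local_period_us
        while sent * remote_period_us <= now:
            jb.put(sent)
            sent += 1
        jb.get()
    return jb.dropped, jb.interpolated


if __name__ == "__main__":
    for period in (19800, 20000, 20200):  # remote 1% fast, equal, 1% slow
        print(period, run(period))
```

With a faster remote clock the toy only ever drops packets, and with a slower one it only ever interpolates. In neither case does it need to know whether the cause is clock drift or a change in network delay.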

>> The only thing that really doesn't like it is the echo canceller.
> In my above case, if I regularly add 10 extra packets to the incoming
> stream (the one that misses 10 packets), the echo canceller works
> perfectly.

Well, if your local capture clock and your local playback clock are
synchronised it's fine. The AEC doesn't care if the remote clock is

> I was just trying to comment on the paper you linked to: My opinion is
> that the problem doesn't only come from local hardware (where
> non-synchronised clocks lead to problems with AEC). There are other
> problems with different clocks on 2 remote machines (where the lack of
> synchronisation does not lead to an AEC issue, but leads to missing data
> (sometimes no data is played) or too much data (the application has to
> discard packets, or else the voice delay keeps growing because a buffer
> is growing)).

As long as the clock drift is small (typically <1%), I don't see any
problem from a jitter buffer point of view.
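
To put the 1% figure in perspective: a relative drift d makes the two clocks slip by one whole frame every 1/d frames. The small sketch below does that arithmetic. The function names are made up for this example, and the 20 ms frame duration is an assumption, since the frame size is not stated in this thread.

```python
def frames_per_slip(drift):
    """Frames played before a relative clock drift adds up to one whole frame."""
    if drift <= 0:
        raise ValueError("drift must be a positive fraction, e.g. 0.01 for 1%")
    return 1.0 / drift


def seconds_per_slip(drift, frame_ms=20.0):
    """Same interval expressed in seconds, for a given frame duration."""
    return frames_per_slip(drift) * frame_ms / 1000.0
```

Under these assumptions, 1% drift means about one dropped or interpolated frame every 100 frames, i.e. roughly every 2 seconds of audio.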

> The only way would be to extend or reduce frames: so my question was:
> has anybody here ever tried this in real time on audio streaming?
> Any simple idea to do this?

You can try the jitter buffer I have. The only thing that I need to
improve is to wait for silence periods before adding/removing packets.
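
One possible way to approach the "wait for silence" idea, sketched here purely as an illustration: it is not part of Speex, and the helper names, the mean-square energy measure, the threshold, and the `max_wait` fallback are all assumptions. Once a correction is pending, look at the upcoming frames and apply it at the first quiet one, giving up after a bounded wait.

```python
def frame_energy(samples):
    """Mean squared amplitude of one (non-empty) frame of PCM samples."""
    return sum(s * s for s in samples) / len(samples)


def pick_adjustment_frame(frames, threshold, max_wait):
    """Index of the frame at which to add/remove a packet.

    Returns the first frame whose energy is below `threshold`; if none is
    found within the first `max_wait` frames, falls back to the last frame
    examined so that the correction is not postponed forever.
    """
    if not frames or max_wait < 1:
        raise ValueError("need at least one frame and max_wait >= 1")
    candidates = frames[:max_wait]
    for i, frame in enumerate(candidates):
        if frame_energy(frame) < threshold:
            return i
    return len(candidates) - 1
```

The bounded wait matters because, with continuous speech or background noise, a quiet frame may not come before the buffer over- or underflows.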

