[Speex-dev] Backup Echo Suppression
zmorris at mac.com
Mon Jul 2 23:45:04 PDT 2007
On Jul 2, 2007, at 9:48 PM, Jean-Marc Valin wrote:
> Quoting zmorris at mac.com:
>> But if the echo canceler IS using each frame's timestamp when it's
>> trying to converge, it's almost guaranteed to fail on most operating
>> systems, because the timestamp has such a high variability between
>> frames, and can even sometimes be 0 for the output buffer in this
>> example.
>
> Don't know what you mean about timestamps. The AEC doesn't use/need
> timestamps. But it does require you send the audio in the same order
> you capture/play it.
I just mean that if the echo canceler were using timing information to
try to find the echo, it probably wouldn't work, but if it just works
on each pair of buffers, then it will. It sounds like that is how it
works, which is good to know, thanks.
>> Also, I think that many machines have separate input/output hardware
>> that can suffer from clock drift. I'd really like to see an echo
>> canceler that can work even when input/output frames are fed in with
>> a large random time delta. I should be able to skip the first few
>> input or output frames, and the echo canceller should be able to find
>> out what the time delta is, and know from that point on, it will be
>> relatively constant between any given pair of input/output frames.
>
> This is a lot harder than you may think. Estimating the drift
> accurately enough is highly non-trivial. It's much easier to make
> sure the clocks are in sync (e.g. tell the user to use the same card
> for both).
Unfortunately, I think that with things like GarageBand introducing
the world to electronic music, we're going to see more and more
strange configurations with third party sound cards, which never used
to be an issue on the Mac. As a shareware game designer, I have to
cope with the fact that people only tolerate about 3 seconds of fuss
before they toss my game in the garbage.
So even though it is difficult, I think that echo canceler 2.0 should
be able to tolerate things like multiple audio sources (5.1 etc),
even multiple mics, from multiple sound cards. This is probably
beyond the scope of speex, but it's going to become more of an issue
as people want high fidelity video chat/telepresence that "just
works". I haven't read the manual you suggested, and I haven't even
tried the newest speex beta yet, so maybe all of this isn't needed.
>> The easiest way to do this might be to look at the maximum of the
>> covariance of the input/output, or find the phase offset of the input
>> and output FFTs. Maybe it already does this, and someone can say
>> if so?
>
> If you think it's easy, then I guess I'll be waiting for your patch...
Hah ya blah. One way to explore something like that would be in a
tool like Matlab, which makes brainstorming easy. The covariance
thing is actually only 1 line of code in Matlab, but it can be
computationally intensive. It would be better to use FFTs from
elsewhere in speex if you have them, but I dunno how your underlying
implementation works.
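
To make the covariance idea concrete, here is a rough sketch in plain
Python (not Matlab, and not anything taken from speex). It assumes the
recorded signal is just a delayed, attenuated copy of the played
signal, and it brute-forces every candidate lag, which is exactly why
this approach gets expensive. The name estimate_delay and its
parameters are made up for illustration.

```python
import random


def _remove_mean(x):
    """Subtract the average so the sums below behave like a covariance."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def estimate_delay(played, recorded, max_lag):
    """Return the lag (in samples) where `recorded` best lines up with `played`.

    Hypothetical helper: tries every lag from 0 to max_lag and keeps the one
    with the largest sum of products. Cost is O(len(played) * max_lag).
    """
    n_overlap = min(len(played), len(recorded) - max_lag)
    if n_overlap <= 0:
        raise ValueError("signals too short for the requested max_lag")
    p = _remove_mean(played)
    r = _remove_mean(recorded)
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(p[n] * r[n + lag] for n in range(n_overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: white noise played out, "recorded" 37 samples later at
# half the amplitude. Real echo paths are far messier than this.
rng = random.Random(0)
played = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
recorded = [0.0] * 37 + [0.5 * s for s in played]
print(estimate_delay(played, recorded, max_lag=100))
```

This only recovers one fixed offset. It does nothing about clock drift,
where the offset slowly changes over time, and an FFT-based correlation
would be the cheaper way to do the same search.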
>> P.S. The above situation is almost exactly what happens on my Mac,
>> and would be exacerbated by people with third party sound cards.
>
> You mean Apple can't ship a soundcard that records and plays at the
> same rate? I have a hard time believing that.
It is actually very likely that input and output are on the same card
at 44100 Hz. What's problematic is getting the right sized hardware
buffers. I wrote a whole library to nibble them and give me the 160
samples I need, but I have no idea what is happening under the hood,
or if I can even ask Apple for the same size input and output buffers,
because I am using their classic sound engine, which is archaic by
today's standards. I could probably switch to using Core Audio or
QuickTime to get the right buffers, it's just a lot of work. It
sounds like all of that isn't necessary though, if as you said above,
nibbling won't affect things because the AEC doesn't use timestamps.
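
For anyone curious what the nibbling amounts to, here is a minimal
sketch in Python, assuming "the 160" means 160 samples per frame. It is
not the actual library, and FrameNibbler and paired_frames are
hypothetical names. Each side gets a FIFO that swallows whatever size
buffer the hardware hands over and gives back fixed-size frames in the
original order.

```python
from collections import deque

FRAME_SIZE = 160  # assumed frame length in samples


class FrameNibbler:
    """Accumulate arbitrary-sized buffers and hand out fixed-size frames."""

    def __init__(self, frame_size=FRAME_SIZE):
        self.frame_size = frame_size
        self._fifo = deque()

    def push(self, samples):
        """Append one hardware buffer of any length."""
        self._fifo.extend(samples)

    def available(self):
        """Number of samples currently waiting in the FIFO."""
        return len(self._fifo)

    def pop_frame(self):
        """Return the oldest frame_size samples, or None if not enough yet."""
        if len(self._fifo) < self.frame_size:
            return None
        return [self._fifo.popleft() for _ in range(self.frame_size)]


def paired_frames(mic, spk):
    """Yield (mic_frame, spk_frame) pairs while both sides hold a full frame."""
    while mic.available() >= mic.frame_size and spk.available() >= spk.frame_size:
        yield mic.pop_frame(), spk.pop_frame()


# Example: the two sides deliver differently sized hardware buffers.
mic = FrameNibbler()
spk = FrameNibbler()
mic.push(range(512))
spk.push(range(300))
for mic_frame, spk_frame in paired_frames(mic, spk):
    # This is where a pair of equal-sized frames would go to the canceler.
    print(len(mic_frame), len(spk_frame))
```

Pairing like this only keeps the samples in order on each side. It
does not tell you how far apart in time the two frames of a pair
really were.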
Thanks for the info,
--Zack