[Speex-dev] VAD Questions

Jean-Marc Valin jean-marc.valin at usherbrooke.ca
Thu Jun 7 16:50:14 PDT 2007


> - Is there a reference somewhere (other than the source itself) that
> explains how the latest VAD algorithm works?

Read the source, Luke :-) (sorry)

> - Is it possible to obtain the VAD status of a Speex stream
> asynchronously? The current API seems to imply that some kind of
> polling is required to determine the voice/non-voice status.

I don't understand your question. Also, which VAD are you talking about:
the one in the encoder or the one in the preprocessor?

> - Does the VAD algorithm implement syllabic/sonorant rate detection,
> as has been implemented many times in analog circuitry, and is
> described in this (and other) papers?
> http://people.csail.mit.edu/jrg/2005/IS05_schutte.pdf

As far as I understand, the paper you reference above isn't applicable
to the problem here. Basically, we have to decide whether we have speech
or silence based only on the current 20 ms of audio (and the past). If we
could "look into the future" of the signal, things would be much easier.
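
To make the constraint concrete, here is a minimal sketch of a causal,
frame-by-frame decision. It is NOT the algorithm Speex uses. It is a toy
energy detector, and the class name, the 8 kHz rate, the threshold, the
noise-floor smoothing and the hangover count are all assumptions picked
for illustration. The only point is that each call sees one 20 ms frame
plus state carried over from earlier frames, and nothing from the future.

```python
import math

FRAME_MS = 20                                 # frame length from the discussion above
SAMPLE_RATE = 8000                            # assumed narrowband sampling rate
FRAME_SIZE = SAMPLE_RATE * FRAME_MS // 1000   # 160 samples per frame


class CausalEnergyVad:
    """Toy causal VAD (hypothetical, not Speex's): one decision per frame,
    using only the current frame and state remembered from past frames."""

    def __init__(self, threshold_db=6.0, floor_decay=0.95, hangover_frames=5):
        self.threshold_db = threshold_db        # assumed margin above the noise floor
        self.floor_decay = floor_decay          # assumed smoothing of the noise floor
        self.hangover_frames = hangover_frames  # assumed hold time after speech ends
        self.noise_floor = None                 # running noise energy estimate (the "past")
        self.hangover = 0                       # frames of speech status still to hold

    def process(self, frame):
        """Return True (speech) or False (silence) for one frame of samples."""
        # Mean energy of the current frame; the epsilon avoids log10(0).
        energy = sum(s * s for s in frame) / len(frame) + 1e-9
        if self.noise_floor is None:
            # Assumption: the very first frame contains only background noise.
            self.noise_floor = energy
        snr_db = 10.0 * math.log10(energy / self.noise_floor)
        if snr_db > self.threshold_db:
            self.hangover = self.hangover_frames
            return True
        # Adapt the noise floor only on frames that look like noise.
        self.noise_floor = (self.floor_decay * self.noise_floor
                            + (1.0 - self.floor_decay) * energy)
        if self.hangover > 0:
            # Keep reporting speech briefly so word endings are not clipped.
            self.hangover -= 1
            return True
        return False
```

Calling process() once per captured frame gives the status for that
frame straight away. The hangover only looks backwards, so it costs no
extra delay.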

> - Over what time period is VAD done? Is it done on a frame by frame
> basis or over some longer period?

It *has* to be done frame by frame; otherwise you add latency, which
isn't acceptable.
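
As a rough illustration of the latency cost, the sketch below takes
per-frame speech flags and delays each decision until some number of
future frames has arrived. The function name and the "any frame in the
window is speech" rule are made up for the example. The 20 ms frame
length is the only number taken from the discussion, and the delay shown
counts only the wait for future frames, not any other buffering.

```python
FRAME_MS = 20  # frame length from the discussion above


def decide_with_lookahead(flags, lookahead_frames):
    """Hypothetical helper: decide frame i from flags[i .. i + lookahead_frames].

    Returns a list of (frame_index, decision, delay_ms), where delay_ms is
    how long after frame i the decision for frame i becomes available.
    """
    results = []
    # The last `lookahead_frames` frames never get a decision here, because
    # the future frames they would need do not exist in the input.
    for i in range(len(flags) - lookahead_frames):
        window = flags[i:i + lookahead_frames + 1]
        decision = any(window)              # assumed rule: speech if any frame in window is
        available_at = i + lookahead_frames # index of the last frame we had to wait for
        delay_ms = (available_at - i) * FRAME_MS
        results.append((i, decision, delay_ms))
    return results
```

With lookahead_frames=0 every decision is available as soon as its own
frame is, while lookahead_frames=2 means every decision arrives 40 ms
late.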

	Jean-Marc

