[Speex-dev] VAD with speex_preprocess()
Tom Grandgent
tgrand at canvaslink.com
Tue Mar 8 08:42:53 PST 2005
As I understand it, there are two separate ways to get VAD information
from Speex: 1) Using the encoder. 2) Using speex_preprocess(). I
present the following observations from an application developer's
perspective. They may be wrong, in which case I would appreciate
corrections.
- The two VAD systems are implemented differently.
- speex_preprocess()'s VAD provides more accurate detection than the
encoder's VAD at the cost of more CPU usage.
- speex_preprocess()'s VAD is affected by the AGC and/or denoise state
more directly than the encoder's VAD.
- Possibly as a result of the previous point, speex_preprocess()'s VAD
can get into a bad state, given an input that varies drastically in
amplitude/behavior, after which point its accuracy is ruined and the
only solution is to destroy/recreate the preprocess state.
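
To make the last point concrete, here is a rough sketch of the destroy/recreate workaround. Everything named VadBackend, VadWatchdog, vad_watchdog_* and stub_* is hypothetical and not part of Speex. The three callbacks are meant to wrap speex_preprocess_state_init(), speex_preprocess_state_destroy() and speex_preprocess(); they are function pointers only so the sketch compiles without libspeex. The "bad state" test (the same VAD decision for max_run frames in a row) is purely my assumption, and a real application would probably want something smarter, since a long genuine silence also triggers it.

```c
#include <stddef.h>

/* Hypothetical wrapper, not part of Speex. In a real program the callbacks
 * would wrap speex_preprocess_state_init(), speex_preprocess_state_destroy()
 * and speex_preprocess(). */
typedef struct {
    void *(*create)(int frame_size, int sample_rate);
    void  (*destroy)(void *state);
    int   (*process)(void *state, short *frame);  /* nonzero means speech */
} VadBackend;

typedef struct {
    const VadBackend *backend;
    void *state;
    int frame_size;
    int sample_rate;
    int last_decision;  /* last VAD result, -1 if none yet */
    int run_length;     /* consecutive frames with that result */
    int max_run;        /* assumed threshold for "stuck" */
    int resets;         /* how many times the state was recreated */
} VadWatchdog;

/* Returns 0 on success, -1 if the backend state could not be created. */
int vad_watchdog_init(VadWatchdog *w, const VadBackend *b,
                      int frame_size, int sample_rate, int max_run)
{
    w->backend = b;
    w->frame_size = frame_size;
    w->sample_rate = sample_rate;
    w->last_decision = -1;
    w->run_length = 0;
    w->max_run = max_run;
    w->resets = 0;
    w->state = b->create(frame_size, sample_rate);
    return w->state != NULL ? 0 : -1;
}

/* Returns 1 for speech, 0 for silence, -1 if no backend state is available. */
int vad_watchdog_process(VadWatchdog *w, short *frame)
{
    int is_speech;

    if (w->state == NULL) {
        /* A previous recreate failed; try again before giving up. */
        w->state = w->backend->create(w->frame_size, w->sample_rate);
        if (w->state == NULL)
            return -1;
    }

    is_speech = w->backend->process(w->state, frame) != 0;

    if (is_speech == w->last_decision) {
        w->run_length++;
    } else {
        w->last_decision = is_speech;
        w->run_length = 1;
    }

    /* Assumed heuristic: an unchanged decision for max_run frames means the
     * detector is stuck, so throw the state away and start from scratch. */
    if (w->run_length >= w->max_run) {
        w->backend->destroy(w->state);
        w->state = w->backend->create(w->frame_size, w->sample_rate);
        w->last_decision = -1;
        w->run_length = 0;
        w->resets++;
    }
    return is_speech;
}

void vad_watchdog_free(VadWatchdog *w)
{
    if (w->state != NULL) {
        w->backend->destroy(w->state);
        w->state = NULL;
    }
}

/* Stand-in backend for demonstration only: a fake VAD that always reports
 * speech, imitating a detector stuck in a bad state. */
static int stub_created = 0, stub_destroyed = 0;
static int stub_dummy;

static void *stub_create(int frame_size, int sample_rate)
{
    (void)frame_size; (void)sample_rate;
    stub_created++;
    return &stub_dummy;
}

static void stub_destroy(void *state) { (void)state; stub_destroyed++; }

static int stub_process(void *state, short *frame)
{
    (void)state; (void)frame;
    return 1;
}

static const VadBackend stub_backend = { stub_create, stub_destroy, stub_process };
```

A freshly created state has to adapt to the input again, so max_run should be large enough that resets stay rare.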
Tom
"Paul Gryting" <paul.gryting at teligy.com> wrote:
>
> In speexenc.c, speex_preprocess() is not called unless AGC or denoise is
> enabled.
> If only VAD is enabled, it does not get called.
>
> speex_preprocess() has vad_enabled-specific code to detect voice activity:
> speex_preprocess()
> {
>    ...
>    ...
>    if (st->vad_enabled)
>       is_speech = speex_compute_vad(st, ps, mean_prior, mean_post);
>
>    ...
>    ...
>    return is_speech;
> }
>
> Some questions for the knowledgeable:
> Is speex_preprocess() needed to use VAD?
>
> Can speex_preprocess() be used to detect silent frames if VAD is enabled,
> but not AGC or denoise?
> What does Speex do differently internally for silent frames when VAD is
> enabled?
>
>
> Paul