[Speex-dev] Voice activity detection
Jean-Marc Valin
jean-marc.valin at usherbrooke.ca
Fri Feb 15 13:09:41 PST 2008
> Anyway, my request is, can you build in a pre and post buffer into the
> VAD? In mine, if I detect voice any time between now and say a quarter
> second later, I start sending, and then I wait a half second or whatever
> after I stop detecting. You pretty much have to have this, or people
> start getting anxious talking over an internet stream. They have to
> enunciate expressions like "ya probably" because the ya isn't detected,
> only the probably. By sending a bit of padding around the detection, it
> also prevents the detector from dropping out mid-sentence. It takes it
> from being a screaming contest over a walkie talkie, to a normal
> telephone conversation.
>
> You might be reluctant to do this, because you have to add in some state
> information instead of just focusing on the current buffer, but the
> quality improvement is enormous. I'd just like to be able to pass a pre
> and post value to the VAD in milliseconds, defaulting to either 0 or
> values similar to what I quoted above. And I realize this can add some
> delay, but even detecting a single extra syllable makes a world of
> difference.
If you'd like to buffer speech, just do it. There's no reason the
buffer needs to live in the VAD itself.
Jean-Marc
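
For illustration, here is a minimal sketch of what buffering outside the VAD could look like. It is not code from Speex or from either poster. The class name PaddedGate, its parameters (frame_ms, pre_ms, post_ms) and the push() method are all hypothetical, and the 250 ms / 500 ms defaults simply echo the "quarter second" and "half second" figures quoted above. The is_voice flag is assumed to be the per-frame decision from whatever detector is in use.

```python
from collections import deque


class PaddedGate:
    """Hypothetical wrapper: pads per-frame VAD decisions with pre-roll and hangover."""

    def __init__(self, frame_ms=20, pre_ms=250, post_ms=500):
        # Ring buffer of the most recent frames that were NOT transmitted.
        self.pre = deque(maxlen=pre_ms // frame_ms)
        # Number of frames to keep sending after the last voiced frame.
        self.post_frames = post_ms // frame_ms
        self.hang = 0

    def push(self, frame, is_voice):
        """Return the list of frames to transmit for this input frame."""
        if is_voice:
            # Flush the held-back pre-roll first, then the voiced frame itself.
            out = list(self.pre) + [frame]
            self.pre.clear()
            self.hang = self.post_frames
            return out
        if self.hang > 0:
            # Hangover period: keep transmitting after the detector drops out.
            self.hang -= 1
            return [frame]
        # Silence: hold the frame in case voice starts shortly afterwards.
        self.pre.append(frame)
        return []
```

Each captured frame would be passed to push() together with the detector's decision, and whatever comes back gets encoded and sent. All state lives in the caller's wrapper, so the detector itself only ever looks at the current frame. The pre-roll frames go out late (by up to pre_ms), which is the delay the original request already acknowledges.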