[Speex-dev] preprocessor VAD only distinguishes between silence and not silence
Tom Grandgent
tgrand at canvaslink.com
Mon Dec 15 07:32:16 PST 2008
Jesus,
Unfortunately, FFT and magic algorithms don't work (yet?). You
might want to try this if you're not satisfied with Speex VAD:
http://lists.xiph.org/pipermail/speex-dev/2008-August/006860.html
It won't perform any miracles, but I think it works pretty well
and is easy to tweak.
Tom
>---- Original Message ----
>From: jmorion at toomeeting.com
>To: speex-dev at xiph.org
>Subject: Re: [Speex-dev] preprocessor VAD only distinguishes between
>silence and not silence
>Date: Mon, 15 Dec 2008 12:41:53 +0100
>
>>Hi, I would like to know if someone has experienced the same problem
>>and whether I'm using the preprocessor VAD correctly.
>>
>>A Voice Activity Detector is expected to detect human voice (using FFT
>>and magic algorithms), but it only works as an Activity Detector: it
>>doesn't distinguish between voice and knocking on the table.
>>
>>Is VAD performance a taboo subject?
>>
>>Thank you.
>>
>>
>>
>>jesus wrote:
>>> Hello,
>>>
>>> In my project I'm using Speex 1.2rc1, and the preprocessor VAD seems
>>> to only separate completely silent frames from frames that are not
>>> completely silent.
>>>
>>> In the Speex Manual, you can read: "The voice activity detector (VAD)
>>> provided by the preprocessor is more advanced than the one directly
>>> provided in the codec."
>>>
>>> But if you go to the source code, preprocess.c line 995 says:
>>> "/* FIXME: This VAD is a kludge */"
>>>
>>> I've seen in the roadmap that you are testing a new VAD, but I'm not
>>> sure whether I'm doing something wrong with the current one.
>>>
>>> Here is the code of the compressor:
>>>
>>>
>>>
>>> if (inicializado == false)
>>> {
>>>     bits = new SpeexBits;
>>>     speex_bits_init(bits);
>>>     enc_state = speex_encoder_init(&speex_nb_mode);
>>>
>>>     // get the frame size
>>>     speex_encoder_ctl(enc_state, SPEEX_GET_FRAME_SIZE, &frame_size);
>>>
>>>     // configure parameters
>>>     int complexity = 5;
>>>     speex_encoder_ctl(enc_state, SPEEX_SET_COMPLEXITY, &complexity);
>>>
>>>     int samplingrate = 8000;
>>>     speex_encoder_ctl(enc_state, SPEEX_SET_SAMPLING_RATE, &samplingrate);
>>>
>>>     int quality = 8; // relative quality, from 0 to 10
>>>     speex_encoder_ctl(enc_state, SPEEX_SET_QUALITY, &quality);
>>>
>>>     int dtx = 0;
>>>     speex_encoder_ctl(enc_state, SPEEX_SET_DTX, &dtx);
>>>
>>>     int vbr = 0;
>>>     speex_encoder_ctl(enc_state, SPEEX_SET_VBR, &vbr);
>>>
>>>     // PREPROCESSOR
>>>     pre_state = speex_preprocess_state_init(frame_size, samplingrate);
>>>
>>>     int denoise = 1;
>>>     speex_preprocess_ctl(pre_state, SPEEX_PREPROCESS_SET_DENOISE, &denoise);
>>>
>>>     int pvad = 1;
>>>     speex_preprocess_ctl(pre_state, SPEEX_PREPROCESS_SET_VAD, &pvad);
>>>
>>>     int agc = 1;
>>>     speex_preprocess_ctl(pre_state, SPEEX_PREPROCESS_SET_AGC, &agc);
>>>
>>>     inicializado = true;
>>> }
>>>
>>> __try {
>>>     tdestino = 0;
>>>     int frame_size;
>>>     speex_encoder_ctl(enc_state, SPEEX_GET_FRAME_SIZE, &frame_size);
>>>
>>>     int nbloques_sample = this->torigen/frame_size/2;
>>>
>>>     speex_bits_reset(bits);
>>>
>>>     int voces = 0;
>>>     for(int bloque=0;bloque<nbloques_sample;bloque++)
>>>     {
>>>         // the preprocessor returns 1 or 0 depending on whether the
>>>         // frame is voice or not (per the VAD), so I use it to find
>>>         // out whether it is silence
>>>         voces += speex_preprocess(pre_state,
>>>             ((short*)origen+(bloque*frame_size)), NULL);
>>>
>>>         // feed in the data to compress
>>>         int e = speex_encode_int(enc_state,
>>>             ((short*)origen+(bloque*frame_size)), bits);
>>>     }
>>>
>>>     // extract the compressed data
>>>     tdestino = speex_bits_write(bits, destino, 20000);
>>>     if(voces == 0) // there was no voice frame in the block
>>>         es_silencio = true;
>>>     else
>>>         es_silencio = false;
>>>
>>>
>>> Thank you.
>>>
>>> _______________________________________________
>>> Speex-dev mailing list
>>> Speex-dev at xiph.org
>>> http://lists.xiph.org/mailman/listinfo/speex-dev
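
The per-block logic in the quoted encoder loop (split the buffer into whole frames, ask the detector about each frame, and call the block silence only if no frame was flagged) can be sketched on its own as below. This is an illustration under assumptions, not part of the original post: classify_block and BlockResult are hypothetical names, the input size is assumed to be in bytes with 16-bit samples, and is_voice is a stand-in for whatever per-frame detector is used (for example, the value returned by the preprocessor call in the quoted code).

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical result type for the example.
struct BlockResult {
    std::size_t frames;  // whole frames found in the buffer
    std::size_t voiced;  // frames the detector flagged
    bool is_silence;     // true only if no frame was flagged
};

// Splits a buffer of 16-bit samples into whole frames and counts the
// frames flagged by `is_voice` (a stand-in for the real per-frame VAD).
BlockResult classify_block(
    const std::int16_t* samples,
    std::size_t size_bytes,
    std::size_t frame_size,
    const std::function<bool(const std::int16_t*, std::size_t)>& is_voice) {
    BlockResult r{0, 0, true};
    if (frame_size == 0) {
        return r;  // nothing sensible to do without a frame size
    }

    // Each sample takes 2 bytes, so bytes / 2 / frame_size gives the
    // number of whole frames; a trailing partial frame is ignored.
    r.frames = size_bytes / sizeof(std::int16_t) / frame_size;

    for (std::size_t i = 0; i < r.frames; ++i) {
        if (is_voice(samples + i * frame_size, frame_size)) {
            ++r.voiced;
        }
    }
    r.is_silence = (r.voiced == 0);
    return r;
}
```

With this rule a single flagged frame marks the whole block as non-silent, so any detector that fires on short loud events will also mark the surrounding block as voice.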