[Speex-dev] New needs for VoiceAI preprocessing by old tools

Stuart Naylor stuartiannaylor at outlook.com
Thu Aug 20 07:07:47 UTC 2020


Xiph has done some really good work, but since Speex things have seemed quite quiet.
Everyone from devs to hobbyists has a huge battle with VoiceAI audio processing. Even where tools are available, extremely similar routines are split across separate libs, so much of the processing is duplicated and the load goes up.

The Speex echo ALSA plugins just don’t seem to work, whilst repos using the Speex libs do.

I actually use the Speex AGC ALSA plugin, as it’s a great tool, and then run echo cancellation via the above.
That is coupled to WebRTC_VAD and a Python-based MFCC routine.
I have given up hope on beamforming and DOA: I am running so many extremely similar FFT routines concurrently in different process spaces that my load makes me believe there isn’t a chance.

All of them have extremely similar frame-based spectral analysis that collates and processes over different windows for different purposes, but much of the input processing is an almost identical heavy-load FFT that for some reason we duplicate, because the Linux landscape is one of scattered audio processing libs.
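
To make the duplication point concrete, here is a minimal Python sketch (not taken from any of the libs mentioned) of a single framing and FFT pass whose power spectrum is handed to more than one consumer. The function names, the two example consumers (a speech-band energy as a crude VAD-style feature, and a spectral centroid), the 16 kHz sample rate and the 512/256 frame and hop sizes are all assumptions for illustration. The pure-Python FFT is only there to keep the sketch self-contained; a real chain would use an optimised FFT.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    """Hann-window one frame and return the power in bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    return [abs(c) ** 2 for c in fft(windowed)[: n // 2 + 1]]


# Hypothetical consumers: each takes the already-computed power spectrum
# instead of running its own window + FFT.
def band_energy(power, sample_rate, lo_hz, hi_hz):
    n_fft = 2 * (len(power) - 1)
    lo = int(lo_hz * n_fft / sample_rate)
    hi = int(hi_hz * n_fft / sample_rate)
    return sum(power[lo:hi + 1])


def spectral_centroid(power, sample_rate):
    n_fft = 2 * (len(power) - 1)
    total = sum(power)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n_fft * p for k, p in enumerate(power)) / total


def analyse(samples, sample_rate=16000, frame_len=512, hop=256):
    """Run one FFT per frame and share the result between consumers."""
    results = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        power = power_spectrum(samples[start:start + frame_len])
        results.append({
            "speech_band_energy": band_energy(power, sample_rate, 300, 3400),
            "centroid_hz": spectral_centroid(power, sample_rate),
        })
    return results
```

Calling `analyse(samples)` on a list of float samples returns one dict per frame. Further consumers would take the same `power` list, so the window and FFT cost is paid once per frame rather than once per lib.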

Linux currently has a lot going on in open source ASR, TTS and TensorFlow frameworks, but what we have available for audio pre-processing is scattered individual segments of the audio chain, which makes for a seriously inefficient chain.
These are old tools, but the whole chain to MFCC output just doesn’t exist.
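
For reference, the last stretch of that chain is small once a per-frame power spectrum exists. The following is a rough Python sketch of the textbook steps (triangular mel filterbank, log, DCT-II), not the routine I actually run. The function names, the 26 filters and 13 coefficients, and the unnormalised DCT are assumptions chosen for illustration.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters spaced evenly on the mel scale, one row per filter."""
    n_fft = 2 * (n_bins - 1)
    top = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge points, expressed as fractional FFT bin positions
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for k in range(n_bins):
            if left < k <= centre:
                row.append((k - left) / (centre - left))    # rising slope
            elif centre < k < right:
                row.append((right - k) / (right - centre))  # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc_from_power(power, bank, n_coeffs=13):
    """Log mel energies followed by an unnormalised DCT-II."""
    log_mel = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
               for row in bank]
    n = len(log_mel)
    return [sum(v * math.cos(math.pi * c * (j + 0.5) / n)
                for j, v in enumerate(log_mel))
            for c in range(n_coeffs)]
```

The filterbank only depends on the FFT size and sample rate, so it would be built once, for example `bank = mel_filterbank(26, 257, 16000)` for a 512-point FFT at 16 kHz, and then `mfcc_from_power(power, bank)` is called per frame on a spectrum that other stages could also be reading.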


