[speex-dev] Removing silence at the start and end of sample encoded

Asger Kunuk Alstrup alstrup at sophusmedical.dk
Fri Jan 9 14:37:46 PST 2004



Hi,

Speex is great!

We are using it to compress hundreds of megabytes of speech for use in our
application that trains people in resuscitation. The previous version of our
product used Ogg Vorbis, but after switching to Speex, we achieve fantastic
compression, while retaining super quality. That allows us to cram more
translated versions of the software onto each CD-ROM, making everything better.

However, we have a small feature request for speexenc that would make things
perfect for us: support for trimming away the start and end of the sound during
the encoding process.

This feature request consists of two things:

1) support for automatic detection and trimming of silence at the start and end
of the sound.

2) support for intentionally skipping the very first and very last parts of the
input WAV, even if it is not silence.
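
A minimal sketch of these two operations, assuming the input has already been
decoded to a list of signed 16-bit mono samples. This is not speexenc code: the
function names, the amplitude threshold of 500 and the 10 ms frame size are all
hypothetical choices made for illustration.

```python
def skip_edges(samples, rate, skip_start_ms, skip_end_ms):
    """Drop a fixed number of milliseconds from each end, silent or not."""
    start = rate * skip_start_ms // 1000
    end = len(samples) - rate * skip_end_ms // 1000
    # max() keeps the slice empty (not reversed) if the skips overlap.
    return samples[start:max(start, end)]


def trim_silence(samples, rate, threshold=500, frame_ms=10):
    """Drop leading and trailing frames whose peak is below the threshold.

    threshold is an assumed peak amplitude for 16-bit samples; a real
    recording would need it tuned to the noise floor.
    """
    frame = max(1, rate * frame_ms // 1000)
    # Start offsets of every frame that contains at least one loud sample.
    loud = [
        i
        for i in range(0, len(samples), frame)
        if max(abs(s) for s in samples[i:i + frame]) >= threshold
    ]
    if not loud:
        return []
    # Keep everything from the first loud frame through the last loud frame.
    return samples[loud[0]:loud[-1] + frame]
```

The order matters in this sketch: skip_edges would run first, so that the key
clicks are gone before trim_silence looks for the first and last loud frames.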

This is because we record a lot of small sentences using a recording tool on the
PC - we are talking hundreds of sentences. Every time the actor records a sound,
he has to press space first on the keyboard, wait a bit, read the text, and then
press space again to stop recording when he is done reading it. Now, we can
often hear the release of the space key at the start of the sample, and the
pressing of space bar at the end of the sample.

So, our samples all look like this:

  "Noise---Silence---Noise-speech-with-wrong-gain---Silence---Noise"

We would like to automatically get this after feeding the WAVE file to speexenc:

  "Speech-without-noise-and-correct-gain"

with the leading and trailing noise and silence trimmed out. Speexenc already
supports automatic gain control and noise reduction, so all we need is the
trimming of both the noise and the silence at the start and end. Of course, the
silence part varies in length from sample to sample, but we can cap the noise at
X milliseconds.

As it is today, we have to manually edit these samples to remove the noise, and
then Cooledit can batch-process the silence out for us. After that, we use
speexenc to get the final sample, but it should be possible to do this in one
operation, making everything simpler.
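
Until something like this exists in speexenc, one way to get close to a single
step might be a small pre-processing script that rewrites each WAV before it is
encoded. The sketch below assumes 16-bit mono PCM input and uses only the
Python standard library; trim_wav and its trim argument are hypothetical names,
and trim can be any function that takes (samples, rate) and returns the samples
to keep.

```python
import array
import sys
import wave


def trim_wav(src_path, dst_path, trim):
    """Read a 16-bit mono WAV, apply trim(samples, rate), write the result.

    Returns the number of samples written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        if params.nchannels != 1 or params.sampwidth != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        samples = array.array("h", src.readframes(params.nframes))
    # WAV data is little-endian; swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()

    kept = array.array("h", trim(samples.tolist(), params.framerate))
    if sys.byteorder == "big":
        kept.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(params.framerate)
        dst.writeframes(kept.tobytes())
    return len(kept)
```

The trimmed file could then be passed to speexenc as usual, so the manual
editing and the batch silence removal collapse into one scripted step.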

If you cannot implement this for us, maybe you could give a few hints as to
where we should look to implement this feature ourselves? We are using the CVS
version of Speex.

Thanks in advance,

Asger Ottar Alstrup
