[Speex-dev] Increasing the speed of speex playback
waywardgeek at gmail.com
Tue Oct 19 19:50:10 PDT 2010
Hi, Steve. I agree with what you've said. I'm interested in large
changes in speech speed, beyond a 2X speed-up. I personally
listen to books at around 3.2X speed, though that is considered
slow by some blind hackers. In my experience so far (which is
limited), the fundamental problem with speeding up speech beyond about
2X is that glottal pulse events determine the pitch, and their rate
cannot change without distorting the voice, while other aspects of
speech need to change in proportion to speech speed. This is why I've been
looking into LPC-based algorithms, which extract the glottal
excitation from the voice signal and resynthesise it in the decoder.
So far, I've had much better luck with LPC-based algorithms than
short-time FFT-based algorithms for large speed-up factors.
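To make the idea concrete, here is a minimal toy sketch in Python of
LPC-style resynthesis where each frame is rendered shorter by the
speed factor while the spacing of the excitation pulses (the pitch
period) is left alone. It is not code from Speex, LPC10, or MELPe.
The function names, the per-frame (pitch_period, lpc) layout, and the
pure impulse-train excitation are all assumptions for illustration; a
real codec also handles unvoiced frames, gain, and parameter
interpolation.

```python
def pulse_train(n_samples, pitch_period, phase=0):
    """Impulse excitation: one unit pulse every pitch_period samples.

    phase is the offset of the first pulse in this frame; the returned
    phase is the offset of the first pulse in the next frame, so the
    pulse spacing stays constant across frame boundaries.
    """
    out = [0.0] * n_samples
    i = phase
    while i < n_samples:
        out[i] = 1.0
        i += pitch_period
    return out, i - n_samples


def all_pole_filter(excitation, lpc, state):
    """Synthesis filter y[n] = x[n] - sum(a[k] * y[n-1-k]).

    state holds the most recent outputs, newest first.
    """
    out = []
    for x in excitation:
        y = x - sum(a * s for a, s in zip(lpc, state))
        state = [y] + state[:-1]
        out.append(y)
    return out, state


def synthesize(frames, frame_len, speed):
    """Render each (pitch_period, lpc) frame with frame_len / speed
    samples. Only the frame duration is scaled; the pitch period is
    used unchanged, so the pitch does not move with the speed."""
    out_len = max(1, round(frame_len / speed))
    state = [0.0] * len(frames[0][1])
    phase = 0
    output = []
    for pitch_period, lpc in frames:
        exc, phase = pulse_train(out_len, pitch_period, phase)
        y, state = all_pole_filter(exc, lpc, state)
        output.extend(y)
    return output
```

Calling synthesize(frames, 160, 2.0) on ten identical frames gives
half as many samples as synthesize(frames, 160, 1.0), with the pulses
still one pitch period apart in both outputs.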
That's all I have for now... I'm looking forward to learning about
the PICOLA algorithm. I'll read as much as I can find on it tomorrow.
On Tue, Oct 19, 2010 at 9:37 PM, Steve Underwood <steveu at coppice.org> wrote:
> Hi Bill,
> Any attempt to alter speed by simple insertion or dropping produces poor
> results. Even if you can get it to sound smooth, the resulting pitch
> shift is horrible. You really need to use a transform that alters speed
> smoothly, while maintaining the original pitch of the voice. If you look
> in my spandsp library you will find a module which does exactly this,
> using an algorithm called PICOLA. You can speed up or slow down a voice
> in fine speed steps using this module, and the resulting voice is almost
> the same quality as the original. There is a test program for it, which
> should function as an example of how you need to call the library to
> initialise and use it.
> On 10/20/2010 05:21 AM, Bill Cox wrote:
>> Here's one clue about whatever is causing the low quality speech.
>> Speech sounds terrible at 1.01X faster, and it sounds excellent at
>> normal speed (1.0X). So, the main problem is something that breaks
>> with any change in frame size in the decoder. Any idea what that
>> might be?
>> On Tue, Oct 19, 2010 at 5:14 PM, Bill Cox <waywardgeek at gmail.com> wrote:
>>> I was able to easily hack in an option to play back at different
>>> speeds. For example, using "speexdec --speed 2.0 file.enc file.wav"
>>> plays back encoded file.enc at 2X speed. What I did was divide
>>> st->frameSize and st->subFrameSize by the speedup, and add a
>>> SPEEX_SET_SPEED decoder control for the nb_celp decoder. This
>>> produced speech that was 2X faster than the original.
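The frame-size arithmetic behind this hack can be worked through in a
few lines of Python. This is only a sketch: it assumes the usual
narrowband Speex sizes (160-sample frames and 40-sample subframes at
8 kHz) and assumes the divided sizes are truncated to integers, which
the message above does not state. The helper names are hypothetical.
It is not a diagnosis of the quality problem, just a way to see which
speed factors keep the frame and subframe sizes consistent.

```python
NB_FRAME_SIZE = 160     # narrowband Speex frame: 20 ms at 8 kHz
NB_SUBFRAME_SIZE = 40   # four subframes per frame


def scaled_sizes(speed):
    """Divide both sizes by speed, truncating to integers.

    Truncation is an assumption about how the hack rounds."""
    frame = int(NB_FRAME_SIZE / speed)
    subframe = int(NB_SUBFRAME_SIZE / speed)
    return frame, subframe


def report(speed):
    """Check whether the scaled subframes still tile the scaled frame."""
    frame, subframe = scaled_sizes(speed)
    n_sub = NB_FRAME_SIZE // NB_SUBFRAME_SIZE
    covered = n_sub * subframe
    return {
        "speed": speed,
        "frame": frame,
        "subframe": subframe,
        "covered": covered,
        "consistent": covered == frame,
    }
```

Under these assumptions, report(2.0) gives a frame of 80 and a
subframe of 20, which still tile exactly, while report(1.01) gives a
frame of 158 and a subframe of 39, so four subframes cover only 156
samples.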
>>> However, the quality is very poor. This is where it gets harder for
>>> me, as the quality is impacted by so many parts of the code. Can
>>> anyone guess which part of the decoder is leading to such poor quality
>>> when I cut the frame size in half? This hack works very well in
>>> LPC10, and fairly well in MELPe.
>>> I've attached two outputs from speex: the decoded playback at normal
>>> speed, and the 2X speed version.
>> Speex-dev mailing list
>> Speex-dev at xiph.org