[Speex-dev] Increasing the speed of speex playback

Bill Cox waywardgeek at gmail.com
Wed Oct 20 06:20:17 PDT 2010


Hi, Steve.  I tried your time_scale_tests program, and it works
well!  Especially for low speed changes, it's the best I've heard so
far.

For high speed increases, there is what sounds like static added to
the sound output.  I've attached two sound samples of high speed
speech, which is a 4X speed up of a popular TTS voice in the blind
community (voxin/Eloquence).  I've sped up the voice with LPC in one
case, and time_scale_tests in the other.  Don't worry that you can't
understand these speech samples - many blind people can, and I can
understand it, just barely, at this speed.

I guess now I need to learn about the algorithm you've used, and see
if I can track down the source of the static.  I've copied two lists
that have blind users who may be interested in very high speed
playback of voices.
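
For list readers who have not seen this kind of algorithm, here is a
minimal sketch of pitch-synchronous speed-up in the general PICOLA
style.  It is not spandsp's code.  The function names, the AMDF pitch
search and the 1 < rate <= 2 restriction are all assumptions made for
illustration.  The idea is to cross-fade two adjacent pitch periods
into one, then copy some samples through unchanged, so the speech gets
shorter while the pitch stays the same.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: estimate the pitch period (in samples) at x by
   minimising the average magnitude difference function (AMDF) over
   candidate lags min_p..max_p.  Reads x[0 .. 2*max_p - 1]. */
static int estimate_period(const int16_t *x, int min_p, int max_p)
{
    int best_p = min_p;
    long best = -1;

    for (int p = min_p; p <= max_p; p++) {
        long acc = 0;
        for (int i = 0; i < max_p; i++)
            acc += labs((long) x[i] - x[i + p]);
        if (best < 0 || acc < best) {
            best = acc;
            best_p = p;
        }
    }
    return best_p;
}

/* Hypothetical sketch: speed up x[0 .. n-1] by `rate` (1 < rate <= 2)
   without changing pitch.  `out` must have room for n samples.
   Returns the number of samples written, or 0 for an unsupported rate. */
size_t speed_up(const int16_t *x, size_t n, int16_t *out,
                double rate, int min_p, int max_p)
{
    size_t in = 0;
    size_t o = 0;

    if (rate <= 1.0 || rate > 2.0)
        return 0;

    while (n - in >= 2 * (size_t) max_p) {
        int p = estimate_period(x + in, min_p, max_p);

        /* Merge two pitch periods into one with a linear cross-fade:
           2*p input samples become p output samples. */
        for (int i = 0; i < p; i++) {
            long a = (long) x[in + i] * (p - i);
            long b = (long) x[in + p + i] * i;
            out[o++] = (int16_t) ((a + b) / p);
        }
        in += 2 * (size_t) p;

        /* Copy L - p samples unchanged, where L = p/(rate - 1), so that
           each pass consumes p + L samples and emits L samples. */
        size_t copy = (size_t) (p / (rate - 1.0)) - (size_t) p;
        if (copy > n - in)
            copy = n - in;
        memcpy(out + o, x + in, copy * sizeof(int16_t));
        o += copy;
        in += copy;
    }

    /* Pass the remaining tail through untouched. */
    memcpy(out + o, x + in, (n - in) * sizeof(int16_t));
    o += n - in;
    return o;
}
```

With 8 kHz speech, something like speed_up(in, n, out, 1.5, 40, 160)
would search pitch periods between 5 ms and 20 ms.  Rates above 2X,
such as the 4X samples attached here, would need more than two periods
merged per pass or several passes, which this sketch does not attempt.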

Bill

On Tue, Oct 19, 2010 at 9:37 PM, Steve Underwood <steveu at coppice.org> wrote:
>  Hi Bill,
>
> Any attempt to alter speed by simply inserting or dropping samples produces poor
> results. Even if you can get it to sound smooth, the resulting pitch
> shift is horrible. You really need to use a transform that alters speed
> smoothly, while maintaining the original pitch of the voice. If you look
> in my spandsp library you will find a module which does exactly this,
> using an algorithm called PICOLA. You can speed up or slow down a voice
> in fine speed steps using this module, and the resulting voice is almost
> the same quality as the original. There is a test program for it, which
> should function as an example of how you need to call the library to
> initialise and use it.
>
> Steve
>
>
> On 10/20/2010 05:21 AM, Bill Cox wrote:
>> Here's one clue about whatever is causing the low quality speech.
>> Speech sounds terrible at 1.01X faster, and it sounds excellent at
>> normal speed (1.0X).  So, the main problem is something that breaks
>> with any change in frame size in the decoder.  Any idea what that
>> might be?
>>
>> Thanks,
>> Bill
>>
>> On Tue, Oct 19, 2010 at 5:14 PM, Bill Cox <waywardgeek at gmail.com> wrote:
>>> I was able to easily hack in an option to play back at different
>>> speeds.  For example, using "speexdec --speed 2.0 file.enc file.wav"
>>> plays back encoded file.enc at 2X speed.  What I did was divide
>>> st->frameSize and st->subFrameSize by the speedup, and added a
>>> SPEEX_SET_SPEED decoder control for the nb_celp decoder.  This
>>> produced speech that was 2X faster than the original.
>>>
>>> However, the quality is very poor.  This is where it gets harder for
>>> me, as the quality is impacted by so many parts of the code.  Can
>>> anyone guess which part of the decoder is leading to such poor quality
>>> when I cut the frame size in half?  This hack works very well in
>>> LPC10, and fairly well in MELPe.
>>>
>>> I've attached two outputs from speex: the decoded playback at normal
>>> speed, and the 2X speed version.
>>>
>>> Thanks,
>>> Bill
>>>
>> _______________________________________________
>> Speex-dev mailing list
>> Speex-dev at xiph.org
>> http://lists.xiph.org/mailman/listinfo/speex-dev
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test1_4x_lpc.ogg
Type: audio/ogg
Size: 25076 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20101020/9382097d/attachment-0002.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test1_4x_time_scale.ogg
Type: audio/ogg
Size: 25396 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20101020/9382097d/attachment-0003.bin 

