[Speex-dev] Increasing the speed of speex playback

Bill Cox waywardgeek at gmail.com
Tue Oct 19 20:11:20 PDT 2010


Hi, Jean-Marc, and thanks for the quick reply.  Let me just say I'm a
huge fan of speex, and the work you've done.  I actually barely
understand what I'm reading so far in the source code and
documentation, just enough to understand just how cool the algorithms
are.

LPC10 and MELP allow me to speed up speech with a simple hack on the
decoder frame size.  Playing fewer samples per frame speeds up the
speech, without affecting the excitation.  It works well, but not as
well as I would like.  I've attached a sample of a female voice sped
up with MELPe.  I fully understand basic LPC10.  Simply reducing the
frame size in the decoder is exactly the right way to speed up LPC10
speech without changing the pitch. I would like to figure out how to
apply some of the innovations in CELP to sped-up speech.  Frankly,
this is the limit of my current knowledge, and I am clueless as to how
to apply CELP concepts to high speed playback.
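
To make the frame-size hack concrete, here is a minimal sketch of the idea
in C.  It is not taken from any real LPC10 or MELPe decoder: ToyFrame,
ToySynth, toy_synth_frame and the order-2 filter are all made up for
illustration.  It assumes a vocoder-style decoder that regenerates its own
excitation (pulse train or noise) from the frame parameters, so emitting
fewer samples per frame shortens the speech while the pulse spacing, and
therefore the pitch, stays the same.

```c
#include <assert.h>
#include <stddef.h>

#define LPC_ORDER 2  /* toy order for brevity; LPC10 uses order 10 */

/* Hypothetical per-frame vocoder parameters (illustration only). */
typedef struct {
    float lpc[LPC_ORDER];  /* all-pole predictor coefficients a[k] */
    float gain;            /* excitation gain */
    int   pitch_period;    /* in samples; 0 means unvoiced */
} ToyFrame;

/* Hypothetical synthesizer state carried across frames. */
typedef struct {
    float        mem[LPC_ORDER];  /* previous output samples */
    int          pitch_phase;     /* samples since the last pulse */
    unsigned int noise_seed;      /* state of a simple LCG noise source */
} ToySynth;

/* Synthesize one frame, emitting frame_size / speed samples.
 * The excitation is generated here from the parameters, so cutting the
 * sample count shortens the frame without touching the pitch period.
 * Returns the number of samples written to out. */
int toy_synth_frame(ToySynth *st, const ToyFrame *f,
                    int frame_size, float speed, float *out)
{
    assert(speed > 0.0f);
    int n = (int)(frame_size / speed + 0.5f);

    for (int i = 0; i < n; i++) {
        float exc;
        if (f->pitch_period > 0) {
            /* Voiced: one pulse every pitch_period samples. */
            if (st->pitch_phase >= f->pitch_period)
                st->pitch_phase = 0;
            exc = (st->pitch_phase == 0) ? f->gain : 0.0f;
            st->pitch_phase++;
        } else {
            /* Unvoiced: white noise in [-1, 1) scaled by the gain. */
            st->noise_seed = st->noise_seed * 1103515245u + 12345u;
            exc = f->gain *
                  (((st->noise_seed >> 16) & 0x7fff) / 16384.0f - 1.0f);
        }

        /* All-pole synthesis filter: y[n] = exc + sum a[k] * y[n-1-k]. */
        float y = exc;
        for (int k = 0; k < LPC_ORDER; k++)
            y += f->lpc[k] * st->mem[k];

        /* Shift the filter memory and store the new output. */
        for (int k = LPC_ORDER - 1; k > 0; k--)
            st->mem[k] = st->mem[k - 1];
        st->mem[0] = y;
        out[i] = y;
    }
    return n;
}
```

With a 160-sample frame and speed 2.0 this writes 80 samples per frame,
and the pulses are still one pitch period apart.  A CELP decoder does not
have this property, because its excitation is built from codebook entries
and past decoded samples rather than regenerated from a few parameters.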

Bill

2010/10/19 Jean-Marc Valin <Jean-Marc.Valin at usherbrooke.ca>:
> You're asking the wrong question. The question is not "why does it sound bad
> with Speex?", but "why does it sound good with LPC10 and MELP?". And the
> answer is that both are vocoders. Try dropping frames/subframes with
> anything else (Vorbis, MP3, G.729, u-law, ...) and it'll sound terrible as
> well. The only reason it sounds good with vocoders is because the codec
> parameters are in fact synthesizer parameters that don't have a direct
> connection with the signal.
>
>   Jean-Marc
>
> Bill Cox <waywardgeek at gmail.com> wrote:
>
>> I was able to easily hack in an option to play back at different
>> speeds.  For example, using "speexdec --speed 2.0 file.enc file.wav"
>> plays back encoded file.enc at 2X speed.  What I did was divide
>> st->frameSize and st->subFrameSize by the speedup, and added a
>> SPEEX_SET_SPEED decoder control for the nb_celp decoder.  This
>> produced speech that was 2X faster than the original.
>>
>> However, the quality is very poor.  This is where it gets harder for
>> me, as the quality is impacted by so many parts of the code.  Can
>> anyone guess which part of the decoder is leading to such poor quality
>> when I cut the frame size in half?  This hack works very well in
>> LPC10, and fairly well in MELPe.
>>
>> I've attached two outputs from speex: the decoded playback at normal
>> speed, and the 2X speed version.
>>
>> Thanks,
>> Bill
>>
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: f2x.ogg
Type: audio/ogg
Size: 34216 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20101019/06ebbb7c/attachment-0001.bin 