[Speex-dev] PLC in Speex-

Thu Oct 28 16:32:20 PDT 2010

I've been working on improving packet loss concealment (PLC) in my VoIP
client.  I hadn't realized Speex already performs some PLC until I looked
at nb_decode().

There are two things I'd like to do, and I think both will require some
modification to nb_decode() and nb_decode_lost().

- I'd like to have my frames overlap by some number of samples.  This is
intended to reduce the amount of fill/stretch that has to happen when a
packet is missing.  The overlap could probably be an entire subframe.
- I'd like to have nb_decode_lost() use information about future packets
that may already be waiting, if any are available, so that the concealed
signal stretches out and blends into the next packet's signal to fill the
gap, rather than just extending the last packet and then switching to the
next packet when that packet is decoded.
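
To make the "blends into the next packet" idea concrete, here is a minimal
sketch of a linear cross-fade over an overlap region.  It works on decoded
16-bit PCM as a post-processing step, not inside nb_decode_lost(), and the
function name and buffer layout are hypothetical (nothing here is part of
the Speex API).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Speex: cross-fade n samples from `tail`
 * (the end of the concealed/extended signal) into `head` (the start of the
 * next decoded frame), writing the blended samples to `out`. */
void crossfade_overlap(const int16_t *tail, const int16_t *head,
                       int16_t *out, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        /* Weight of the incoming frame rises linearly across the overlap. */
        float w = (float)(i + 1) / (float)(n + 1);
        float mixed = (1.0f - w) * (float)tail[i] + w * (float)head[i];
        /* Round to nearest; a convex mix of two int16 values stays in range. */
        out[i] = (int16_t)(mixed < 0.0f ? mixed - 0.5f : mixed + 0.5f);
    }
}
```

With n equal to the overlap length, the output starts close to the old
signal and ends close to the new one, so there is no hard switch at the
packet boundary.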

Before I dive into this, I thought I'd do a sanity check with this list.

I suspect getting frame overlap to work internally to Speex will not be too
difficult.  If I shift both the current subframe's excitation and the
excitation buffer from the last subframe by N samples (where N is overlap)
and start the IIR filter N samples into the first subframe, I think this
will work smoothly?  I'm assuming that the discarded part of the excitation
is roughly similar to the last segment of the previous subframe's
excitation, because they were generated from the same raw samples?  If I
overlap by an entire subframe this all gets a lot easier I suppose.  Maybe
that's what I should do.

Filling a single-subframe gap when you have data on both sides of the gap
will be more difficult.  My idea had been to just average the old and new
excitations and interpolate the LPC parameters over the gap.  This seems
like a bad idea if you end up blending a vowel excitation with a consonant
excitation, and I suspect interpolating excitations will not produce good
results in general.  Perhaps the thing to do is to use only the adaptive
codebook excitation and interpolate the LPC parameters?  It may also be the
case that this doesn't improve audio quality much over the current
nb_decode_lost() implementation: all that would change is that the LPC
parameters and pitch gains are interpolated with the next packet, instead
of just duplicating the previous subframe.
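
A rough sketch of the parameter interpolation, with one substitution worth
naming: it blends line spectral pairs (LSPs) rather than raw LPC
coefficients.  A weighted average of two ordered LSP sets stays ordered, so
the resulting filter stays stable, which is not guaranteed when averaging
LPC coefficients directly.  The function name and the use of float are
assumptions for illustration; this is not Speex's internal routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Speex API: linearly blend the LSP
 * vector of the last good frame with that of the next received frame.
 * t = 0 returns prev_lsp, t = 1 returns next_lsp; for a single lost
 * subframe in the middle of the gap, t = 0.5 is the obvious choice. */
void interp_lsp_gap(const float *prev_lsp, const float *next_lsp,
                    float *out_lsp, size_t order, float t)
{
    assert(t >= 0.0f && t <= 1.0f);
    for (size_t i = 0; i < order; i++)
        out_lsp[i] = (1.0f - t) * prev_lsp[i] + t * next_lsp[i];
}
```

The same weighting could be applied to the pitch gain.  The blended LSPs
would still need converting back to LPC coefficients before synthesis.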

Finally, I wonder if there's a way to do time stretching in a coherent way
inside Speex, or if this needs to happen to the output signal as a
post-processing step.  It seems like there's no good way to extend the
codebook excitation signal in time?  But I don't understand where the
codebook comes from in the first place, so maybe it's possible to regenerate
longer versions of each excitation signal?
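
One piece that does look extendable, as a sketch: assuming the adaptive
codebook contribution is simply the past excitation delayed by the pitch lag
and scaled by the pitch gain (my reading of how CELP codecs work in general,
not something I have verified in nb_decode()), it can be continued for any
number of samples by feeding it back on itself.  The function name, buffer
layout and float samples below are hypothetical, not Speex's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Speex: extend an excitation buffer by
 * repeating it at the pitch period.  exc[0 .. history-1] holds past
 * excitation; exc[history .. history+count-1] is filled in.  Each new
 * sample is the sample one pitch lag earlier, scaled by `gain`, so the
 * extension decays when gain < 1. */
void extend_excitation(float *exc, size_t history, size_t count,
                       size_t pitch_lag, float gain)
{
    assert(pitch_lag > 0 && pitch_lag <= history);
    for (size_t i = history; i < history + count; i++)
        exc[i] = gain * exc[i - pitch_lag];
}
```

This only stretches the periodic part of the signal.  The fixed codebook
contribution has no such structure, which may be why stretching the decoded
output afterwards is the easier route.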

Stuart