[speex-dev] Psycho Acoustic models in Speech Coding
jean-marc.valin at hermes.usherb.ca
Tue Jun 3 08:42:42 PDT 2003
I've heard that some people have tried psychoacoustic models on CELP
and didn't get much improvement for the complexity they added. I'd still
like to try the Vorbis model in Speex at some point (anyone interested
in doing that?) to see if there is enough improvement to justify
offering it as an option.
About modifying the LP filter without making it unstable, I have a
method that usually works quite well: transform the filter to the
spectral domain (the squared magnitude response of 1/A(z)), modify it
in the frequency domain, then transform back to LP filter coefficients.
Transforming the filter coefficients to the spectral domain is as easy
as 1/|fft(a)|^2, where a holds the coefficients of A(z) (using
zero-padding to get enough resolution). To get back to filter
coefficients, you first take the inverse Fourier transform of the
spectral density, which gives you the autocorrelation, and then apply
Levinson-Durbin to get the new filter coefficients.
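The round trip described above can be sketched in a few lines. This is
only an illustration in Python with NumPy, not Speex code: the function
names, the FFT size of 512 and the shaping function passed in are all
assumptions made for the example. It relies on the modified spectrum
staying strictly positive and symmetric, so that its inverse transform
is a valid autocorrelation.

```python
import numpy as np

def lpc_to_power_spectrum(a, nfft=512):
    """Squared magnitude response of 1/A(z) on nfft bins (a is zero-padded)."""
    return 1.0 / np.abs(np.fft.fft(a, nfft)) ** 2

def levinson_durbin(r, order):
    """Return a = [1, a1, ..., a_order] for A(z) from autocorrelation r[0..order]."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current prediction error
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def modify_lpc(a, shape_fn, nfft=512):
    """LP coefficients -> power spectrum -> shape_fn -> autocorrelation -> LP."""
    order = len(a) - 1
    power = shape_fn(lpc_to_power_spectrum(a, nfft))  # must stay > 0 and symmetric
    r = np.fft.ifft(power).real                       # autocorrelation sequence
    new_a, _ = levinson_durbin(r[:order + 1], order)
    return new_a

if __name__ == "__main__":
    # Hypothetical 2nd-order filter with poles at radius 0.9, angle +/- pi/4
    a = np.array([1.0, -2 * 0.9 * np.cos(np.pi / 4), 0.81])
    # Example modification: compress the spectral dynamic range
    shaped = modify_lpc(a, lambda p: p ** 0.5)
    print(shaped, np.abs(np.roots(shaped)))
```

With an identity shaping function the original coefficients should come
back (up to the resolution of the FFT), and any positive spectrum should
yield a filter whose roots lie inside the unit circle, which is the
point of going through the autocorrelation.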
On Tue, 2003-06-03 at 05:18, Tom Backstrom wrote:
> (This is almost off topic, but anyway...)
> It is surprising how little research effort has been put into
> psychoacoustic models for CELP. The basic problem is that it is not
> easy to alter the LP model without destroying the minimum-phase
> property (i.e. the stability of the predictor). That leaves us with
> psychoacoustic modelling of the noise part only. However, my own
> research is in constrained LP models that address just this problem.
> Unfortunately, applications are still at least a couple of years away.
> Christian Buchner wrote:
> > > Does anyone have an idea about the possibility to apply psychoacoustic
> > > models as the ones in mp3 or AAC to a CELP coder? Thanks!
> > > /Pontus
> > This is (sort of) done in the decoder with the optional perceptual
> > filtering. Speex tries to shape the noise so that it sounds more
> > pleasant, I believe.
> > To really use perceptual coding, one would require fine granular
> > control over quantization noise in various spectral bands. Speex
> > is not a subband coder and has no such control. But it already does
> > a better job than MP3 and other codecs at these bitrates without
> > using a PSY model.
> > Besides, those PSY models are mostly FFT based, which would greatly
> > increase the encoder complexity. We don't really want that. Do we ;)
> > Christian
Jean-Marc Valin, M.Sc.A.
Université de Sherbrooke, Québec, Canada