[vorbis-dev] Using the vorbis psychoacoustics model

Fri Jul 7 16:30:43 PDT 2000

> From: Takehiro Tominaga <tominaga at isoternet.org>
> Date: Thu, 06 Jul 2000 09:35:06 +0900
> 
> >>>>> "N" == Nyarlathotep  <mythos at zxmail.com> writes:
> 
>     N> mmmm Why aren't you using lames Gpsycho???
> 
> because a part of GPSYCHO is patented.
> ---
> tominaga at isoternet.org // may the source be with you!
> 

I believe the only part of GPSYCHO that is patented is the formula for
estimating the tonality (for determining whether a masker is noise-like or
tonal).  You need this information to determine the amount of masking,
since a noise-like signal generates more masking.
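
To make the role of tonality concrete, here is a rough sketch of how a
tonality index is typically used once you have it.  The index itself is
simply taken as an input (the patented estimation formula is not
reproduced here), and the function name and the 18 dB / 6 dB offsets are
illustrative assumptions, not values taken from gpsycho or Vorbis.

```python
def masking_threshold_db(masker_db, tonality,
                         tone_offset_db=18.0, noise_offset_db=6.0):
    """Return an estimated masking threshold in dB for one masker.

    tonality is assumed to lie in [0, 1]: 0 = noise-like, 1 = tonal.
    The two offsets are illustrative placeholders (assumptions).
    """
    if not 0.0 <= tonality <= 1.0:
        raise ValueError("tonality must be between 0 and 1")
    # Interpolate the offset: a tonal masker gets the larger offset,
    # so its threshold sits further below the masker level.
    offset_db = tonality * tone_offset_db + (1.0 - tonality) * noise_offset_db
    return masker_db - offset_db


noise_like = masking_threshold_db(60.0, 0.0)
tonal = masking_threshold_db(60.0, 1.0)
print(noise_like, tonal)
```

With these placeholder offsets the noise-like masker yields the higher
threshold, i.e. it masks more, which is the effect described above.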

gpsycho is also a partition-band model.  That is, it computes
maskings in (about) 64 bands, each about 0.35 Bark wide.  I believe
Vorbis does the full "line-by-line calculation" (it computes maskings for
every single spectral line/coefficient).  But the energies/maskings of
individual MDCT coefficients only make sense when some kind of
averaging/smoothing is done, so this is probably not an advantage.
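
As a rough illustration of the partition-band idea, the sketch below sums
the energies of individual MDCT lines into bands of equal width on the
Bark scale.  It is not code from gpsycho or Vorbis: the function names
are made up, the Hz-to-Bark conversion is the common Zwicker/Terhardt
approximation, and the band width is left as a parameter.

```python
import math


def hz_to_bark(freq_hz):
    """Zwicker/Terhardt approximation of the Bark scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def partition_energies(coeffs, sample_rate, band_width_bark):
    """Sum per-line energies into partitions of equal Bark width.

    coeffs holds the MDCT coefficients of one block (hypothetical input).
    Returns the partition energies ordered from low to high frequency.
    """
    n = len(coeffs)
    nyquist = sample_rate / 2.0
    bands = {}
    for k, c in enumerate(coeffs):
        # Centre frequency of MDCT line k.
        freq = (k + 0.5) * nyquist / n
        band = int(hz_to_bark(freq) / band_width_bark)
        bands[band] = bands.get(band, 0.0) + c * c
    return [bands[b] for b in sorted(bands)]


# 1024 flat coefficients, grouped into 1-Bark bands for illustration.
energies = partition_energies([1.0] * 1024, 44100, 1.0)
print(len(energies))
```

The summing is itself a crude form of the averaging mentioned above: the
total energy is preserved, but the line-to-line fluctuation is gone.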

Another difference: IIRC, Vorbis has a hand-crafted spreading function
which Monty derived from actual psychoacoustic data.  gpsycho
just uses a simple formula coming from the ISO AAC docs.
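
For readers who have not met a formula-based spreading function, here is
a small sketch.  Note that it uses the well-known Schroeder spreading
function as a stand-in: it is neither the exact ISO formula gpsycho uses
nor Monty's curves, and the function names and band layout are
assumptions for illustration only.

```python
import math


def schroeder_spread_db(dz):
    """Schroeder spreading function in dB.

    dz is the distance in Bark from masker to maskee
    (positive = maskee lies above the masker in frequency).
    """
    return (15.81 + 7.5 * (dz + 0.474)
            - 17.5 * math.sqrt(1.0 + (dz + 0.474) ** 2))


def spread_energies(band_energies, band_width_bark):
    """Spread each band's energy over its neighbours (power domain)."""
    n = len(band_energies)
    out = [0.0] * n
    for target in range(n):
        for masker in range(n):
            dz = (target - masker) * band_width_bark
            gain = 10.0 ** (schroeder_spread_db(dz) / 10.0)
            out[target] += band_energies[masker] * gain
    return out


# A single masker in the middle band, with bands assumed 1 Bark wide.
print(spread_energies([0.0, 0.0, 1.0, 0.0, 0.0], 1.0))
```

The result is asymmetric: more energy spreads to the bands above the
masker than to those below it, which is the usual upward spread of
masking.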

Mark

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/