[vorbis-dev] Monty on holiday

Gregory Maxwell greg at linuxpower.cx
Tue Jun 6 12:07:29 PDT 2000



On Tue, 6 Jun 2000, Willmore, David (VS Central) wrote:

> Okay, before I say anything too stupid.  Could someone point me to a
> reference for how the current VQ scheme works.  (*mumble* or maybe what VQ
> even means.....)

Quantization (at least in this context) is the process of
representing information more coarsely than the original. E.g. 1.435 could be
quantized to 1. In this process, information is typically lost.

Vector quantization is similar, but instead of one value we operate on
a group of values.

Quantization can be performed using a codebook. This is where you have a
table of values, and you return an index to the closest match. For example:

If you have a dataset of

1.5 2.5
2.5 6.0
2.0 6.0
2.0 2.5

and a codebook of 

1.75 2.5
2.25 6.0

You would then code the data as

0
1
1
0
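
To make that concrete, here's a minimal C sketch of the nearest-match
search (hypothetical code, not the actual Vorbis API):

/* Return the index of the codebook entry nearest (in squared
 * Euclidean distance) to the input vector. */
int vq_nearest(const float *vec, const float *book, int entries, int dim)
{
    int best = 0;
    float best_dist = 1e30f;
    for (int i = 0; i < entries; i++) {
        float dist = 0.f;
        for (int j = 0; j < dim; j++) {
            float d = vec[j] - book[i * dim + j];
            dist += d * d;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return best;
}

Run over the four input vectors above with the two-entry codebook, this
returns exactly the indices 0 1 1 0.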

In Vorbis, we also have the ability to 'cascade' books, i.e. take the
quantization error from one book and emit additional correction words
(either multiplicative or additive).
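
A rough sketch of an additive cascade, reusing the hypothetical
vq_nearest() above (the multiplicative case would multiply a correction
factor instead of adding an error term):

/* Quantize with book0, then quantize the remaining error with
 * book1; both indices are emitted. The decoder reconstructs
 * book0[idx0] + book1[idx1]. */
void vq_cascade(const float *vec, const float *book0, int n0,
                const float *book1, int n1, int dim,
                int *idx0, int *idx1)
{
    float err[64];                        /* assumes dim <= 64 */
    *idx0 = vq_nearest(vec, book0, n0, dim);
    for (int j = 0; j < dim; j++)         /* first-stage error */
        err[j] = vec[j] - book0[*idx0 * dim + j];
    *idx1 = vq_nearest(err, book1, n1, dim);
}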

The codebooks are packed at the beginning of the Ogg file, and can differ
from song to song.

The codeword length is variable; it is the result of a Huffman tree
created by counting hits against a test set (i.e. if all codebook entries
are equally likely, they will have equal codeword lengths).
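
For illustration, codeword lengths can be derived from the hit counts
with the usual merge-the-two-smallest Huffman construction; a compact
(hypothetical) C sketch:

/* Derive Huffman codeword lengths from hit counts: repeatedly
 * merge the two least-frequent live roots; each leaf's final
 * depth in the tree is its codeword length. Assumes n <= 64. */
void huff_lengths(const unsigned *counts, int n, int *lengths)
{
    unsigned weight[128];
    int parent[128];
    int nodes = n;
    for (int i = 0; i < n; i++) { weight[i] = counts[i]; parent[i] = -1; }
    for (int m = 0; m < n - 1; m++) {
        int a = -1, b = -1;
        for (int i = 0; i < nodes; i++) {   /* find two smallest roots */
            if (parent[i] != -1) continue;
            if (a == -1 || weight[i] < weight[a]) { b = a; a = i; }
            else if (b == -1 || weight[i] < weight[b]) b = i;
        }
        weight[nodes] = weight[a] + weight[b];
        parent[nodes] = -1;
        parent[a] = parent[b] = nodes++;
    }
    for (int i = 0; i < n; i++) {           /* leaf depth = length */
        int len = 0;
        for (int p = parent[i]; p != -1; p = parent[p]) len++;
        lengths[i] = len;
    }
}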

The codebook used to store the noise floor LSPs is trained to minimize
global error, while the residue books are trained with a much simpler
scheme (because the other method did not preserve uncommon features).

The LSP output looks like:

0.012 0.234 0.543 0.7328 0.9243 1.0234 1.235 1.5234
(always increasing; it's a property of LSPs)

The current vector codebooks are four wide (I believe), while long-block
LSPs are 32 values wide. The LSP vector is broken into subvectors for
encoding, each the same length as the vector codebook input.

The value of the last entry is subtracted from all the members of the next
subvector, i.e. the above becomes:

0.012 0.234 0.543 0.7328
.1915 .2906 .5022 .7906

This is what the codebook is trained against and encodes. It allows you
to keep the codebook small while still representing the data accurately.
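
A sketch of that preprocessing step (hypothetical names, subvector
width dim):

/* Split an LSP vector of n values into subvectors of width dim,
 * subtracting the previous subvector's last raw value from each
 * member of the next. Assumes dim divides n. */
void lsp_split(const float *lsp, int n, int dim, float *out)
{
    float last = 0.f;
    for (int i = 0; i < n; i += dim) {
        for (int j = 0; j < dim; j++)
            out[i + j] = lsp[i + j] - last;
        last = lsp[i + dim - 1];    /* last raw value of this subvector */
    }
}

Applied to the eight LSP values above with dim = 4, this yields exactly
the two rows shown. Because LSPs always increase, every subvector then
starts near zero, so one small codebook covers all of them.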

The residue codebook is a bit different. It uses a creative
amplitude & entropy metric to segment the residue into 64-entry groups,
which are encoded by different books depending on their amplitude and
entropy.
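
The exact metric isn't spelled out here, but the shape of the idea can
be sketched: measure each 64-entry group's peak amplitude and something
entropy-like, then pick whichever book was trained for that class.
Purely illustrative; the flatness proxy and thresholds below are
invented, not the actual Vorbis classifier:

#include <math.h>

/* Classify a 64-entry residue group by peak amplitude and a
 * crude "entropy" proxy (mean absolute value relative to the
 * peak), returning an index into one of four books. */
int classify_group(const float *res, int n,
                   float amp_thresh, float flat_thresh)
{
    float peak = 0.f, sum = 0.f;
    for (int i = 0; i < n; i++) {
        float a = fabsf(res[i]);
        if (a > peak) peak = a;
        sum += a;
    }
    float flatness = (peak > 0.f) ? (sum / n) / peak : 0.f;
    int loud = (peak > amp_thresh);
    int busy = (flatness > flat_thresh);
    return (loud << 1) | busy;
}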

Enough information? :P

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/


