[vorbis-dev] Lossless compression

Sun Aug 15 21:37:18 PDT 1999

> I was just somewhat curious as to what lossless compression is used... is
> it just huffman?

"I'm getting to that part" :-)  Let me answer way more than you asked...

The back-end encoder for Vorbis is codeword based;  an n-bit number is mapped
to an m-bit number (or vector of numbers).  Compression works when n is
generally less than m.  Vorbis allows the codebook to be packed into the
bitstream header.  Thus the encoder is allowed to decide what special case of
the geenral codeword mechanism to use.

With a little thinking, you can see that huffman coding is just a specific case of using codewords where:

1) the encode and decode mappings are 1->1 (that is, every possible input value
has a corresponding unique output value and vice versa).  

2) Usually the input to a huffman coding is a scalar.

3) The encoded words are of variable length.  The huffman codebook generally
comprises a codeword tree optimized to representing the average data set in the
minimum integer number of bits.

If fact, this is not so different from VQ (vector quantization); one simply plugs different numbers in.  In VQ:

1) Encoding is usually N->1; that is, groups of similar input values are represented by a single codeword.

2) Input to VQ encoding is usually a small vector (N values instead of one).

3) The encoded words are usually a fixed length.

When you build the encoding/decoding engine, you can end up with an engine that
handles both (and several actually more useful variants in between) if you pay
attention the the old CS maxim, "There are only three numbers allowed in
programming.  0, 1 and N."

(This is glosing over alot of implementation details of course).

Monty

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/