[vorbis-dev] Transient coding: AAC vs. Vorbis

Wed Jun 2 10:45:53 PDT 2004

Thread-split from the vorbis-mailing list
("Vorbis determined to be as good as MPC at 128 kbps!")

On Sun, 30 May 2004, Segher Boessenkool wrote:

[Steven So]
SS>> If iTunes AAC can encode castanets with much less pre-echo at
SS>> ABR 128 kbps, then hopefully there will be an imaginative
SS>> (and non-patented) way of doing this in Vorbis without the
SS>> bitrate inflation of GTune and QKTune.

[Segher Boessenkool]
SB> Use some different transform?  MDCT isn't the best audio transform
SB> ever invented, esp. not for non-steady waveforms.

Steven is talking about Vorbis, Segher.
Vorbis makes use of the MDCT.

Let's see... Vorbis I versus AAC in transient coding...
(simplified ASCII art following)

audio wave  ('-'=low volume, <!>=transient )
--------------------------<!>--------------------

AAC
+---------------+---------------+---------------+
|       1       |       2       |       3       | frame no.
+---------------+-+-+-+-+-+-+-+-+---------------+
|       L       |S|S|S|S|S|S|S|S|       L       | transform
+---------------+-+-+-+-+-+-+-+-+---------------+
|       A       |    B    |C| D |       E       | scalefactor sets
+---------------+---------+-+---+---------------+

Vorbis I
+---------------+-+-+-+-+-+-+---------------+----
|       1       |2|3|4|5|6|7|       8       |     packet no.
+---------------+-+-+-+-+-+-+---------------+----
|       L       |S|S|S|S|S|S|       L       |     transform
+---------------+-+-+-+-+-+-+---------------+----
|       F       |G|H|I|J|K|L|       M       |     floor curves
+---------------+-+-+-+-+-+-+---------------+----

Vorbis II (proposal, see below)
+---------------+---------+-+---------------+----
|       1       |    2    |3|       4       |     packet no.
+---------------+-+-+-+-+-+-+---------------+----
|       L       |S|S|S|S|S|S|       L       |     transform
+---------------+-+-+-+-+-+-+---------------+----
|       N       |    O    |P|       Q       |     floor curves
+---------------+---------+-+---------------+----

L   = long transform
S   = short transform
A-E = sets of scalefactors (AAC)
F-M = floor curves (Vorbis I)
N-Q = floor curves (Vorbis II)

Obviously Vorbis I is wasting space in this example by
coding 5 floor curves (G-K) that are very similar.
AAC *shares* the scalefactor set B among the corresponding
5 short windows, thus saving space.

Vorbis II could allow storing multiple 'short' MDCT
spectra (at most blocksize1/blocksize0 of them) in one
packet, all sharing ONE floor curve.
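
As a rough illustration of the grouping decision an encoder
would face, here is a minimal Python sketch. Everything in it
is hypothetical: the function name group_short_blocks, the
tolerance in dB and the greedy rule are assumptions made for
illustration, not part of Vorbis I, libvorbis or any actual
Vorbis II draft. It merges consecutive short blocks whose floor
curves stay within a tolerance of the first curve in the group.

```python
def group_short_blocks(floors, tol_db):
    """Greedily group consecutive short blocks with similar floor curves.

    floors : list of floor curves, each a list of per-band levels in dB
    tol_db : maximum per-band deviation (dB) from the group's first curve
    Returns a list of groups, each a list of block indices.
    NOTE: hypothetical helper, not an API of any real codec library.
    """
    groups = []
    for index, floor in enumerate(floors):
        if groups:
            # Compare against the first floor curve of the current group.
            reference = floors[groups[-1][0]]
            deviation = max(abs(a - b) for a, b in zip(floor, reference))
            if deviation <= tol_db:
                groups[-1].append(index)
                continue
        # Too different (or first block): start a new group.
        groups.append([index])
    return groups


# Six short blocks with three bands each; the dB values are invented.
floors = [
    [-60.0, -62.0, -65.0],
    [-60.5, -61.5, -65.0],
    [-59.0, -62.0, -64.0],
    [-61.0, -63.0, -66.0],
    [-60.0, -61.0, -65.5],
    [-20.0, -18.0, -25.0],  # block containing the transient
]
groups = group_short_blocks(floors, tol_db=2.0)
print(groups)  # [[0, 1, 2, 3, 4], [5]]
```

With these made-up numbers the result matches the picture
above: five quiet blocks share one floor curve (O) and the
transient block gets its own (P).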

It may also be worth the effort to encode the channel's
residue vectors as one big vector (per channel) by
interleaving. I think this will also improve coding
efficiency a bit. As a side effect there will be the need
for more residue configurations since the size of the residue
vectors can be 1*128, 2*128, 3*128, ..., 7*128 and 8*128=1024.
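
A small Python sketch of what such interleaving could look
like. The exact ordering is an assumption: coefficients with
the same frequency index from successive short blocks are
placed next to each other, and the helper names
interleave_residues / deinterleave_residues are invented for
this example. The short-block length of 128 is the figure used
above.

```python
def interleave_residues(blocks):
    """Merge equally long residue vectors into one vector.

    Output order: bin 0 of every block, then bin 1 of every block, ...
    NOTE: hypothetical layout, not taken from any Vorbis specification.
    """
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all short blocks must have the same length")
    return [block[i] for i in range(length) for block in blocks]


def deinterleave_residues(vector, count):
    """Inverse of interleave_residues for `count` short blocks."""
    return [vector[j::count] for j in range(count)]


SHORT_LEN = 128  # coefficients per short block, as in the text above
# Five short blocks; value 1000*b + i marks block b, bin i.
blocks = [[1000 * b + i for i in range(SHORT_LEN)] for b in range(5)]
big = interleave_residues(blocks)
print(len(big))  # 640, i.e. 5*128
print(big[:6])   # [0, 1000, 2000, 3000, 4000, 1]
```

The decoder side only needs the number of short blocks in the
packet to undo the interleaving.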

Back to Vorbis I:

What can be done to minimize pre-echoes without increasing
the bitrate that much? How about temporal noise shaping?
"Impossible!", you may say. Well, TNS is not a built-in
Vorbis feature like it is in AAC. But it doesn't HAVE to be.

TNS can be done either by coding the MDCT spectrum by
1) LPC-Filter + quantized LPC residual
OR
2) using an NSQ (noise shaping quantizer)

The AAC format allows method 1. But method 2 could be done
for both (Vorbis and AAC) without breaking compatibility,
since it only changes how the encoder picks quantized values.
In fact, method 2 is used by MPC in the time domain to
shape the quantization noise within a subband to better
match the masking threshold.

An NSQ applied in the time domain can spectrally shape the
q-noise. What about an NSQ applied in the frequency domain?
What does it do? Well, because of the time/frequency duality
it will TEMPORALLY shape the q-noise. Et voilà!

That's the theory. I don't know how well this can be applied
in practice for Vorbis; that has to be investigated.
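
To make method 2 a bit more concrete, here is a minimal Python
sketch of an error-feedback quantizer run along the frequency
index of a block of coefficients. It is only an illustration
under assumptions: the function name, the uniform step size and
the first-order feedback filter (1.0,) are arbitrary choices,
not anything from Vorbis, AAC or MPC. A real encoder would
have to derive the filter from the desired temporal envelope,
and the aliasing of the MDCT makes the duality less clean than
for a plain Fourier transform.

```python
def noise_shaping_quantize(values, step, feedback=(1.0,)):
    """Uniform quantizer with error feedback (noise shaping).

    values   : coefficients, ordered by frequency index
    step     : quantizer step size
    feedback : filter taps applied to past quantization errors
    The total error at index k becomes e[k] - sum(h[j] * e[k-1-j]),
    i.e. the raw error e filtered along the frequency axis.
    NOTE: illustrative helper, not an API of any real codec.
    """
    history = [0.0] * len(feedback)  # past errors, most recent first
    output = []
    for value in values:
        # Subtract the filtered past errors before quantizing.
        shaped = value - sum(h * e for h, e in zip(feedback, history))
        quantized = step * round(shaped / step)
        error = quantized - shaped
        output.append(quantized)
        history = [error] + history[:-1]
    return output


# Invented coefficient values, just to exercise the quantizer.
coeffs = [0.30, -1.20, 2.75, 0.10, -0.40, 1.90, -2.20, 0.05]
step = 1.0
quantized = noise_shaping_quantize(coeffs, step)
total_errors = [q - c for q, c in zip(quantized, coeffs)]

# With the first-order filter the total error is e[k] - e[k-1], so the
# errors telescope and their sum is bounded by half a step.
print(abs(sum(total_errors)) <= step / 2 + 1e-9)  # True
```

The output still consists of ordinary quantized values, so the
decoder does not need to know that any shaping took place. The
difference filter along the frequency axis is what, by the
duality argument above, should distribute the noise unevenly
over time within the block.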

Ghis!
Sebastian

--
PGP-Key-ID (long): 572B1778A4CA0707
