[vorbis-dev] Transient coding: AAC vs. Vorbis

Thu Jun 10 08:53:45 PDT 2004

>> Vorbis makes use of any transform you want.  Currently there's
>> only one transform defined, and that's the MDCT, sure.  That
>> doesn't mean we're stuck with it forever.
>
> in case of Vorbis I, yes. That's what I meant.

You *can* have different transforms in Vorbis I files; they just
won't be "standard", and so can't be decoded by standard decoders.

But there's nothing stopping you from experimenting with new
transforms, if you want -- no need to wait for Vorbis II.

>> Sharing the floors decreases the space needed for the
>> floors, but increases the space needed for the
>> residues.
>
> it depends on how similar these floor curves would be.
> the possibility of sharing the curves of several small
> transform blocks is IMHO an advantage and an easy-to-do
> thing for Vorbis II. It can be used for a more compact
> representation without introducing more complexity.

I'm not saying it can't help; it just requires some
experimenting.

The residue typically takes up much more space than the
floor, which suggests that more exact per-block floors are
probably advantageous.  On the other hand, in the higher
frequency bands, for example, the residue data is a) almost
pure noise and b) very often only ones and zeroes, so sharing
floor curves there might help.

If you have time, you might want to do some testing...
Full tuning will take a lot longer, though :-(

>> So this is a tradeoff.  Also, deciding
>> per group of packets if floors should be shared again
>> wastes a few bits; more tradeoffs.
>
> i wouldn't say these bits are wasted because these
> bits are used as a replacement for multiple floor curves.

s/wastes/uses/ , okay.

> in case you only want to use short-transform packets which only
> contain one MDCT spectrum per channel (Vorbis I style) this approach
> only 'wastes' _ONE_ bit per short-transform-packet.

And at 32 kbps/channel you only have _NINETY-THREE_ bits per such
a block _total_.
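
For anyone wanting to check that number: here is a rough sketch of
the arithmetic in Python.  The 44.1 kHz sample rate and the
256-sample short window with 50% overlap (so 128 new samples per
block) are assumptions on my part; they are not stated above, but
they do reproduce the figure.

```python
# Rough bit budget per short block at a fixed per-channel bitrate.
# Assumptions (not stated in the thread): 44.1 kHz sampling and a
# 256-sample short window with 50% overlap, so each block advances
# the stream by 128 new samples.

def bits_per_block(bitrate_bps, sample_rate_hz, hop_samples):
    """Average number of bits available per block, per channel."""
    return bitrate_bps * hop_samples / sample_rate_hz

budget = bits_per_block(32000, 44100, 128)
flag_share = 1 / budget  # fraction of the budget used by a one-bit flag

print(round(budget))        # -> 93
print(f"{flag_share:.1%}")  # -> 1.1%
```

So under these assumptions a single flag bit costs about one percent
of the block; small, but not free at this bitrate.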

> if some floor curves would be very similar (not unlikely) you can
> save one floor encoding for another bit.
> the overhead is really really small compared to the savings you
> can get.

You can also code the difference between two floors, if they are
very similar.  Or do either of these schemes for the high
frequency bands only (or any other subset of the frequencies).
etc. etc.  More experimenting is needed!
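
To make the "code the difference" idea concrete, a minimal sketch in
Python.  Everything here is hypothetical: the floor values are made
up, the function names are mine, and the cost function is only a
crude proxy (sign bit plus magnitude bits), not how Vorbis actually
codes floors with codebooks.

```python
# Hypothetical sketch: code a floor curve as the per-point difference
# from a reference floor (e.g. the previous block's) instead of
# coding it outright.  Values are made-up quantized amplitudes.

def delta_encode(reference, floor):
    """Per-point difference between a floor and a reference floor."""
    if len(reference) != len(floor):
        raise ValueError("floors must have the same number of points")
    return [f - r for r, f in zip(reference, floor)]

def delta_decode(reference, deltas):
    """Rebuild the floor; note that the reference floor is required."""
    return [r + d for r, d in zip(reference, deltas)]

def crude_cost_bits(values):
    """Crude cost proxy: one sign bit plus the magnitude bits per value."""
    return sum(1 + abs(v).bit_length() for v in values)

prev_floor = [60, 58, 55, 50, 44, 40, 31, 20]
this_floor = [61, 58, 54, 50, 45, 40, 30, 20]

deltas = delta_encode(prev_floor, this_floor)
assert delta_decode(prev_floor, deltas) == this_floor

print(crude_cost_bits(this_floor), crude_cost_bits(deltas))
```

With similar floors the deltas are mostly 0 and +/-1, so the proxy
cost drops a lot; with dissimilar floors it can come out worse than
coding the floor directly.  To restrict this to the high frequency
bands, apply it to a slice of the points only.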

> do you have any comments like "yes, i think it's a good idea"
> or "no, i don't like the idea because ..." ? :)

"Yes, this might save some bits; no, I don't think it's as easy
as this".

It can cause bad (i.e., unnatural-sounding) artifacts, because
sharing one floor across several blocks is a kind of blocking.

Also, quite importantly for Vorbis, audio packets won't be
independently decodable anymore.  You'll only need to go
back a few packets to be able to decode one, but...

Have fun,

Segher