[flac-dev] Autocorrelation precision insufficient

Mon Jun 28 06:34:12 UTC 2021

On Mon 28 Jun 2021 at 04:59, Brian Willoughby <brianw at audiobanshee.com> wrote:
> I'd say that it's important to maintain compatibility with players that have
> limited capabilities. I don't actually know which levels of compression the
> SD702 supports, and which levels are too extreme, but whatever the
> threshold is, then I would expect that future versions of FLAC would not stop
> being compatible with the SD702 when the same compression level is used.

No worries, this change does not affect decoding complexity.

> I've also never understood the mathematics behind a compression format
> where the encoder can vary immensely in quality - even to the point of
> exceeding original compression ratios - but the decoder remains compatible.
> Witness MP3, where they improved the algorithm of the encoder and produced
> smaller files, all without requiring an update to the decoder. I'm assuming that
> FLAC fits that same model.

FLAC's inner workings are not very difficult, so I will explain. FLAC does
encoding in two steps. First, it tries to find a model to fit the audio data.
As a model is never exact, the second step involves storing the difference
between the input and the model (called the residual) in an efficient way.
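
To make the two steps concrete, here is a minimal sketch in Python. It is
not libFLAC code: the function names are made up for illustration, the model
is a simple order-2 fixed predictor (a straight line through the last two
samples), and the residual is kept as a plain list instead of being entropy
coded.

```python
# Illustrative sketch only; names are hypothetical, not libFLAC's API.

def predict(history):
    # Order-2 fixed predictor: extrapolate a line through the last two samples.
    return 2 * history[-1] - history[-2]

def encode(samples):
    # Step 1: the "model" is the predictor above.
    # Step 2: store what the model gets wrong (the residual).
    warmup = samples[:2]  # the first two samples are stored verbatim
    residual = [samples[i] - predict(samples[:i])
                for i in range(2, len(samples))]
    return warmup, residual

def decode(warmup, residual):
    # The decoder runs the same predictor and adds the residual back.
    out = list(warmup)
    for r in residual:
        out.append(predict(out) + r)
    return out

samples = [10, 12, 15, 19, 24, 30, 37, 45]
warmup, residual = encode(samples)
assert decode(warmup, residual) == samples  # lossless round trip
```

For this input every residual value is 1, which is much cheaper to store
than the samples themselves.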

How this model and residual are stored and decoded is part of the FLAC
specification, and this doesn't change. The only change to the specification
after FLAC version 1.0 concerned 24-bit files and was the introduction of a
different residual encoding method in version 1.2.0, in 2007. No other
changes to the format and thus decoding were introduced after 1.0.

The different compression levels increase one of two things: how much
effort to spend finding the best model, and how complex a predictor to use.
The first, the effort spent finding the best predictor, slows down encoding
but not decoding. These are options like adding apodizations and exhaustive
model order and precision search. With these settings, the encoder tries
various models and picks the one that fits best. The second, how complex a
predictor to use, does affect decoding performance. There are only three
parameters here: whether to use a fixed or non-fixed predictor, how large the
maximum model order is, and whether mid-side encoding is used.
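
The difference between "more search effort" and "a more complex predictor"
can be sketched as follows. This is a simplified illustration in Python, not
the libFLAC encoder: it only searches over the fixed predictor orders 0 to 4,
and `cost` (the sum of absolute residual values) is a crude stand-in I chose
for the real size estimate. All names are hypothetical.

```python
# Illustrative sketch only; names and the cost measure are assumptions.

# Coefficients of the fixed predictors, most recent sample first.
FIXED_COEFFS = {
    0: [],
    1: [1],
    2: [2, -1],
    3: [3, -3, 1],
    4: [4, -6, 4, -1],
}

def predict(coeffs, history):
    return sum(c * history[-1 - k] for k, c in enumerate(coeffs))

def residual_for(order, samples):
    coeffs = FIXED_COEFFS[order]
    return [samples[i] - predict(coeffs, samples[:i])
            for i in range(order, len(samples))]

def cost(residual):
    # Crude stand-in for the number of bits the residual would need.
    return sum(abs(r) for r in residual)

def encode(samples, orders_to_try):
    # More candidates means more encoder work, nothing else.
    best = min(orders_to_try, key=lambda o: cost(residual_for(o, samples)))
    return best, samples[:best], residual_for(best, samples)

def decode(order, warmup, residual):
    # The decoder never searches; it only applies the order it is told.
    coeffs = FIXED_COEFFS[order]
    out = list(warmup)
    for r in residual:
        out.append(predict(coeffs, out) + r)
    return out

samples = [n * n for n in range(12)]
quick = encode(samples, [0, 1])                 # little search effort
exhaustive = encode(samples, [0, 1, 2, 3, 4])   # try every order

assert decode(*quick) == samples
assert decode(*exhaustive) == samples
assert cost(exhaustive[2]) < cost(quick[2])
```

Both results decode to the same samples with the same `decode` function. The
exhaustive search costs the encoder more time and finds a smaller residual;
the decoder only does more work per sample when the chosen order is higher.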

The change I propose is a change in the way the encoder does the modelling.
When this modelling is done with more precision, this slows down encoding,
but it does not change anything in the way decoding is done. It just changes the
'representation' that needs to be decoded, but this decoding process does not
become more complex.
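
A small sketch of that last point, again in Python and again not libFLAC
code. The two coefficient sets below are picked by hand to stand in for a
less precise and a more precise analysis of the same signal; the real encoder
derives its coefficients differently. `SHIFT` and the function names are
assumptions made for this example.

```python
# Illustrative sketch only; coefficient values and names are hypothetical.
import math

SHIFT = 10  # coefficients are stored as integers scaled by 2**SHIFT

def predict(qcoeffs, history):
    # qcoeffs[0] weights the most recent sample; integer arithmetic only.
    acc = sum(c * history[-1 - k] for k, c in enumerate(qcoeffs))
    return acc >> SHIFT

def encode(samples, qcoeffs):
    order = len(qcoeffs)
    residual = [samples[i] - predict(qcoeffs, samples[:i])
                for i in range(order, len(samples))]
    return samples[:order], residual

def decode(warmup, residual, qcoeffs):
    # Same code path and same amount of work for any coefficient values.
    out = list(warmup)
    for r in residual:
        out.append(predict(qcoeffs, out) + r)
    return out

samples = [round(1000 * math.sin(0.3 * n)) for n in range(64)]

# Stand-in for a less precise analysis: about 1.80 and -1.00.
coarse = [1843, -1024]
# Stand-in for a more precise analysis: 2*cos(0.3) and -1.00, scaled.
fine = [round(2 * math.cos(0.3) * (1 << SHIFT)), -(1 << SHIFT)]

for qcoeffs in (coarse, fine):
    warmup, residual = encode(samples, qcoeffs)
    assert decode(warmup, residual, qcoeffs) == samples
    print(qcoeffs, sum(abs(r) for r in residual))
```

Both coefficient sets have the same order and go through the same `decode`
function, so the decoder's work is identical. Only the numbers it is handed
differ, and the better-fitting set leaves a smaller residual to store.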