[flac-dev] Autocorrelation precision insufficient

Mon Jun 28 09:59:13 UTC 2021

On Mon 28 Jun 2021 at 09:45, Brian Willoughby <brianw at audiobanshee.com> wrote:
> 1) What would affect the ability of an embedded FLAC firmware to decode certain files?
>
> Is it possible that the SD702 is not capable of processing complex predictors of higher orders?

I cannot say for this specific case. If you encode a file once with
each of the settings -0 through -8 and try playing the results, it
would be possible to draw conclusions. Have you tried this already?
The manual of the product you mention does not seem to list any
limitations, and according to several sources, FLAC decoding is much
easier than MP3 decoding on most embedded devices. See
https://www.rockbox.org/wiki/CodecPerformanceComparison
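
A minimal sketch of how such a set of test files could be produced,
assuming the reference `flac` command-line tool is on the PATH. The
function names and the output file names are made up for this example;
only the options -0 through -8, -f (overwrite) and -o (output name)
are taken from the flac tool itself.

```python
import subprocess


def encode_commands(wav_path, levels=range(9)):
    """Build one flac command line per compression level (-0 .. -8)."""
    commands = []
    for level in levels:
        # Hypothetical output name, one file per level.
        out_name = f"test-level{level}.flac"
        commands.append(["flac", f"-{level}", "-f", "-o", out_name, wav_path])
    return commands


def encode_all(wav_path):
    """Run every command; raises CalledProcessError if flac fails."""
    for cmd in encode_commands(wav_path):
        subprocess.run(cmd, check=True)
```

Calling `encode_all("input.wav")` would leave nine files to copy to the
device, one per level.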

If only settings -0 and -3 play, the unit does not do mid-side
decoding. If only -0, -1 and -2 play, the unit only does fixed
predictor decoding. If only -0, -1, -2 and -3 play, the unit does not
support predictor orders larger than 6. If only -0, -1, -2, -3, -4, -5
and -6 play, the unit does not support predictor orders larger than 8.
If none of these work, perhaps the unit only plays files it made
itself.
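
The rules above can be written down as a small lookup, sketched here
in Python. It only transcribes the cases listed in this message; the
names `RULES` and `diagnose` are made up for the example, and any
other combination of playable levels is reported as inconclusive
rather than guessed at.

```python
# Each key is the exact set of levels that play; the value is the
# conclusion drawn from it in the message above.
RULES = {
    frozenset(): "perhaps the unit only plays files it made itself",
    frozenset({0, 3}): "no mid-side decoding",
    frozenset({0, 1, 2}): "fixed predictor decoding only",
    frozenset({0, 1, 2, 3}): "no predictor orders larger than 6",
    frozenset(range(7)): "no predictor orders larger than 8",
    frozenset(range(9)): "no limitation found by this test",
}


def diagnose(playable_levels):
    """Map the set of levels that play to a conclusion, if one is known."""
    key = frozenset(playable_levels)
    return RULES.get(key, "inconclusive, pattern not covered by the rules")
```

For example, `diagnose({0, 3})` returns "no mid-side decoding".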

> 2) Is there a strict definition of the 9 levels of FLAC coding such that the level would
> precisely determine the predictor?
>
> I haven't looked at the source in ages, but it seems like you're saying that the compression
> level might map to exact choices for model/predictor.

There is no definition of the levels in the format specification; the
reference encoder's levels just haven't changed in all these years.
The *only* change in the last 20 years was that level 7 went from max
order 8 to max order 12 in version 1.3.1. However, other FLAC encoders
like Flake and FLACCL define levels differently.

> 3) Can you explain why the encoded file should be exactly the same across different
> processors or implementations? I'm still fuzzy on the details, but it seems like the
> exact bits could still vary even with the same model - i.e. "exhaustive" implies that
> different implementations might stop at different points along the way.

The change I propose is switching the autocorrelation calculation from
single-precision floats to double-precision floats. However, it is
impossible to implement this change in the current SSE code, because
SSE cannot work with double-precision floats. To use double precision,
SSE2 is needed. My question was: should I replace the SSE routines
with SSE2 routines, or should I keep the SSE routines as well? The
choice for platforms with SSE but without SSE2 is between speed and
compression. If the SSE routines are kept, the speed will stay the
same but there won't be a compression improvement. If the SSE routines
are dropped, the regular code that doesn't use any special extensions
is used, so doubles can be used. This results in a large speed
decrease but compression improves.
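
To illustrate why the accumulator precision matters, here is a small
Python sketch. It is not the libFLAC code: the function names are made
up, windowing is left out, and single precision is imitated by
rounding every product and every sum through a 4-byte float with the
`struct` module.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def to_f64(x):
    """Python floats are already doubles, so no extra rounding happens."""
    return float(x)


def autocorrelation(samples, max_lag, rnd):
    """Return [r(0), ..., r(max_lag)], rounding each product and sum with rnd."""
    data = [rnd(s) for s in samples]
    result = []
    for lag in range(max_lag + 1):
        acc = 0.0
        for i in range(lag, len(data)):
            acc = rnd(acc + rnd(data[i] * data[i - lag]))
        result.append(acc)
    return result


def exact_autocorrelation(samples, max_lag):
    """Reference result in exact integer arithmetic."""
    return [
        sum(samples[i] * samples[i - lag] for i in range(lag, len(samples)))
        for lag in range(max_lag + 1)
    ]
```

Even for the two-sample input [4097, 4097], the lag-0 value comes out
as 33570816 with `to_f32`, while `to_f64` and the exact integer sum
both give 33570818. A single precision float has a 24-bit significand,
so the product 4097 * 4097 is already rounded before it is added.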

If the SSE code is kept around and the VSX code is updated as well,
all platforms (ARM, PPC, x86, etc.) will produce almost exactly the
same result when compressing a file with a certain setting, except
those with SSE but without SSE2.