[flac-dev] Looking for users of --keep-foreign-metadata

brianw brianw at audiobanshee.com
Mon Oct 31 00:37:32 UTC 2022

I use this feature whenever archiving audio files created by specific hardware, where the metadata might be important to retain for the future. There are also specific software examples where added metadata is important to preserve.

I have not looked into the details, so I can't give an opinion about "complete" without first doing some research.

In general, though:

- Preserving meta data should ideally work for both WAV and AIFF (although it's not possible to translate meta data from one variant to the other).

- Generally accepted industry practice holds that preserving every type of chunk is problematic when the audio is edited. For example, a visual overview or a cue list can easily become incorrect if the audio changes. Of course, some chunks can always be preserved, like original recording date, name of engineer, etc., but FLAC cannot know how to distinguish between the two types. My opinion is that FLAC should not suffer from this category of issues, because the compression specifically leaves the audio unchanged. Thus, all meta data can be preserved without risk, since the audio is also preserved exactly.

- However, since FLAC is highly optimized for compressing audio, I believe that meta data preservation should not even attempt to compress non-audio chunks. Since meta data preservation is optional, the lack of compression should not be a concern for users who do not enable the option.

- Related to preserving meta data without compression, I believe that FLAC should be completely unconcerned about the validity of the meta data that it preserves. As long as the chunk length is valid (i.e. does not exceed the bounds of the uncompressed WAV/AIFF file), it is completely irrelevant to FLAC as to whether the contents of those extra chunks are valid or not.

- The philosophy of FLAC should be to focus on lossless audio compression, with only minimal support for preserving the non-audio parts of files. That is, I do not support adding a huge amount of complexity to the FLAC library just to support esoteric details of non-FLAC formats.
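
To illustrate the "only check the chunk length" idea from the list above, here is a minimal sketch in Python. It is not FLAC's code: the function names split_riff_chunks and make_chunk are hypothetical, and it assumes a plain RIFF/WAVE file with 32-bit little-endian chunk sizes held entirely in memory. Payloads are copied as opaque bytes and never interpreted.

```python
import struct


def make_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Build one RIFF chunk (hypothetical helper, used only for the demo)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def split_riff_chunks(data: bytes):
    """Split RIFF/WAVE bytes into (chunk_id, payload) pairs.

    Only the chunk lengths are validated against the file bounds;
    the payload contents are never inspected.
    """
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        start = pos + 8
        end = start + size
        if end > len(data):
            raise ValueError(f"chunk {chunk_id!r} exceeds file bounds")
        chunks.append((chunk_id, data[start:end]))
        pos = end + (size & 1)  # chunks are padded to an even length
    return chunks


if __name__ == "__main__":
    # A tiny synthetic file: the b"junk" payload is deliberately meaningless.
    body = (make_chunk(b"fmt ", bytes(16))
            + make_chunk(b"junk", b"\xff\xfe\x00")
            + make_chunk(b"data", b"\x01\x02"))
    wav = b"RIFF" + struct.pack("<I", 4 + len(body)) + b"WAVE" + body
    for chunk_id, payload in split_riff_chunks(wav):
        print(chunk_id, len(payload))
```

The only failure mode is a chunk that runs past the end of the file; anything else, valid or not, is carried along unchanged.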

Reading your email question, Martijn, I think you might be asking whether FLAC should always preserve meta data - "restore ... bit-for-bit ... something that FLAC should strive for" - or leave this as an optional feature, so I want to comment on the two scenarios:

A) When no options are provided to FLAC, it should not preserve the entire input file, bit-for-bit. Only the audio should be preserved bit-for-bit (and checksummed, as a proof of bit-for-bit correctness). In this scenario, a WAV file could be compressed to FLAC, and decoded as AIFF - or an AIFF file could be compressed to FLAC and decoded as WAV. This use-case is for those who want to preserve the audio, and who are not really concerned about the container.

B) When --keep-foreign-metadata is added as an option, I believe FLAC should strive to preserve all bits of the input file, and I believe that the feature should allow that (although perhaps the code might need some modifications). This thread has brought some potential errors to light that I may not have noticed. (*)


p.s. I like the new option --keep-foreign-metadata-if-present (although I might have preferred --restore-foreign-metadata-if-present). I'd like to agree with Federico, except I believe there's just as much danger to end-users if the original option, --keep-foreign-metadata, were to create a file without meta data. My reasoning is that users might not always read the warning messages - partly because some users do not see the command-line through their GUI front end, and partly because FLAC is already fairly verbose when decoding and there are already several warnings that need to be ignored. Since the current behavior fails to create an output file when meta data is missing on the input, and since running the command again without the --keep-foreign-metadata option is a quick change, I think it's best for FLAC to interpret the command-line options literally, and refuse to create an output file until the options are adjusted to fit the exact situation. This seems like the best way to give every user what they want, even though the use cases vary quite a bit for FLAC.

p.p.s. AIFF
I cannot remember whether --keep-foreign-metadata works with AIFF, but if it does not currently, then perhaps this is the time to add support (or test any existing support). The mechanism used for WAV will work equally for AIFF, since both use similar chunk size and chunk name headers.
The practical consideration is that there are fewer industry-crucial meta chunks in AIFF, so perhaps it's less useful to preserve everything. WAV has the BWF variation - which is basically a standard WAV with special chunks that promote the file to BWF status - making it important to preserve these chunks when compressing BWF. I am not aware of anything quite so special about AIFF chunks. That said, I have a few audio applications that put their own custom chunks in AIFF files, and it's useful to preserve those so that projects can be compressed, archived, and restored without loss of any special audio application additions.
I tend to use AIFF, but some hardware works better with BWF, so my use of FLAC covers both scenarios, with an eye towards bit-for-bit archival wherever possible.
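
As a rough sketch of why one mechanism can serve both formats: WAV and AIFF both store a four-byte chunk name followed by a 32-bit chunk size, and the main difference at this level is byte order (little-endian in RIFF/WAVE, big-endian in AIFF). The Python below is illustrative only and is not taken from FLAC; the names CONTAINERS and list_chunks are hypothetical, and it assumes the whole file is in memory.

```python
import struct

# Container magic -> (struct format of the 32-bit chunk size, allowed form types)
CONTAINERS = {
    b"RIFF": ("<I", (b"WAVE",)),          # WAV: little-endian sizes
    b"FORM": (">I", (b"AIFF", b"AIFC")),  # AIFF / AIFF-C: big-endian sizes
}


def list_chunks(data: bytes):
    """Return [(chunk_id, payload_offset, payload_size)] for WAV or AIFF bytes."""
    if len(data) < 12 or data[:4] not in CONTAINERS:
        raise ValueError("unrecognized container")
    size_fmt, form_types = CONTAINERS[data[:4]]
    if data[8:12] not in form_types:
        raise ValueError("unrecognized form type")
    found = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack_from(size_fmt, data, pos + 4)
        if pos + 8 + size > len(data):
            raise ValueError(f"chunk {chunk_id!r} exceeds file bounds")
        found.append((chunk_id, pos + 8, size))
        pos += 8 + size + (size & 1)  # both formats pad chunks to even length
    return found
```

Apart from the table lookup at the top, the loop is identical for both containers, which is the point made above about shared chunk size and chunk name headers.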

(*) I'm only now realizing that WAV files might not be perfectly preserved, due to the WAVE_FORMAT_PCM versus WAVE_FORMAT_EXTENSIBLE issue within FLAC. I've not looked at the details, but it seems that --keep-foreign-metadata should keep an additional copy of the original WAV header for optional restoration. This feature should work using the same technique used to preserve all other chunks, and the WAV header is small enough that the redundant copy should not pose a concern. I suppose it depends upon whether we consider it a bug when restoring a WAV with preserved metadata, if it's not bit-for-bit identical. I'm leaning towards making a little extra effort to allow bit-for-bit restoration. On the other hand, FLAC is not necessarily the place for extensive source code to manage problems with incomplete documentation of the specification, or aberrant software in the field. That could easily get out of hand, and belongs in other tools for audio files.
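
The header issue in this footnote can be sketched as follows. The format tag values 0x0001 (WAVE_FORMAT_PCM) and 0xFFFE (WAVE_FORMAT_EXTENSIBLE) are the standard ones; everything else here is a hypothetical illustration in Python, not FLAC's implementation. The idea is that two "fmt " payloads can describe the same channels, sample rate and bit depth while differing byte-for-byte, so a bit-exact restore needs the original payload kept alongside the parsed fields.

```python
import struct

WAVE_FORMAT_PCM = 0x0001
WAVE_FORMAT_EXTENSIBLE = 0xFFFE


def describe_fmt(payload: bytes) -> dict:
    """Parse the fixed 16-byte prefix of a WAV 'fmt ' payload.

    The raw payload is kept unchanged so it could be written back verbatim.
    """
    if len(payload) < 16:
        raise ValueError("fmt payload too short")
    tag, channels, rate, _avg_bytes, _align, bits = struct.unpack_from(
        "<HHIIHH", payload, 0)
    return {
        "format_tag": tag,
        "channels": channels,
        "sample_rate": rate,
        "bits_per_sample": bits,  # container size; EXTENSIBLE stores valid bits separately
        "extensible": tag == WAVE_FORMAT_EXTENSIBLE,
        "raw": payload,
    }


def same_audio_format(a: dict, b: dict) -> bool:
    """Compare only the fields that matter for decoding the samples."""
    keys = ("channels", "sample_rate", "bits_per_sample")
    return all(a[k] == b[k] for k in keys)
```

A decoder that regenerates the header from the parsed fields alone has to pick one of the two layouts, which is why an original written with the other layout would not come back bit-for-bit unless the raw bytes were retained.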

On Oct 30, 2022, at 7:06 AM, Martijn van Beurden <mvanb1 at gmail.com> wrote:
> Hi all,
> Currently I'm looking for users of the --keep-foreign-metadata feature
> of FLAC. There has been some improvement of this feature in FLAC
> 1.4.0. Since 2007 there has been a warning in FLAC that
> --keep-foreign-metadata is a new feature. I think removal of this
> warning is long overdue, but there are still some issues surrounding
> it.
> So, if there are users of this feature on the mailing list, could they
> perhaps speak up? Can this feature be considered 'complete'? Currently
> FLAC stores the top-level RIFF chunk and fmt chunk on encoding, but
> does not restore them on decoding, is this considered a problem or
> shortcoming?
> I know for example that WavPack will restore a WAVE file bit-for-bit,
> even if there is ambiguous or even invalid data stored in the format
> chunk. I don't think such behaviour is something that FLAC should
> strive for. The current behaviour of storing metadata that is not
> essential for decoding the file, for example CUE, LIST, bext chunks,
> is I think sufficient, but I would like to hear the opinion of people
> that actually use this feature.
> Kind regards,
> Martijn van Beurden
