[flac-dev] Two questions

Brian Willoughby brianw at audiobanshee.com
Thu Aug 5 03:56:16 UTC 2021


On Aug 4, 2021, at 00:10, Federico Miyara <fmiyara at fceia.unr.edu.ar> wrote:
> Brian,
> 
> Once more, thanks for taking the time to answer my questions and provide interesting insights. Some comments below.
>  
>> I recommend writing your own utility based on the FLAC library, in C, with the features you want. I do not recall any feature in the flac command line utility that would allow this. Your workaround is a reasonable attempt, but it seems to have too many undefined side-effects.
> 
> I'm not a C programmer, which I regret, but that's how things are.

Not to wax too philosophical, but nobody is born a C programmer. Things do not have to remain how they are!

However, Martijn has a great suggestion: Use the metaflac command line, and I think this will do a lot for you.


> My quest has to do with economy: I already have a function that can read the metadata from a WAV file, and the only information it requires is where the metadata starts (which I know for the WAV file). Exactly the same function can be used with a FLAC file, without using any codec, as long as I can find the beginning of the "local copy" of the metadata.

Be careful. Many audio programs have failed by attempting to hard-code the beginning of certain pieces of data in a file. A related failure is to assume the location of the end of data. Both failures can create audio glitches.

RIFF (WAV) and IFF/FORM (AIFF) files are made from a sequence of chunks, and most chunks can appear in any order. I recall that WAV is slightly more restrictive than AIFF, but I've certainly seen errors in WAV software due to fixed file offsets.
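
To illustrate scanning chunks rather than relying on fixed offsets, here is a minimal sketch in Python (chosen only for readability; the same logic carries over to C or Scilab). The names iter_riff_chunks and _chunk are hypothetical, and the tiny in-memory WAV is made up for the demonstration. It assumes a plain RIFF/WAVE file with a 32-bit size field.

```python
import io
import struct

def iter_riff_chunks(f):
    """Yield (chunk_id, data_offset, size) for each top-level chunk of a RIFF/WAVE file."""
    header = f.read(12)
    if len(header) < 12 or header[:4] != b"RIFF" or header[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        head = f.read(8)              # 4-byte chunk ID + 4-byte little-endian size
        if len(head) < 8:
            return                    # end of file
        chunk_id, size = struct.unpack("<4sI", head)
        data_offset = f.tell()
        yield chunk_id, data_offset, size
        # Skip the payload without reading it; chunks are padded to an even length.
        f.seek(data_offset + size + (size & 1))

def _chunk(cid, payload):
    """Build one chunk for the demonstration (hypothetical helper)."""
    pad = b"\x00" if len(payload) & 1 else b""
    return cid + struct.pack("<I", len(payload)) + payload + pad

# A made-up WAV in which a LIST chunk sits between 'fmt ' and 'data'.
body = (b"WAVE" + _chunk(b"fmt ", bytes(16))
        + _chunk(b"LIST", b"INFOabc") + _chunk(b"data", bytes(8)))
wav = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = list(iter_riff_chunks(io.BytesIO(wav)))
```

Because the scanner follows the size fields, it finds 'data' wherever it happens to be, which is exactly what a hard-coded offset cannot do.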

Similarly, FLAC files are a sequence of blocks. Your best bet is to create functions that can properly scan blocks, and then once the 'riff' block is found, you can use a WAV function to properly scan chunks.

But don't worry too much about economy. Your function only needs to read 4 bytes from each FLAC block in order to determine type and size. When you find an APPLICATION block, you can read 4 bytes to look for 'riff'. If an uninteresting block is found, then seek in the file according to the size of the block, and read 4 bytes to examine the next block. Proper code can scan even very large FLAC files very quickly by reading only 4 or 8 bytes at a time and then seeking over the uninteresting data rather than actually reading it from the file into memory.
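
A rough sketch of that scan, in Python for readability (the approach is the same in C or Scilab). The function names iter_flac_blocks and find_application_blocks are hypothetical, and the test stream is synthetic. The header layout (1 flag bit, 7-bit type, 24-bit big-endian length, with APPLICATION being type 2) is taken from the FLAC format specification. The sketch assumes the file starts directly with the fLaC marker, with no ID3 tag in front, and it collects every matching block rather than assuming there is only one.

```python
import io

APPLICATION = 2  # block type number from the FLAC format specification

def iter_flac_blocks(f):
    """Yield (block_type, data_offset, length) for each metadata block."""
    if f.read(4) != b"fLaC":
        raise ValueError("missing fLaC marker")
    last = False
    while not last:
        head = f.read(4)                       # the only bytes read per block
        if len(head) < 4:
            raise ValueError("truncated metadata block header")
        last = bool(head[0] & 0x80)            # top bit: last metadata block
        block_type = head[0] & 0x7F            # low 7 bits: block type
        length = int.from_bytes(head[1:4], "big")
        data_offset = f.tell()
        yield block_type, data_offset, length
        f.seek(data_offset + length)           # seek over the block body

def find_application_blocks(f, app_id=b"riff"):
    """Return (payload_offset, payload_length) for APPLICATION blocks with app_id."""
    found = []
    for block_type, offset, length in iter_flac_blocks(f):
        if block_type == APPLICATION and length >= 4:
            f.seek(offset)
            if f.read(4) == app_id:            # 4-byte application ID
                found.append((offset + 4, length - 4))
    return found

def _block(block_type, payload, last=False):
    """Build one metadata block for the demonstration (hypothetical helper)."""
    head = bytes([(0x80 if last else 0) | block_type]) + len(payload).to_bytes(3, "big")
    return head + payload

stream = (b"fLaC"
          + _block(0, bytes(34))                   # STREAMINFO (zeros, illustration only)
          + _block(4, bytes(40))                   # stand-in for a VORBIS_COMMENT
          + _block(2, b"riff" + b"RIFF....WAVE")   # APPLICATION 'riff'
          + _block(1, bytes(100), last=True))      # PADDING, flagged as last
hits = find_application_blocks(io.BytesIO(stream))
```

The loop stops at the block flagged as last, so it never touches the audio frames that follow the metadata.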

When writing C code, I just use the virtual memory support to map the entire audio file into memory, and then only read a few bytes. The virtual memory paging system will not read from disk except as necessary.
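
The same idea can be sketched outside of C. Below is a hedged Python illustration using the standard mmap module; the function name block_types_mmap is hypothetical and the temporary file is synthetic. Only the 4-byte block headers are indexed, so the operating system pages in just the parts of the file that are actually touched.

```python
import mmap
import os
import tempfile

def block_types_mmap(path):
    """Return the metadata block types of a FLAC file, touching only the headers."""
    with open(path, "rb") as f, \
         mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        if m[:4] != b"fLaC":
            raise ValueError("missing fLaC marker")
        pos, types, last = 4, [], False
        while not last:
            if pos + 4 > len(m):
                raise ValueError("truncated metadata block header")
            head = m[pos:pos + 4]
            last = bool(head[0] & 0x80)        # top bit: last metadata block
            types.append(head[0] & 0x7F)       # low 7 bits: block type
            # Advance past the header and the block body without reading the body.
            pos += 4 + int.from_bytes(head[1:4], "big")
        return types

# Synthetic file for the demonstration: STREAMINFO followed by 8192 bytes of PADDING.
with tempfile.NamedTemporaryFile(suffix=".flac", delete=False) as tmp:
    tmp.write(b"fLaC"
              + b"\x00\x00\x00\x22" + bytes(34)      # type 0, length 34
              + b"\x81\x00\x20\x00" + bytes(8192))   # last block, type 1, length 8192
try:
    types = block_types_mmap(tmp.name)
finally:
    os.remove(tmp.name)
```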


>> The --keep-foreign-metadata feature was added to the command-line application after the FLAC format was finalized. The metadata ends up in an APPLICATION block, which is usually skipped by the FLAC library decoder. These are intended for third-party applications, and thus it's typically impossible to document them. Normally, a third-party software developer would add their own proprietary block to the FLAC file, and all other applications would just skip over it (because all blocks have a universal name and length at the start).
> 
> This information is most useful for me, since at least now I know the name of the block containing the foreign metadata, and I know it is previous to the audio data.

I never noticed that the audio block is last in a FLAC file. I'm used to AIFF and WAV, where chunks can appear in almost any order. This is good news because you'll never seek into the audio data.


> I could manually "read" the first few metadata blocks (following the format specification) and found that there is a seek table whose length is roughly proportional to the size of the audio samples, then a Vorbis comment indicating the version of the FLAC library, and then the Application block which contains the data I'm interested in. This makes its position predictable so I can find it without having to read all the file in search of some key words!

The position is not guaranteed to be predictable. If you design your Scilab algorithm based on predictable positions, then you'll probably end up with issues. It will be more successful to create a block scanner, reading 4 or 8 bytes per block, in order to find the APPLICATION 'riff' block that you want. The algorithm will be very similar to scanning chunks in the WAV file.


>> The only documentation of the APPLICATION block format is probably the source code for the flac command line utility. I did not design this, but I remember suggesting it a few times. Basically, the entire WAV or AIFF contents are in the block, verbatim, except for the chunk that would contain the audio. Since the FLAC data outside the APPLICATION block already contains the audio, that chunk is empty in the APPLICATION block.
> 
> I wonder why there is a long run of zeros (about 8192 zero bytes) in the example I'm attaching, almost as long as the audio residuals. This is a FLAC file created by running flac.exe on a WAV file. I've seen that there is a padding block to allow

FLAC files have padding by default. I always create my FLAC files with the --no-padding option to avoid that. The padding speeds up frequent access to the audio data, but since I use FLAC for archival purposes, it's not really that important to optimize for decoding. Instead, I optimize for space savings.

I'm going to bet that there are fewer than 8192 zero bytes. There should be just enough so that the FLAC STREAMINFO and other blocks, plus the zero bytes, end at exactly offset 8192 in the file. If there are a lot of metadata blocks in the FLAC file, then there will be a lot fewer than 8192 zero bytes. But, as I said, this is not always true because a FLAC file can omit the padding.


>> By the way, one of the challenges of making a completely lossless WAV or AIFF compressor is that there is no predefined order for the various chunks in those files. The audio data chunk can appear before or after various other optional chunks. The solution for FLAC was to have that empty chunk inside the APPLICATION block. For WAV, the audio chunk is named 'data' and for AIFF the audio chunk is named 'SSND'. All other chunks are copied verbatim, but these audio chunks only have a name and size with no further bytes. It's basically a marker. I'm pretty sure that's how it was implemented, but you can check the flac command line source to confirm.
> 
> This could easily be improved (even if for my purpose it would be fatal) if the non-audio chunks were zipped.

I'm going to make another bet that the non-audio data in a WAV or AIFF is almost always less than the 8192-byte padding that's typical for a FLAC file. In other words, the default FLAC file will probably be the same size, whether you use --keep-foreign-metadata or not. Thus, there's really nothing to be gained by compressing the non-audio data.

It would be fatal - not just for you but for compatibility with all FLAC software - to change the existing APPLICATION blocks for 'riff' or 'aiff'. Instead, someone who really wants a little more compression could apply for a different application ID and use that to create a more efficient, but still lossless way to preserve the foreign metadata. This is probably a lost cause, because compression of non-audio data uses different algorithms than compression of audio data, and thus very little of the FLAC compression code could be used. This is a lot of effort for very little return, but there are still application IDs to be had if someone really wants to do this.


> Regards,
> 
> Federico Miyara


