[flac-dev] 2GB limit patch

Brian Willoughby brianw at sounds.wa.com
Wed Mar 6 00:43:17 PST 2013


On Mar 4, 2013, at 14:39, Timothy B. Terriberry wrote:
> Erik de Castro Lopo wrote:
>> Err, that's a link to a post talking about flac's WAV reader being
>> limited to 4Gig files. Problem is, *all* WAV files greater than 4Gig
>> are mal-formed. Due to limitations in the way WAV files are specified,
>> no valid WAV file can ever be over 4Gig.
>
> And most don't work over 2 GB. The solution we (Xiph) have used in
> other projects (opusenc, oggenc) is to treat sizes over ~2 GB as
> meaning "Just keep reading until EOF." See wav_open() audio-in.c in
> opus-tools. It turns out you need to use slightly less than 2 GB for
> the limit to handle the output of some tools. We also provide an
> --ignorelength option to request this behavior regardless of the
> length in the header (which lets it work with tools that just truncate
> the size to 32 bits).


That's an awful "solution" that invites errors into what might
otherwise be valid files.

There is no way for a RIFF/WAVE file to exceed 4 GB, because all chunks
must be enclosed within a single top-level chunk whose size field is
only 32 bits wide.
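To make the arithmetic concrete, here is a minimal sketch of the top-level header layout: a 4-byte 'RIFF' tag, a little-endian unsigned 32-bit size, and the 4-byte 'WAVE' form type. The names RIFF_HEADER, MAX_RIFF_SIZE, MAX_FILE_BYTES and riff_header() are made up for this illustration and are not taken from flac.

```python
import struct

# 'RIFF' tag, little-endian unsigned 32-bit size, 'WAVE' form type.
RIFF_HEADER = struct.Struct("<4sI4s")

# Largest value the 32-bit size field can hold.
MAX_RIFF_SIZE = 2**32 - 1

# The size field does not count the 8-byte 'RIFF' tag + size prefix,
# so the largest file the header can describe is 8 bytes bigger.
MAX_FILE_BYTES = MAX_RIFF_SIZE + 8


def riff_header(body_size):
    """Pack a RIFF/WAVE header (hypothetical helper).

    Raises struct.error when body_size does not fit in 32 bits.
    """
    return RIFF_HEADER.pack(b"RIFF", body_size, b"WAVE")
```

Packing a header with a size of 2**32 or more fails with struct.error, which is the format's way of saying such a file cannot be described.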

I've seen audio software introduce a horrendous glitch in the audio  
by interpreting the ID3 data as part of the audio chunk. Under no  
circumstances should a lossless encoder like FLAC accept a malformed  
audio file and blindly treat potentially random data as audio.

Case in point: I just downloaded a 24-bit WAV that was truncated  
without warning. Thankfully, when I tried to compress it with FLAC, I  
received a warning that the file was too short. That error prompted  
me to download the file a second time, and I got the whole thing on  
the second try. Lossless is useless if you don't even know that the  
original file is missing pieces.

The proper solution is to use POSIX APIs whose file offsets can cover
the full unsigned 32-bit range (in practice, a 64-bit off_t with
fseeko()/ftello()) instead of the older 32-bit signed offsets (which
supposedly still have merit because they allow seeking backwards from
the current position).
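The 2 GB boundary comes from reading the same four bytes as a signed value rather than an unsigned one. The sketch below shows only that interpretation difference for a size field; it does not exercise the POSIX offset calls themselves, and the helper names are invented for the example.

```python
import struct


def size_as_unsigned(field):
    """Interpret 4 little-endian bytes as an unsigned 32-bit size."""
    return struct.unpack("<I", field)[0]


def size_as_signed(field):
    """Interpret the same 4 bytes as a signed 32-bit size."""
    return struct.unpack("<i", field)[0]


# A 3 GiB 'data' chunk size as it would be stored in the header.
field = struct.pack("<I", 3 * 1024**3)

unsigned_size = size_as_unsigned(field)  # 3221225472
signed_size = size_as_signed(field)      # -1073741824
```

Any size of 2 GiB or more turns negative under the signed reading, which is why code built on signed 32-bit sizes or offsets gives up at that point even though the header is valid.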

There should be no valid reason to "just keep reading until EOF" in a
format like WAV that has two redundant chunk sizes defined (the
'RIFF'/'WAVE' chunk encloses all other chunks, and the 'data' chunk
encloses all valid audio samples).
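As a rough sketch of what using both sizes looks like, the function below checks the 'RIFF' size against the real file length, then walks the chunks and returns the 'data' chunk size, refusing any chunk that overruns its enclosing 'RIFF' chunk. This is not flac's actual reader; check_wav_sizes() and MalformedWav are hypothetical names, and a real reader would also validate the 'fmt ' chunk.

```python
import io
import struct


class MalformedWav(Exception):
    """Raised when the header sizes and the file contents disagree."""


def check_wav_sizes(stream):
    """Return the 'data' chunk size after cross-checking both sizes."""
    stream.seek(0, io.SEEK_END)
    actual = stream.tell()
    stream.seek(0)

    header = stream.read(12)
    if len(header) < 12:
        raise MalformedWav("file too short for a RIFF header")
    tag, riff_size, form = struct.unpack("<4sI4s", header)
    if tag != b"RIFF" or form != b"WAVE":
        raise MalformedWav("not a RIFF/WAVE file")

    # The RIFF size excludes the 8-byte tag + size prefix.
    riff_end = riff_size + 8
    if riff_end > actual:
        raise MalformedWav("file is shorter than the RIFF size claims")

    pos = 12
    while pos + 8 <= riff_end:
        stream.seek(pos)
        chunk_id, chunk_size = struct.unpack("<4sI", stream.read(8))
        chunk_end = pos + 8 + chunk_size
        if chunk_end > riff_end:
            raise MalformedWav("chunk overruns the enclosing RIFF chunk")
        if chunk_id == b"data":
            # Only these bytes are audio; later chunks (tags, etc.) are not.
            return chunk_size
        # Chunks are padded to an even length.
        pos = chunk_end + (chunk_size & 1)

    raise MalformedWav("no 'data' chunk found")
```

With a check like this, a truncated download is rejected instead of silently encoded, and a tag chunk that follows the audio is never counted as samples.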

Brian Willoughby
Sound Consulting


