[flac-dev] Commonly getting FLAC__STREAM_ENCODER_VERIFY_MISMATCH_IN_AUDIO_DATA on valid audio

Brian Willoughby brianw at sounds.wa.com
Sat Feb 9 04:28:57 PST 2013


2013/2/8 Erik de Castro Lopo <mle+la at mega-nerd.com>:
> Collin wrote:
>
>> It turns out it was an error on my part; sorry for the trouble. I
>> misunderstood the format libflac was expecting my samples in. I found
>> the wording in the documentation very confusing.
>>
>> My audio file has 2 channels and 16-bit samples, so I have to pull out
>> each individual sample, push it into a FLAC__int32, and fix endian-ness
>> before I can pass it to libflac.
>>
>> The documentation made me think that libflac just wanted me to shove my
>> 2-byte sample into a FLAC__int32. I wasn't aware that it wanted me to
>> extend the sign, as well.
>
> At some stage it might make sense for us to make the FLAC API a little
> more user/programmer friendly.

I seem to recall that the FLAC library requires fixed-point samples
that are right-justified (and sign-extended) in a 32-bit integer,
whereas the standard everywhere else seems to be left-justified.
Left-justified makes more sense to me because it allows 16-bit, 20-bit,
24-bit, 32-bit, and other depths to be converted more easily.
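
To make the layout concrete, here is a minimal sketch in C of the
conversion Collin describes, plus the left-to-right justification step.
It is an illustration under stated assumptions, not libFLAC code: the
helper names are hypothetical, int32_t stands in for FLAC__int32, the
input bytes are assumed to be little-endian (as in a WAV file), and the
left-justified input is assumed to have its unused low bits set to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: turn 16-bit little-endian PCM bytes into
 * right-justified, sign-extended 32-bit samples. int32_t stands in
 * for FLAC__int32 here. */
static void pcm16le_to_int32(const uint8_t *bytes, int32_t *out, size_t nsamples)
{
    for (size_t i = 0; i < nsamples; i++) {
        /* Assemble the two bytes explicitly so the host's own
         * endianness does not matter. */
        uint16_t u = (uint16_t)(bytes[2 * i] | ((uint16_t)bytes[2 * i + 1] << 8));

        /* Sign-extend: 0x8000..0xFFFF must become -32768..-1,
         * not 32768..65535. */
        out[i] = (u & 0x8000u) ? (int32_t)u - 65536 : (int32_t)u;
    }
}

/* Hypothetical helper: convert a left-justified 32-bit sample to a
 * right-justified one at the given bit depth (assumes 2 <= bits <= 32
 * and zeroed low bits, so the division is exact). Division is used
 * instead of >> because right-shifting a negative value is
 * implementation-defined in C. */
static int32_t left_to_right_justified(int32_t sample, unsigned bits)
{
    return sample / (int32_t)(1u << (32u - bits));
}
```

Interleaving the channels and handing the buffer to the encoder is left
out on purpose; the point is only the sign extension and justification.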

It's been a while since I needed to know this level of detail. I wrote
my own ObjC library on top of the C FLAC library because I didn't want
to use the C++ FLAC library, and once I solved the justification of the
bits I never had to think about it again.

However, for new programs being written to use the FLAC libraries, I  
totally agree that there should be a friendlier interface (assuming I  
haven't remembered all of this wrong).

Note that the old interface should be maintained for compatibility
with existing software. The new, friendlier interface can simply use
a slightly different (member) function name to access the same
objects. There's no reason both can't be supported, or even a float
interface as well.


> Something to match libsndfile's ease of
> use.

I do not use libsndfile, so I wouldn't make this comparison.


On Feb 9, 2013, at 03:13, Johnny Rosenberg wrote:
> That would be great!
> I was actually starting my small "project" with libsndfile, but it was
> done in a way that I didn't agree with, and since I only work with
> FLAC files (mostly 24 bits) I decided to try to go with libflac
> instead, which seems to let me do things the way I want more than
> libsndfile.
>
> For example, when loading an audiofile with libsndfile, all the
> samples are converted to floating point numbers, which is what I want
> anyway, but the conversion is done by dividing the integers from the
> file with pow(2,bps-1), but when converting back, they multiply with
> pow(2,bps-1)-1, so if you just read and then write, you end up with a
> slightly lower volume (not that you can hear any difference if you do
> it only a few times, but still, it isn't right, AND it is very easy to
> do it right, so why don't they?)…
>
>
> Johnny Rosenberg, the beginner


For a beginner, you sure have the math right. Lossless conversion
should always be handled via a factor with only a single significant
binary digit, i.e. a power of two; otherwise you're actually performing
a sum. Multiplying by a power of two only moves the bits (or changes
the floating-point exponent), so as far as the sample's significant
bits are concerned it is equivalent to multiplying by 1. pow(2, N) is a
valid conversion factor. pow(2, N)-1 is not. Basically, converting a
16-bit sample using pow(2, 15)-1 is the same as summing 15 lossless
conversions (pow(2, 14) + pow(2, 13) + ... + pow(2, 0)), a total that
happens to be one short of the proper lossless factor pow(2, 15).
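
The effect is easy to check by brute force over every 16-bit value.
The sketch below only illustrates the arithmetic; it is not
libsndfile's code, the function names are hypothetical, and it assumes
a truncating cast on the way back to integer. Rounding instead of
truncating would hide the error for small samples, but not for samples
near full scale.

```c
#include <stdint.h>

/* Integer to float: divide by 2^15. Exact, since every 16-bit value
 * fits in a float's 24-bit significand. */
static float int16_to_float(int16_t s)
{
    return (float)s / 32768.0f;
}

/* Float to integer with the matching factor 2^15. */
static int16_t float_to_int16_symmetric(float f)
{
    return (int16_t)(f * 32768.0f);
}

/* Float to integer with the mismatched factor 2^15 - 1. */
static int16_t float_to_int16_asymmetric(float f)
{
    return (int16_t)(f * 32767.0f);
}

/* Count how many of the 65536 possible samples fail to survive a
 * read-then-write round trip with the given write-side conversion. */
static int count_roundtrip_changes(int16_t (*back)(float))
{
    int changed = 0;
    for (int32_t s = -32768; s <= 32767; s++) {
        if (back(int16_to_float((int16_t)s)) != s)
            changed++;
    }
    return changed;
}
```

With the symmetric pair no sample changes. With the asymmetric pair
and truncation, every non-zero sample moves one step toward zero.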

The 'why' is that libsndfile hopes to solve a problem that doesn't
always exist. Namely, that scientific synthesis of data in floating
point often ranges from +1 to -1, but the fixed-point output format
of most sound files does not allow +1 (even though it does allow -1).
This is due to the nature of two's-complement fixed-point numbers. I
have no idea why libsndfile handles integer input to integer output
in this lossy way, but that was a choice made by libsndfile that
seems unlikely to be corrected now. It would have been much simpler
to require floating-point data synthesis to apply the pow(2, N)-1
scaling only as needed, rather than apply it to all audio files
(most of which come from fixed-point A/D converters that don't need
any scaling). In other words, libsndfile is really only correctly
applicable to a very esoteric set of data which does not include most
audio.

Regardless of libsndfile's architecture and design, libFLAC should by  
no means duplicate the incorrect conversion. There especially should  
not be an asymmetrical set of conversion factors.

All of Apple's CoreAudio tools use float without lossy conversion.  
The only requirement there is that synthesized data should avoid +1.0  
sample values unless clipping is acceptable or scaling is added  
before conversion.
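
As a rough sketch of that approach (my own illustration with a
hypothetical function name, not CoreAudio's actual code): scale by the
power of two and clip only the values that cannot be represented, so
that data which came from integers in the first place passes through
untouched.

```c
#include <stdint.h>

/* Hypothetical helper: float to 16-bit using the lossless factor 2^15.
 * Only out-of-range values (such as exactly +1.0) are clipped.
 * NaN input is not handled in this sketch. */
static int16_t float_to_int16_clipped(float f)
{
    float scaled = f * 32768.0f;

    if (scaled > 32767.0f)
        return 32767;      /* +1.0 and above clip to full scale */
    if (scaled < -32768.0f)
        return -32768;     /* below -1.0 clips as well */
    return (int16_t)scaled;
}
```

Synthesized data that really does reach +1.0 can be scaled down by the
caller before conversion, which keeps the lossy step out of the path
that ordinary audio takes.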

Brian Willoughby
Sound Consulting


