[Flac] technical questions about FLAC

Thu Apr 16 05:53:26 PDT 2009

On Apr 14, 2009, at 02:05, Casey wrote:
> I have a few questions about FLAC that aren't answered in the FAQ  
> or the documentation.  Maybe someone knows about these details.

I can't answer all of these without guessing on some...

> 1. What exactly is meant by "reference encoder"?

Many codecs including FLAC allow for multiple encodings of the same  
source data, but decoders must always produce the right results.  For  
FLAC, reference encoder means that even though there might be many  
valid encodings of a particular source file, the reference encoder  
produces ones that are guaranteed to be correct.  In other words, if  
your decoder fails on data that comes from the reference encoder,  
then it means your decoder is broken.  However, there are now other  
encoders besides the FLAC reference encoder, and if you can't decode  
those, then it's not clear which one is wrong.  I'm not sure that I  
answered your question, though.

> 2. Concerning interchannel decorrelation, what is the point of  
> creating separate signals, why not just keep the channels you  
> have?  For example, you might create "mid" and "side" channels, but  
> you still end up with 2 so what's the point?  The only reason I can  
> think of is maybe a couple of bits are saved by storing the  
> "side" (difference) value.

FLAC is different from MP3.  MP3 loses information, so it's able to
squeeze all different kinds of signals into the same space.  So, for
MP3 at a given bit rate, 2 channels of any kind of audio will end up
the same size.  FLAC, however, must retain all of the original
information, and therefore each different signal ends up with a
different size.  Quieter pieces are smaller, louder pieces are larger.
Frequency content also affects size.  One thing about mid/side is that
mono material would produce silence on the side channel, and thus the
side channel would compress a lot more than either left or right.
Even a stereo signal that's mostly mono would benefit, because the
side channel would be much quieter than the mid, left, or right.
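
Here is a rough sketch in Python of what I mean.  It assumes integer
samples and the usual convention of mid = (left + right) >> 1 and
side = left - right; the function names are made up for illustration
and are not taken from the FLAC sources.

```python
def to_mid_side(left, right):
    """Turn left/right integer samples into mid/side samples."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Recover left/right exactly from mid/side."""
    left, right = [], []
    for m, s in zip(mid, side):
        # (l + r) and (l - r) always have the same parity, so the low
        # bit of side restores the bit that the >> 1 threw away.
        total = (m << 1) | (s & 1)
        left.append((total + s) >> 1)
        right.append((total - s) >> 1)
    return left, right


# Pure mono: both channels identical, so the side channel is silence.
mono = [1000, -2000, 3000, -4000]
mid, side = to_mid_side(mono, mono)
print(side)

# Mostly mono: the side channel is far quieter than left or right.
left = [1000, -2000, 3000, -4000]
right = [1003, -1998, 2999, -4001]
mid, side = to_mid_side(left, right)
print(side)
```

You still have 2 channels, but one of them carries very little, and
nothing is lost: from_mid_side() gives back the original left and
right samples.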

> 3. Is left-side (or right-side) just the left/right channel paired  
> with the difference between left and right?

Yes, and the reason this is helpful is because of the variable size  
output I mentioned above.  If left compresses smaller than right,  
then left-side should be smaller than right-side.  Also, if the left  
or right channel compresses smaller than the mid channel, then you  
wouldn't want mid-side.  It's difficult to predict, so the FLAC
encoder tries it every way, just to see which comes out smaller.
Since the left-right, mid-side, left-side, and right-side options can
be selected independently for each block, the whole file can end up
much smaller by picking the best channel assignment for each block as
the audio changes.
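
To make the "try it every way" idea concrete, here is a small Python
sketch.  The rough_cost() function is a crude stand-in that I made up
for illustration (it just counts the bits needed for each sample's
magnitude); a real encoder would compare the actual compressed size of
each channel, so treat this only as an outline of the selection step.

```python
def rough_cost(samples):
    """Hypothetical size estimate: magnitude bits plus a sign bit."""
    return sum(abs(s).bit_length() + 1 for s in samples)


def choose_channel_mode(left, right):
    """Pick the cheapest of the four stereo layouts for one block."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]

    cost_l = rough_cost(left)
    cost_r = rough_cost(right)
    cost_m = rough_cost(mid)
    cost_s = rough_cost(side)

    # Each layout stores exactly two channels; add up the pair.
    candidates = {
        "left-right": cost_l + cost_r,
        "left-side": cost_l + cost_s,
        "right-side": cost_r + cost_s,
        "mid-side": cost_m + cost_s,
    }
    return min(candidates, key=candidates.get)


# The choice is made block by block, so it can change as the audio does.
blocks = [
    ([1000, -2000, 3000, -4000], [1000, -2000, 3000, -4000]),
    ([1000, -2000, 3000, -4000], [-1000, 2000, -3000, 4000]),
]
for left, right in blocks:
    print(choose_channel_mode(left, right))
```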

> 4. Why was 655,350 Hz chosen to be the maximum supported sample  
> rate?  I noticed it's not a power of 2.  Also, this is an  
> unimaginably fast frequency, doesn't it exceed the limit for most  
> channels?

The 20-bit frequency in the stream header should allow for up to
1,048,575 Hz, but the documentation points out that there are
limitations in the block headers.  The block header has a 4-bit code
to select among a few common frequencies, then has breakout codes for
8-bit kHz, 16-bit Hz, or 16-bit daHz (da- for deca-, i.e. tens of
Hz).  In other words, frequency can be specified in ones, tens, or
thousands.  When specified in thousands, you only get 8 bits, so the
maximum frequency with that code is 255*1000 Hz or 255 kHz.  When
specified in tens, you can reach 65,535*10 Hz or 655,350 Hz.  The
reason this limit is not a power of 2 is that the specification
allowed for human factors such as tens and thousands.  You'll note
that 655,350 is the largest 16-bit value, 2^16 - 1, multiplied by 10.
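
The arithmetic is easy to check.  This Python sketch just works out
the largest value each field can hold; the dictionary keys and the
fits() helper are my own labels for illustration, not names from the
format documentation.

```python
MAX_8BIT = 2**8 - 1     # 255
MAX_16BIT = 2**16 - 1   # 65,535
MAX_20BIT = 2**20 - 1   # 1,048,575

# Unit (in Hz) and largest raw value for each breakout code.
BREAKOUT_CODES = {
    "8-bit kHz": (1000, MAX_8BIT),
    "16-bit Hz": (1, MAX_16BIT),
    "16-bit tens of Hz": (10, MAX_16BIT),
}


def max_rate_hz(code):
    """Highest sample rate a breakout code can express."""
    unit, max_value = BREAKOUT_CODES[code]
    return unit * max_value


def fits(rate_hz):
    """List the breakout codes that can hold rate_hz exactly."""
    return [
        code
        for code, (unit, max_value) in BREAKOUT_CODES.items()
        if rate_hz % unit == 0 and rate_hz // unit <= max_value
    ]


for code in BREAKOUT_CODES:
    print(code, max_rate_hz(code))
print("20-bit stream header", MAX_20BIT)
```

With these limits, fits(655350) returns only the tens-of-Hz code, and
fits(655351) returns an empty list even though the value would still
fit in the 20-bit stream header field.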

Brian Willoughby
Sound Consulting