[flac-dev] Questions about FLAC documentation

Sat Oct 6 06:33:49 PDT 2012

On 06.10.2012 11:01, Gravis wrote:

> In "FRAME_HEADER" there is a variable-size field with the
> description "if(variable blocksize) \n <8-56> : 'UTF-8' coded sample
> number (decoded number is 36 bits)". I find the encoding scheme
> somewhat alien (I can't figure out what it has to do with UTF-8) and
> the two fields that follow it incomprehensible.  There doesn't seem
> to be any information indicating their purpose either.

Well, as the documentation states, it uses the same method as in UTF-8 
to store variable length integers:

- read one byte B0 from the stream
- if B0 = 0xxxxxxx then the read value is B0 -> end
- if B0 = 10xxxxxx, the encoding is invalid
- if B0 = 11xxxxxx, set L to the number of leading binary 1s minus 1:
     B0 = 110xxxxx -> L = 1
     B0 = 1110xxxx -> L = 2
     B0 = 11110xxx -> L = 3
     B0 = 111110xx -> L = 4
     B0 = 1111110x -> L = 5
     B0 = 11111110 -> L = 6
     (B0 = 11111111 is invalid)
- assign the bits following the length prefix (the x bits in the 
examples) to a variable R that is at least 36 bits wide
- loop from 1 to L
     - left shift R 6 bits
     - read B from the stream
     - if B does not match 10xxxxxx, the encoding is invalid
     - set R = R or <the lower 6 bits from B>
- the read value is R
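
The steps above can be sketched in Python roughly as follows. This is 
only an illustration, not code taken from the reference implementation 
or from J-Ogg; the function name read_utf8_coded_number and the use of 
a file-like byte stream are assumptions made for the example.

```python
import io


def read_utf8_coded_number(stream):
    """Decode one 'UTF-8' coded number from a binary stream (hypothetical helper)."""
    first = stream.read(1)
    if not first:
        raise EOFError("stream ended before the coded number")
    b0 = first[0]

    if b0 < 0x80:
        # 0xxxxxxx: the byte itself is the value
        return b0
    if b0 < 0xC0:
        # 10xxxxxx: a continuation byte cannot start a number
        raise ValueError("invalid first byte 0x%02X" % b0)
    if b0 == 0xFF:
        # 11111111: no length is defined for this prefix
        raise ValueError("invalid first byte 0xFF")

    # Count the leading 1 bits; L is that count minus 1.
    ones = 0
    mask = 0x80
    while b0 & mask:
        ones += 1
        mask >>= 1
    extra_bytes = ones - 1

    # The x bits are everything below the terminating 0 bit.
    result = b0 & (mask - 1)

    for _ in range(extra_bytes):
        nxt = stream.read(1)
        if not nxt:
            raise EOFError("stream ended inside the coded number")
        b = nxt[0]
        if b & 0xC0 != 0x80:
            raise ValueError("invalid continuation byte 0x%02X" % b)
        result = (result << 6) | (b & 0x3F)

    return result
```

For example, the two bytes C2 A9 decode to 169, and FE followed by six 
BF bytes decodes to the largest 36-bit value. Python integers have no 
fixed width, so the "at least 36 bits" requirement only matters when 
porting this to a language with fixed-size integers.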

The following two fields depend on the block size index and the sample 
rate index read earlier in the header:

- If blocksize index = 6, read 8 bits from the stream. The true block 
size is the read value + 1.
- If blocksize index = 7, read 16 bits from the stream. The true block 
size is the read value + 1.

- If sample rate index = 12, read 8 bits from the stream. The true 
sample rate in Hz is the read value * 1000.
- If sample rate index = 13, read 16 bits from the stream. The true 
sample rate in Hz is the read value.
- If sample rate index = 14, read 16 bits from the stream. The true 
sample rate in Hz is the read value * 10.
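
As a rough sketch, the two optional fields could be read like this in 
Python. The function name read_trailing_fields and the convention of 
returning None when an index does not select a trailing field are 
assumptions for the example; the 16-bit values are read big-endian, as 
FLAC stores its integers, and the block size field is read before the 
sample rate field.

```python
import io


def read_trailing_fields(stream, blocksize_index, samplerate_index):
    """Return (block_size, sample_rate) read from the end of the frame header.

    Each element is None when its index does not select a trailing field,
    i.e. when the value is already determined by the index itself.
    """
    def read_uint(nbytes):
        data = stream.read(nbytes)
        if len(data) != nbytes:
            raise EOFError("stream ended inside the frame header")
        return int.from_bytes(data, "big")  # big-endian unsigned integer

    block_size = None
    if blocksize_index == 6:
        block_size = read_uint(1) + 1       # 8 bits, stored as size - 1
    elif blocksize_index == 7:
        block_size = read_uint(2) + 1       # 16 bits, stored as size - 1

    sample_rate = None
    if samplerate_index == 12:
        sample_rate = read_uint(1) * 1000   # 8 bits, value in kHz
    elif samplerate_index == 13:
        sample_rate = read_uint(2)          # 16 bits, value in Hz
    elif samplerate_index == 14:
        sample_rate = read_uint(2) * 10     # 16 bits, value in tens of Hz

    return block_size, sample_rate
```

With blocksize index 7 and sample rate index 13, the four bytes 
0F FF AC 44 give a block size of 4096 and a sample rate of 44100 Hz.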

> The location and coding of audio samples is very nebulous in that I
> don't know where they are or the specifics of how any of their
> encoding scheme work.  More details, links to more information and
> maybe even some pseudo code would be very helpful.  The documentation
> about the metadata is great...  but it kinda goes downhill after that.
>   It would be fantastic if someone could update the documentation with
> more information and details.

It was almost 10 years ago, so I am not 100% sure, but I think I 
implemented the FLAC decoder in the J-Ogg project based only on the 
textual specification. It may of course be that I checked the reference 
implementation for details, but I don't think so. What is your exact 
problem? The specification may not be perfect, but it should be 
understandable. Perhaps it may help you to consult the Java source from 
J-Ogg: http://www.j-ogg.de/j-ogg.zip. It should be more readable than 
the reference implementation. The actual PCM decoding is implemented in 
the Subframe class in the FLAC directory.

Tor