[Flac-dev] Large compression test

Svante Eriksson ser at as9-6-1.mt.g.bonet.se
Sat Aug 4 01:38:17 PDT 2001

Interesting figures.

Some corresponding figures for flac 1.0 for a set of 404
CDs, with a total of 4357 tracks are:

Opt  Uncompressed    Compressed    Ratio  Encode Time
---  ------------   -----------   ------  -----------
-8   234507744748  127468328349   0.5436  n/a

Admittedly, this set includes at least one country album,
and has a fair share of jazz in it.  The worst-case track in
this set has a ratio of 0.9442, due to some heavy use of

Compiling these statistics, I noticed that on two of the
tracks, there was a problem with the MD5 and min/max
framesize being all zeroes, and the seek-table being
corrupt.  Recompressing caused the symptom to disappear.
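
A quick way to scan a collection for this symptom is to read the
STREAMINFO block at the start of each file and test for zeroed fields.
The sketch below assumes the STREAMINFO layout from the published FLAC
format specification (34 bytes, with min/max frame size at byte offsets
4 and 7 and the MD5 signature in the last 16 bytes). The function names
are made up for illustration and are not part of flac or libFLAC.

```python
def parse_streaminfo(data):
    """Return the frame-size and MD5 fields from the leading STREAMINFO block.

    `data` is the first 42 bytes (or more) of a FLAC file. The layout is
    assumed from the FLAC format specification.
    """
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    # Metadata block header: 1 bit last-block flag, 7 bits type, 24 bits length.
    block_type = data[4] & 0x7F
    length = int.from_bytes(data[5:8], "big")
    if block_type != 0 or length < 34:
        raise ValueError("first metadata block is not STREAMINFO")
    body = data[8:8 + 34]
    if len(body) < 34:
        raise ValueError("truncated STREAMINFO block")
    return {
        "min_framesize": int.from_bytes(body[4:7], "big"),
        "max_framesize": int.from_bytes(body[7:10], "big"),
        "md5": body[18:34],
    }


def has_zeroed_streaminfo(info):
    """True when MD5 and min/max frame size are all zero (the symptom above)."""
    return (
        info["min_framesize"] == 0
        and info["max_framesize"] == 0
        and info["md5"] == bytes(16)
    )


def check_file(path):
    """Read the head of a file and report whether it shows the symptom."""
    with open(path, "rb") as f:
        return has_zeroed_streaminfo(parse_streaminfo(f.read(42)))
```

Calling check_file() on every .flac file in a tree gives the list of
tracks to recompress. It does not examine the seek table.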

I have seen similar behaviour earlier when recompressing
some tracks using flac 0.10, but those symptoms disappeared
after recompression with 1.0.

Your TOC-hash algorithm would be interesting to examine, as
I'm also intending to move the metadata into a database from
a set of plain files.


"JC" == Josh Coalson <xflac at yahoo.com> writes:

JC> A large test I ran on flac 1.0 recently finished so I thought
JC> I'd post the results.  I took about 60 CDs, totalling around
JC> 30 gigs uncompressed, and compressed them all using all 10 of
JC> flac's default compression modes (-0 through -9).  The CDs are
JC> of a wide variety of music; I think the only major genres not
JC> represented are country and rap (freudian slip).  Anyway, the
JC> raw numbers:

JC> Opt    Compressed  Uncompressed    Ratio  Encode Time
JC> ---    ----------  ------------   ------  -----------
JC> -0    18705533061   30308309960   0.6171      3:49:31
JC> -1    18296233794   30308309960   0.6036      3:56:11
JC> -2    18213733462   30308309960   0.6009      4:15:55
JC> -3    17947006657   30308309960   0.5921      4:29:18
JC> -4    17492915245   30308309960   0.5771      4:58:26
JC> -5    17447297373   30308309960   0.5756      5:37:22
JC> -6    17435250044   30308309960   0.5752      5:48:08
JC> -7    17414666580   30308309960   0.5745     10:58:38
JC> -8    17385832738   30308309960   0.5736     13:43:05
JC> -9    17350388508   30308309960   0.5724    282:30:39

JC> Yes, that last one is almost 12 days.  You probably noticed
JC> that the compression difference is about 0.1% from -8,
JC> which is why I say -9 is more theoretical than practical.
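
The diminishing returns can be made explicit by computing, for each
step up in mode, how many bytes it saves and how much extra encode time
it costs. The sketch below uses the figures from the table, taking the
smaller byte count in each row as the compressed size (it is the one
that reproduces the Ratio column against the constant 30308309960).
The helper names are illustrative only.

```python
UNCOMPRESSED = 30308309960  # total input size in bytes, same for every mode

# (option, compressed bytes, encode time as h:mm:ss), from the table above
RESULTS = [
    ("-0", 18705533061, "3:49:31"),
    ("-1", 18296233794, "3:56:11"),
    ("-2", 18213733462, "4:15:55"),
    ("-3", 17947006657, "4:29:18"),
    ("-4", 17492915245, "4:58:26"),
    ("-5", 17447297373, "5:37:22"),
    ("-6", 17435250044, "5:48:08"),
    ("-7", 17414666580, "10:58:38"),
    ("-8", 17385832738, "13:43:05"),
    ("-9", 17350388508, "282:30:39"),
]


def to_seconds(hms):
    """Convert an h:mm:ss string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def marginal_costs(results):
    """For each step to the next mode: (option, bytes saved, extra seconds)."""
    steps = []
    for (_, prev_size, prev_time), (opt, size, time) in zip(results, results[1:]):
        steps.append((opt, prev_size - size, to_seconds(time) - to_seconds(prev_time)))
    return steps


for opt, saved, extra in marginal_costs(RESULTS):
    print(f"{opt}: saves {saved / 1e6:7.1f} MB for {extra / 3600:7.2f} extra hours")
```

Going from -3 to -4 saves roughly ten times as many bytes as going
from -4 to -5, for less extra time.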

JC> The test ran on a PIII-600 and I used -V for all the tests,
JC> so the runtime without -V would be a tiny bit less for
JC> -7 -8 -9 and noticeably faster for the lower modes.  And
JC> I'm glad to say that there were no verify errors and the
JC> decoded WAVs compared exactly to the originals every time.

JC> The ratios ranged from 0.20 for some jazz tracks
JC> (quiet Ella Fitzgerald stuff) to 0.78.  The hardest
JC> stuff to encode was consistently by the band Dream
JC> Theater (the ultimate in progressive rock), even harder
JC> than death metal like Cannibal Corpse.  Classical, jazz,
JC> chant were almost always below 0.5.  Rock, techno, world
JC> music usually fell in the range 0.5-0.7.  A ratio of
JC> 0.2 like with some of the jazz and classical tracks means
JC> a bitrate of under 300kbps which is not bad for lossless.
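
The bitrate figure follows from the fixed rate of CD audio: 44100
samples per second, 16 bits per sample, 2 channels, which is
1411.2 kbps uncompressed. A small sketch of the conversion (the
function name is illustrative):

```python
CD_SAMPLE_RATE = 44100   # samples per second
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def cd_bitrate_kbps(ratio):
    """Average bitrate in kbps of CD audio compressed to the given ratio."""
    pcm_kbps = CD_SAMPLE_RATE * CD_BITS_PER_SAMPLE * CD_CHANNELS / 1000
    return pcm_kbps * ratio


for ratio in (0.20, 0.5736, 0.78):
    print(f"ratio {ratio:.4f} -> {cd_bitrate_kbps(ratio):.0f} kbps")
```

A ratio of 0.20 works out to about 282 kbps, and the hardest tracks
at 0.78 to about 1100 kbps.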

JC> The other interesting thing is that the sweet spot seems
JC> to be nearer -4 than -5.

JC> Josh

JC> P.S. My next project is to rip and encode all my CDs and
JC> store the CD metadata in a database.  I've got a nice
JC> schema worked out and a better hash than CDindex for
JC> creating a primary key from the CD TOC.  If there's
JC> interest I can publish the code for the little TOC
JC> reader + key generator (UNIX only, gotta love ioctl).
