[Flac-dev] Large compression test
Svante Eriksson
ser at as9-6-1.mt.g.bonet.se
Sat Aug 4 01:38:17 PDT 2001
Interesting figures.
Here are the corresponding figures for flac 1.0 on a set of
404 CDs, with a total of 4357 tracks:
Opt  Uncompressed  Compressed    Ratio   Encode Time
---  ------------  ------------  ------  -----------
-8   234507744748  127468328349  0.5436  n/a
Admittedly, this set includes at least one country album,
and has a fair share of jazz in it. The worst-case track in
this set has a ratio of 0.9442, due to some heavy use of
distortion.
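The ratio column is simply compressed bytes divided by
uncompressed bytes. A minimal Python sketch that recomputes
the -8 figure above from the two byte counts (the function
names are my own for illustration, not part of flac):

```python
def compression_ratio(uncompressed_bytes: int, compressed_bytes: int) -> float:
    """Compressed size as a fraction of the original size (lower is better)."""
    if uncompressed_bytes <= 0:
        raise ValueError("uncompressed size must be positive")
    return compressed_bytes / uncompressed_bytes


def space_saved(ratio: float) -> float:
    """Fraction of the original size that compression removed."""
    return 1.0 - ratio


# Byte counts from the -8 row for the 404-CD set.
uncompressed = 234507744748
compressed = 127468328349

ratio = compression_ratio(uncompressed, compressed)
print(f"ratio {ratio:.4f}, space saved {space_saved(ratio):.1%}")
```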
While compiling these statistics, I noticed that on two of
the tracks the MD5 and the min/max framesize were all
zeroes, and the seek table was corrupt. Recompressing made
the symptoms disappear.
I have seen similar behaviour earlier when recompressing
some tracks using flac 0.10, but those symptoms disappeared
after recompression with 1.0.
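For anyone wanting to scan a collection for the same
symptom, here is a rough Python sketch. It assumes the
STREAMINFO layout from the published FLAC format
description (a "fLaC" marker, a 4-byte metadata block
header, then a 34-byte STREAMINFO body with the frame sizes
at byte offsets 4-9 and the MD5 in the last 16 bytes). The
function names are hypothetical, and it does not check the
seek table:

```python
def read_streaminfo(data: bytes) -> dict:
    """Parse frame sizes and MD5 from the start of a FLAC stream."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    if len(data) < 8 + 34:
        raise ValueError("too short to hold a STREAMINFO block")
    block_type = data[4] & 0x7F               # low 7 bits: block type
    length = int.from_bytes(data[5:8], "big")  # 24-bit block length
    if block_type != 0 or length < 34:
        raise ValueError("first metadata block is not STREAMINFO")
    body = data[8:8 + 34]
    return {
        "min_framesize": int.from_bytes(body[4:7], "big"),
        "max_framesize": int.from_bytes(body[7:10], "big"),
        "md5": body[18:34],
    }


def has_zeroed_streaminfo(info: dict) -> bool:
    """True if the MD5 and both frame sizes are all zeroes."""
    return (
        info["min_framesize"] == 0
        and info["max_framesize"] == 0
        and info["md5"] == bytes(16)
    )


def check_file(path: str) -> bool:
    """Read the first 42 bytes of a file and test for the symptom."""
    with open(path, "rb") as f:
        return has_zeroed_streaminfo(read_streaminfo(f.read(42)))
```

Calling check_file() on each .flac file would flag tracks
worth recompressing. Only the first 42 bytes of each file
are read, so a scan over a few thousand tracks is cheap.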
Your TOC-hash algorithm would be interesting to examine, as
I'm also intending to move the metadata into a database from
a set of plain files.
/Svante
"JC" == Josh Coalson <xflac at yahoo.com> writes:
JC> A large test I ran on flac 1.0 recently finished so I thought
JC> I'd post the results. I took about 60 CDs, totalling around
JC> 30 gigs uncompressed, and compressed them all using all 10 of
JC> flac's default compression modes (-0 through -9). The CDs are
JC> of a wide variety of music; I think the only major genres not
JC> represented are country and rap (Freudian slip). Anyway, the
JC> raw numbers:
JC> Opt  Uncompressed  Compressed   Ratio   Encode Time
JC> ---  ------------  -----------  ------  -----------
JC> -0   30308309960   18705533061  0.6171      3:49:31
JC> -1   30308309960   18296233794  0.6036      3:56:11
JC> -2   30308309960   18213733462  0.6009      4:15:55
JC> -3   30308309960   17947006657  0.5921      4:29:18
JC> -4   30308309960   17492915245  0.5771      4:58:26
JC> -5   30308309960   17447297373  0.5756      5:37:22
JC> -6   30308309960   17435250044  0.5752      5:48:08
JC> -7   30308309960   17414666580  0.5745     10:58:38
JC> -8   30308309960   17385832738  0.5736     13:43:05
JC> -9   30308309960   17350388508  0.5724    282:30:39
JC> Yes, that last one is almost 12 days. You probably noticed
JC> that the compression difference is about 0.1% from -8,
JC> which is why I say -9 is more theoretical than practical.
JC> The test ran on a PIII-600 and I used -V for all the tests,
JC> so the runtime without -V would be a tiny bit shorter for
JC> -7, -8 and -9, and noticeably shorter for the lower modes. And
JC> I'm glad to say that there were no verify errors and the
JC> decoded WAVs compared exactly to the originals every time.
JC> The ratios ranged from 0.20 for some jazz tracks
JC> (quiet Ella Fitzgerald stuff) to 0.78. The hardest
JC> stuff to encode was consistently by the band Dream
JC> Theater (the ultimate in progressive rock), even harder
JC> than death metal like Cannibal Corpse. Classical, jazz,
JC> chant were almost always below 0.5. Rock, techno, world
JC> music usually fell in the range 0.5-0.7. A ratio of
JC> 0.2, as with some of the jazz and classical tracks, means
JC> a bitrate of under 300 kbps, which is not bad for lossless.
JC> The other interesting thing is that the sweet spot seems
JC> to be nearer -4 than -5.
JC> Josh
JC> P.S. My next project is to rip and encode all my CDs and
JC> store the CD metadata in a database. I've got a nice
JC> schema worked out and a better hash than CDindex for
JC> creating a primary key from the CD TOC. If there's
JC> interest I can publish the code for the little TOC
JC> reader + key generator (UNIX only, gotta love ioctl).
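Two bits of arithmetic in the quoted message are easy to
check: the "under 300 kbps" figure for a 0.2 ratio, and the
"almost 12 days" for -9. A small Python sketch, assuming
standard CD audio (44.1 kHz, 16-bit, stereo) and 1 kbps =
1000 bits per second; the helper names are my own:

```python
CD_SAMPLE_RATE_HZ = 44100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def cd_bitrate_kbps(ratio: float = 1.0) -> float:
    """Average bitrate of CD audio compressed to the given ratio."""
    raw_bps = CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS
    return raw_bps * ratio / 1000


def hms_to_days(hms: str) -> float:
    """Convert an 'H:MM:SS' encode time from the table into days."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return (hours * 3600 + minutes * 60 + seconds) / 86400


# Best, overall -8, and worst ratios reported in the quoted test.
for ratio in (0.20, 0.5736, 0.78):
    print(f"ratio {ratio:.4f} -> {cd_bitrate_kbps(ratio):.0f} kbps")

print(f"-9 encode time: {hms_to_days('282:30:39'):.2f} days")
```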