Size

The tools were measured on three criteria: file size, compression time, decompression time.
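
To make the three criteria concrete, here is a minimal sketch of how they could be measured. It is not the script used for these tests: it uses only the codecs in Python's standard library (gzip, bz2, lzma) rather than the archivers compared here, and the names `measure`, `run_benchmark` and `CODECS` are invented for illustration. The ratio is the archive size as a percentage of the original, the same form as the figures on this page.

```python
import bz2
import gzip
import lzma
import time


def measure(name, compress, decompress, data):
    """Return (name, size ratio in percent, compress seconds, decompress seconds)."""
    start = time.perf_counter()
    packed = compress(data)
    compress_time = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_time = time.perf_counter() - start

    # Sanity check: the round trip must give back the original bytes.
    assert restored == data

    # Compressed size as a percentage of the original; lower is better.
    ratio = 100.0 * len(packed) / len(data)
    return name, ratio, compress_time, decompress_time


# Stand-ins for the real archivers (assumption: stdlib codecs only).
CODECS = [
    ("gzip", gzip.compress, gzip.decompress),
    ("bz2", bz2.compress, bz2.decompress),
    ("lzma", lzma.compress, lzma.decompress),
]


def run_benchmark(data):
    """Measure every codec in CODECS on the same input bytes."""
    return [measure(name, c, d, data) for name, c, d in CODECS]


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog\n" * 2000
    for name, ratio, ct, dt in run_benchmark(sample):
        print(f"{name:5s} {ratio:5.1f}%  compress {ct:.4f}s  decompress {dt:.4f}s")
```

Timings from a single run like this are noisy, so a real comparison would repeat each measurement several times.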

/bin/*

This test compresses the contents of my /bin directory: 97 executables totalling 9 MB of data. As with all the other tests, the figures give the archive size as a percentage of the original, so lower is better.
7z 26.0%
ace 31.5%
arj 48.7%
jar 48.8%
lha 48.2%
rar 43.3%
sit 45.9%
tar.bz2 41.3%
tar.gz 47.8%
tar.lzo 51.0%
tar.Z 63.8%
zip 48.6%
zoo 63.9%
The big winners in this test are 7z and ace. At 26% the 7z archive is roughly half the size of most of the other archives. Ace also compresses well, but there is a big gap before third place. Bz2 shows here that it can beat rar, but both are bunched together with the weaker compressors.

bitmap

The bitmap test file is a lovely picture of Will, converted to a 24-bit BMP. The file is 1.9 MB and should compress very nicely.
7z 12.2%
ace 14.8%
arj 17.0%
jar 17.4%
lha 15.7%
rar 11.7%
sit 10.3%
tar.bz2 10.2%
tar.gz 16.7%
tar.lzo 21.6%
tar.Z 24.9%
zip 16.7%
zoo 30.3%
Sit and tar.bz2 are neck and neck here, while ace and 7z are left behind this time.

html

The html test is just three HTML files (the Slashdot front page, a Google results page for "compression", and my homepage). It tests how the tools deal with small amounts of data (in this case 148 KB).
7z 19.5%
ace 20.4%
arj 21.9%
jar 21.9%
lha 21.6%
rar 19.1%
sit 20.4%
tar.bz2 19.9%
tar.gz 21.3%
tar.lzo 24.5%
tar.Z 38.0%
zip 21.8%
zoo 43.3%
With the exception of tar.Z and zoo, most compressors gave very similar ratios (with the usual suspects a percent or two better than the others).

jpeg

Archives often contain files that are already compressed, and the archive is just used to stick them all together. In this test a set of JPEG files (which are already compressed) were recompressed. An archiver needs to know when not to compress, because compressing such data can give a result larger than the original.
7z 97.0%
ace 97.3%
arj 97.6%
jar 97.7%
lha 97.5%
rar 97.7%
sit 97.4%
tar.bz2 97.8%
tar.gz 97.7%
tar.lzo 98.7%
tar.Z 122.0%
zip 97.7%
zoo 100.0%
The tar.Z result is a perfect example of a compressor not knowing when to just store the data uncompressed. The other compressors found little niches in which to optimise the files, but only by two to three percent.
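
The "know when not to compress" idea can be sketched in a few lines. This is only an illustration, not how any of the tested tools is implemented: the one-byte method tag and the `pack`/`unpack` names are invented for the example, and zlib stands in for whatever codec an archiver actually uses. The compressed form is kept only when it is really smaller; otherwise the data is stored as-is.

```python
import os
import zlib

# Hypothetical one-byte method tags for this sketch (not a real archive format).
STORED = b"\x00"
DEFLATED = b"\x01"


def pack(data: bytes) -> bytes:
    """Compress data, but fall back to storing it when compression does not help."""
    candidate = zlib.compress(data, 9)
    if len(candidate) < len(data):
        return DEFLATED + candidate
    # Compression made it no smaller (typical for JPEG or random data): store raw.
    return STORED + data


def unpack(blob: bytes) -> bytes:
    """Reverse pack() by checking the method tag."""
    method, payload = blob[:1], blob[1:]
    if method == DEFLATED:
        return zlib.decompress(payload)
    return payload


if __name__ == "__main__":
    noise = os.urandom(4096)   # stands in for already-compressed data
    text = b"hello " * 1000    # highly redundant data
    print(len(pack(noise)), len(pack(text)))
```

With this fallback the worst case is the original size plus the one-byte tag, rather than growth like the 122% seen above.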

pdf

PDF files are a strange mixture of ASCII and binary, so a static ASCII compression would not work well. The test was done on three large files from my research.
7z 58.7%
ace 60.3%
arj 62.0%
jar 62.9%
lha 62.2%
rar 60.5%
sit 61.7%
tar.bz2 61.0%
tar.gz 62.2%
tar.lzo 63.3%
tar.Z 88.4%
zip 62.1%
zoo 87.8%
I was expecting somewhat more interesting results from this one, but the ratios of the "good" compressors all fall within 5 percentage points of each other. As usual, tar.Z and zoo perform poorly.

Linux

Here is a large test. The Linux kernel takes up 88 MB. It is normally distributed as a tar.bz2, but how do the other compressors compare? There are over 4000 files, most of them C source. A good compressor would bundle all files with the same extension and compress them together.
7z 17.3%
ace 19.6%
arj 26.0%
jar 26.1%
lha 25.5%
rar 20.8%
sit 23.5%
tar.bz2 18.5%
tar.gz 23.2%
tar.lzo 26.8%
tar.Z 39.6%
zip 26.0%
zoo 41.6%
Here you can see why they use bz2 to distribute the kernel: even without being able to rearrange files it manages to come second. 7z wins the test.
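
The idea of bundling files of the same extension before compressing can be sketched as follows. This is an assumption-laden illustration, not what 7z or any other tested tool does internally: the function name `solid_archive` is invented, and Python's tarfile module with bz2 stands in for a real "solid" archiver. Files are ordered by extension so that similar content sits together in the single compressed stream.

```python
import io
import os
import tarfile


def solid_archive(files, group_by_extension=True):
    """Build an in-memory tar.bz2 from a {name: bytes} mapping.

    When group_by_extension is true, members are written sorted by
    (extension, name) so similar files end up next to each other.
    """
    names = list(files)
    if group_by_extension:
        names.sort(key=lambda n: (os.path.splitext(n)[1], n))

    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as archive:
        for name in names:
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


if __name__ == "__main__":
    tree = {
        "b.c": b"int x;\n",
        "a.h": b"#define A 1\n",
        "a.c": b"int y;\n",
        "README": b"hello\n",
    }
    blob = solid_archive(tree)
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:bz2") as archive:
        print(archive.getnames())
```

Whether the regrouping actually shrinks the archive depends on the data and the codec, so compare `len(solid_archive(tree))` against `len(solid_archive(tree, group_by_extension=False))` on real input before drawing conclusions.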

Roms

ROM images used in emulators are commonly compressed files. Here the ROMs being compressed are the X-MESS BIOS images. The files total 53 MB and are normally distributed compressed as zips.
7z 20.1%
ace 26.4%
arj 46.9%
jar 47.1%
lha 46.6%
rar 43.8%
sit 47.3%
tar.bz2 45.8%
tar.gz 46.7%
tar.lzo 50.6%
tar.Z 65.7%
zip 47.0%
zoo 60.6%
I had to redo this test several times because I could not understand how 7z managed such a good result. At 20% its archive is less than half the size of the zip. Ace also performs very well. The remaining tools do not even come close. There is a graph of all the above tests.
