Michal Čihař - Impressed by xz compression

Impressed by xz compression

I knew that xz (or lzma) provides a better compression ratio than bz2 and others, but I never thought the difference could be so huge. I was simply impressed after enabling xz-compressed snapshots for Gammu: the bz2-compressed tarball is 5.4M, while the xz-compressed one is only 1.6M. Wow.
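For anyone who wants to check this on their own data, here is a minimal sketch using Python's standard `bz2` and `lzma` modules at their default settings. The function name `compressed_sizes` and the synthetic `sample` input are made up for illustration; this is not how the Gammu snapshots are built, and the sizes you get depend entirely on the input.

```python
import bz2
import lzma


def compressed_sizes(data: bytes) -> dict[str, int]:
    """Return the size in bytes of `data` raw, as bz2, and as xz (default settings)."""
    return {
        "raw": len(data),
        "bz2": len(bz2.compress(data)),
        "xz": len(lzma.compress(data)),  # default container format is .xz
    }


# Synthetic stand-in input (an assumption, not real Gammu sources). To test a
# real snapshot, pass the bytes of the uncompressed tarball instead.
sample = b"".join(
    b'static const char *msg_%d = "message %d";\n' % (i, i % 50)
    for i in range(20000)
)

for name, size in compressed_sizes(sample).items():
    print(f"{name:>4}: {size} bytes")
```

The gap between the two formats tends to be largest when similar content is spread far apart in the input, since xz can use a much larger dictionary than the block size bz2 works with.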

Comments

kubrick wrote on Jan. 6, 2011, 6:22 p.m.

Did you try 7zip?
It's amazing too.
I think you could compare it with xz.
;)
Cheers!

Sleep_Walker wrote on Jan. 6, 2011, 7:31 p.m.

7zip uses LZMA compression too

wrote on Jan. 6, 2011, 7:58 p.m.

I've used 7zip before, but I never got such a big difference.

wrote on Jan. 11, 2011, 1:54 a.m.

@Michal: You probably were using the default LZMA compression algorithm. The beauty of 7zip is that it supports multiple compression algorithms.

For one of the absolute best text compression algorithms, try the PPM[1] algorithm. We used it to compress HUGE firewall logfiles into quite manageable sizes:

nice -n 20 /usr/bin/7za a -mx9 -m0=ppmd foo.ppm.7z foo.log

[1] http://en.wikipedia.org/wiki/Prediction_by_partial_matching

wrote on Jan. 13, 2011, 7:09 p.m.

Depends on what you compress. On a very repetitive set of 21600 XML files of 48 KiB each that I've been using at work, for example, I get a 515 KiB archive: a compression ratio of about 2000:1. I don't remember the exact numbers, but at one point I compared the compression ratios of gzip, bzip2 and lzma on this type of content and found them roughly evenly spaced on a logarithmic scale, each about a factor of 10 apart.
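
The 2000:1 figure can be checked from the numbers quoted above, and the three-way comparison can be tried with a rough sketch like the one below. The `ratios` helper and the synthetic `corpus` are assumptions for illustration only; they are not the XML files from work, so the ratios printed will not match the ones described.

```python
import bz2
import gzip
import lzma

# Figures quoted in the comment above.
file_count = 21600
file_size_kib = 48
archive_kib = 515

ratio = file_count * file_size_kib / archive_kib
print(f"about {ratio:.0f}:1")  # 1036800 KiB / 515 KiB


def ratios(data: bytes) -> dict[str, float]:
    """Compression ratio (original size / compressed size) per codec, default settings."""
    return {
        "gzip": len(data) / len(gzip.compress(data)),
        "bzip2": len(data) / len(bz2.compress(data)),
        "xz": len(data) / len(lzma.compress(data)),
    }


# Hypothetical, highly repetitive XML-like input: many near-identical documents.
record = b"<entry><status>ok</status><value>0</value></entry>\n"
document = b"<log>\n" + record * 1000 + b"</log>\n"
corpus = document * 20

for name, value in ratios(corpus).items():
    print(f"{name:>5}: {value:.0f}:1")
```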