Gentoo Wiki

HOWTO Speed up decompression with pbzip2

This article is part of the HOWTO series.
What's this about?

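pbzip2 is a parallel implementation of bzip2 that speeds up compression and decompression of bzip2 files, including .tar.bz2 archives, on machines with more than one processor. It can also let bzip2'd distfiles downloaded by Portage decompress more quickly, though only archives that were themselves created with pbzip2 are decompressed in parallel.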

Will this work for me?

If your computer has multiple processors or cores, then yes.

Installing pbzip2

First, we emerge app-arch/pbzip2.

emerge pbzip2

Next, edit /etc/profile. Add the following two lines to the bottom, replacing N with the number of processors/cores in your computer.

File: /etc/profile
alias bzip2="pbzip2 -pN"
alias bunzip2="pbzip2 -pN -d"
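
If you are unsure what value to use for N, the following is a minimal sketch that looks it up and prints the two alias lines with the number filled in. The helper name cpu_count is made up for this example, and it assumes getconf reports _NPROCESSORS_ONLN on your system; if it does not, the sketch falls back to 1.

```shell
# Hypothetical helper: print the number of online logical processors.
# Assumes getconf knows _NPROCESSORS_ONLN; falls back to 1 otherwise.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: assume one processor
    esac
    printf '%s\n' "$n"
}

N=$(cpu_count)

# Print the alias lines from this article with N substituted,
# ready to be pasted into /etc/profile.
printf 'alias bzip2="pbzip2 -p%s"\n' "$N"
printf 'alias bunzip2="pbzip2 -p%s -d"\n' "$N"
```

The count includes every logical processor the system reports, so check the printed value against what you expect before pasting the lines.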

In order to allow portage to take advantage of pbzip2, we must add the same lines to /etc/portage/bashrc. If it does not exist, create it.

File: /etc/portage/bashrc
alias bzip2="pbzip2 -pN"
alias bunzip2="pbzip2 -pN -d"
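
Once these aliases are in place, bzip2 will stop working if pbzip2 is ever unmerged. A more defensive variant, sketched below, only switches to pbzip2 when it is actually found on the PATH. The function name bzip2_command and the core count of 2 are assumptions for illustration, not something pbzip2 or Portage provides.

```shell
# Hypothetical helper: print the command line to use for bzip2 work.
# $1 is the number of processors to pass to pbzip2 via -p.
# Falls back to plain bzip2 when pbzip2 is not installed.
bzip2_command() {
    cores=$1
    if command -v pbzip2 >/dev/null 2>&1; then
        printf 'pbzip2 -p%s\n' "$cores"
    else
        printf 'bzip2\n'
    fi
}

# Define the same two aliases as above, using whichever tool was found.
# The 2 is an example value; use your own processor count.
alias bzip2="$(bzip2_command 2)"
alias bunzip2="$(bzip2_command 2) -d"
```

With pbzip2 installed this yields the same aliases as the two lines above; without it, both names keep pointing at plain bzip2.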

Benchmarks

Core 2 Duo

I benchmarked bzip2 versus pbzip2 on my 2 GHz Core 2 Duo laptop. All tests were done by compressing and decompressing linux-2.6.23.tar.bz2.

Decompression

bzip2    21.067 seconds
pbzip2   11.282 seconds

As you can see, by using pbzip2 on a dual-core system I was able to decompress the same file in roughly half the time (a speedup of about 1.87x). In theory, a quad-core system would need only around 5 seconds; see the Core 2 Quad results below for measured numbers.

Decompressing non-pbzip2 Created Archives

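pbzip2 can only decompress an archive in parallel if that archive was compressed with pbzip2. pbzip2 writes its output as a series of independent bzip2 streams that can be handed to separate processors, whereas plain bzip2 writes a single stream. For example, extracting linux-2.6.23.8.tar.bz2 as found on kernel.org takes roughly twice as long with pbzip2 as with bzip2 on a dual-core system: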

[root@hoya /usr/src] grep Core /proc/cpuinfo 
  model name      : Intel(R) Core(TM)2 CPU          4400  @ 2.00GHz
  model name      : Intel(R) Core(TM)2 CPU          4400  @ 2.00GHz
[root@hoya /usr/src] /usr/bin/time -p pbzip2 -p2 -r -dc linux-2.6.23.8.tar.bz2 > /dev/null
  real 21.99
  user 21.36
  sys 0.33
[root@hoya /usr/src] /usr/bin/time -p bzip2 -dc linux-2.6.23.8.tar.bz2 > /dev/null
  real 9.80
  user 9.70
  sys 0.03

Compression

bzip2    52.610 seconds
pbzip2   30.333 seconds

As it turns out, on this machine parallel compression doesn't get quite as large a speedup as decompression (about 1.73x versus 1.87x). You'll be doing more decompression anyway.

Core 2 Quad

On my Core 2 Quad Q6600 I get the following results (tested in ramdisk, linux-2.6.23.tar.bz2):

Decompression

bzip2    10.085 seconds
pbzip2    3.367 seconds

Compression

bzip2    48.605 seconds
pbzip2   13.564 seconds
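
To compare the two machines on equal terms, it helps to turn the timings into a speedup (bzip2 time divided by pbzip2 time) and a per-core efficiency (speedup divided by the number of cores). The sketch below does that arithmetic with awk on the figures from this article; the function names speedup and efficiency are made up for the example.

```shell
# speedup SERIAL PARALLEL
# Prints the serial time divided by the parallel time, to two decimals.
speedup() {
    awk -v s="$1" -v p="$2" 'BEGIN { printf "%.2f\n", s / p }'
}

# efficiency SERIAL PARALLEL CORES
# Prints the speedup per core as a whole-number percentage.
efficiency() {
    awk -v s="$1" -v p="$2" -v c="$3" 'BEGIN { printf "%.0f\n", 100 * s / p / c }'
}

# Timings (in seconds) taken from the benchmark tables in this article.
echo "Core 2 Duo  decompression: $(speedup 21.067 11.282)x, $(efficiency 21.067 11.282 2)% per core"
echo "Core 2 Duo  compression:   $(speedup 52.610 30.333)x, $(efficiency 52.610 30.333 2)% per core"
echo "Core 2 Quad decompression: $(speedup 10.085 3.367)x, $(efficiency 10.085 3.367 4)% per core"
echo "Core 2 Quad compression:   $(speedup 48.605 13.564)x, $(efficiency 48.605 13.564 4)% per core"
```

Note that the two machines were tested under different conditions (the quad-core run used a ramdisk), so the absolute times are not directly comparable; the ratios are the more useful figures.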

Retrieved from "http://www.gentoo-wiki.info/HOWTO_Speed_up_decompression_with_pbzip2"

Last modified: Fri, 08 Aug 2008 01:37:00 +0000