BLCR Administrator's Guide

This guide describes how to install, configure, and maintain Berkeley Checkpoint/Restart for Linux.

System Requirements

BLCR consists of two kernel modules, some user-level libraries, and several command-line executables. No kernel patching is required!

BLCR has been engineered to work with a wide range of Linux kernels:

BLCR uses assembly code to save some program state (most notably the CPU registers). This means that the BLCR kernel modules are not portable across CPU architectures "out of the box". Currently only x86-based systems work with BLCR. Support for other architectures is planned (most notably the Opteron). Porting BLCR to a different CPU is not a large software effort for those with kernel experience and knowledge of the target CPU's instructions. Please contact us if you are interested in contributing a port.

Installing/Configuring BLCR

To build checkpoint/restart, you need the following files:

Note for Red Hat users

Under Red Hat, this means you must install both the kernel RPM and the kernel-source RPM that match the kernel you will use checkpoint/restart with. For example, to build checkpoint/restart for a Red Hat 2.4.20-20.9 kernel, you need the kernel-source-2.4.20-20.9 RPM and the kernel-2.4.20-20.9 RPM. The configure script will find the correct files automatically, so you don't need to pass any additional arguments to configure.
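
As a quick check before building, the two package names can be derived from the kernel version and queried with rpm -q. This is only a sketch: the helper name required_kernel_rpms is made up for illustration, and it assumes the packages are named exactly kernel-VERSION and kernel-source-VERSION as in the example above (other kernel variants may use different package names).

```shell
# Hypothetical helper: print the RPM names this guide says are needed
# for a given kernel version (defaults to the running kernel).
required_kernel_rpms() {
    version=${1:-$(uname -r)}
    printf 'kernel-%s\n' "$version"
    printf 'kernel-source-%s\n' "$version"
}

# List the packages for the example kernel from the text.
required_kernel_rpms 2.4.20-20.9

# On a Red Hat system, 'rpm -q' reports whether each one is installed:
#   rpm -q $(required_kernel_rpms 2.4.20-20.9)
```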

Configuring BLCR

BLCR builds and installs much like any other autotools-based distribution:
    tar zxvf blcr-X.Y.Z.tar.gz
    cd blcr-X.Y.Z
    ./configure [ options ]
    make
    make install

Depending on which kernel you are building against, and where you wish to put the BLCR libraries, you may need to pass one or more of the configure options described below.

Choosing an installation directory

By default BLCR will install into /usr/local. To choose a different directory tree to install into, pass the '--prefix' flag to configure:
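
For example, a minimal sketch assuming you want BLCR installed under /opt/blcr (a hypothetical directory chosen only for illustration):

```shell
# Hypothetical install prefix; replace with the directory tree you want.
BLCR_PREFIX=/opt/blcr

# Build the configure invocation and print it for review.
configure_cmd="./configure --prefix=$BLCR_PREFIX"
echo "$configure_cmd"

# To run it, execute the printed command from the top of the BLCR source tree.
```

With this prefix, the executables would end up under /opt/blcr/bin and the libraries under /opt/blcr/lib.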

Building against a kernel other than the one that's running

By default, BLCR builds against the kernel that is running on the system at configure time, and looks in a number of standard locations (/usr/src/linux, etc.) for the above files that correspond to it. If you're building checkpoint/restart for a kernel other than the one running at build time (or if the sources for the running kernel are in a non-standard location), you'll need to pass the following options to configure:

You'll also need to pass one of the following two options:

Compiling BLCR

Just type 'make':
    % make

Installing BLCR

Use the standard 'install' make target to install the BLCR utilities and libraries, and to place the kernel modules in the standard location for your kernel:
    % make install

Loading the Kernel Modules

Before you can checkpoint/restart applications, the kernel modules need to be loaded into your kernel. The kernel modules are placed into a subdirectory of the lib/blcr branch of the installation directory. In this example, we'll assume the installation prefix was the default /usr/local and that your kernel is version 2.4.20pcp2. Thus, for this example the kernel modules are in the directory /usr/local/lib/blcr/2.4.20pcp2/. There are two kernel modules in this directory. Both must be loaded, in the correct order, for BLCR to function.

Load the kernel modules in this order:

    insmod /usr/local/lib/blcr/2.4.20pcp2/vmadump.o
    insmod /usr/local/lib/blcr/2.4.20pcp2/cr.o
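
The same commands can be generated for any prefix and kernel version. The sketch below is illustrative only: the function names are hypothetical, and it assumes the directory layout and the .o module file names described above. It prints the insmod commands rather than running them, so you can review them before executing them as root.

```shell
# Hypothetical helper: print the directory that holds the BLCR modules
# for a given install prefix and kernel version.
blcr_module_dir() {
    prefix=${1:-/usr/local}
    kernel=${2:-$(uname -r)}
    printf '%s/lib/blcr/%s\n' "$prefix" "$kernel"
}

# Hypothetical helper: print the insmod commands in the required order
# (vmadump first, then cr).
blcr_insmod_commands() {
    dir=$(blcr_module_dir "$@")
    for module in vmadump cr; do
        printf 'insmod %s/%s.o\n' "$dir" "$module"
    done
}

# Reproduce the example from the text.
blcr_insmod_commands /usr/local 2.4.20pcp2
```

Called with no arguments, the helpers fall back to /usr/local and the output of uname -r.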

You may wish to set up your system to load these modules by default at boot time. The exact mechanism for doing so differs between Linux distributions, and thus requires an experienced system administrator. However, a template init script is provided as etc/blcr.rc in the BLCR source directory.

Configuring Users' environments

Finally, you may wish to add the appropriate BLCR directories to the default $PATH, $LD_LIBRARY_PATH, and $MANPATH for your users (by modifying the /etc/profile and/or /etc/cshrc files, or by providing modules that accomplish the same thing). In the examples below, PREFIX stands for your installation prefix.

For Bourne-style shells:

    PATH=$PATH:PREFIX/bin
    MANPATH=$MANPATH:PREFIX/man
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:PREFIX/lib
    export PATH MANPATH LD_LIBRARY_PATH

For csh-style shells:

    setenv PATH ${PATH}:PREFIX/bin
    setenv MANPATH ${MANPATH}:PREFIX/man
    setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:PREFIX/lib
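
For Bourne-style shells, one optional refinement concerns LD_LIBRARY_PATH: if the variable is empty or unset, the plain assignment above produces a value that starts with a colon, and the dynamic linker treats an empty entry as the current directory. The sketch below avoids that. The helper name append_dir and the BLCR_PREFIX variable are assumptions introduced for illustration, not part of BLCR.

```shell
# Hypothetical helper: append directory $2 to the colon-separated list $1
# without producing a leading colon when the list is empty.
append_dir() {
    printf '%s\n' "${1:+$1:}$2"
}

# Stand-in for the PREFIX placeholder above; adjust to your installation.
BLCR_PREFIX=/usr/local

LD_LIBRARY_PATH=$(append_dir "${LD_LIBRARY_PATH:-}" "$BLCR_PREFIX/lib")
export LD_LIBRARY_PATH
```

MANPATH is deliberately left as in the original example, since some man implementations give a leading colon a special meaning.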

Making RPMs from the BLCR sources

An alternate way to install BLCR is to build a binary RPM for your system, which you can then install. This has certain advantages (such as making upgrading easier, especially if you maintain BLCR on multiple systems).

Building binary RPMs from the source tarball

The simplest way to build RPMs is to run the following after configure:
    % make rpms
If successful, the new RPM packages will be in the rpm/RPMS subdirectory of the build tree. The resulting packages are built for whichever kernel you configured for.

Building binary RPMs from a source RPM

You may also wish to start from a source RPM (blcr-X.Y.Z-N.src.rpm) rather than the tar.gz version of the BLCR distribution. Source RPMs are available on our website. These source RPMs are configured to build for the running kernel. Alternatively, the make rpms step above will create a source RPM in the rpm/SRPMS subdirectory of the build tree, valid for the configured kernel.

To build binary RPMs from the source RPM, use

    % rpm --rebuild blcr-X.Y.Z-N.src.rpm

(Make sure to use the correct filename for your source RPM.) If your version of rpm does not accept --rebuild, run rpmbuild --rebuild with the same argument instead; see the rpmbuild documentation for more information. The binary RPMs are written to /usr/src/redhat/RPMS if you build as root, or to another configured location otherwise. The last few lines of the build output list the exact locations, and should look something like this:
    Wrote: /usr/src/redhat/SRPMS/blcr-0.3.0-1.src.rpm
    Wrote: /usr/src/redhat/RPMS/i686/blcr-0.3.0-1.i686.rpm
    Wrote: /usr/src/redhat/RPMS/i686/blcr-libs-0.3.0-1.i686.rpm
    Wrote: /usr/src/redhat/RPMS/i686/blcr-devel-0.3.0-1.i686.rpm
    Wrote: /usr/src/redhat/RPMS/i686/blcr-modules_2.4.20_24.9-0.3.0-1.i686.rpm
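
If the build output is long, the package paths can be picked out of it by filtering for the "Wrote:" lines. The helper below is a hypothetical sketch (list_written_rpms is not part of BLCR or RPM) and assumes the output format shown above.

```shell
# Hypothetical helper: read build output on stdin and print only the
# paths of the packages that were written.
list_written_rpms() {
    sed -n 's/^Wrote: //p'
}

# Demonstrate on two of the sample lines shown above.
printf '%s\n' \
    'Wrote: /usr/src/redhat/SRPMS/blcr-0.3.0-1.src.rpm' \
    'Wrote: /usr/src/redhat/RPMS/i686/blcr-0.3.0-1.i686.rpm' |
    list_written_rpms

# In practice, pipe the rebuild output through the helper, for example:
#   rpm --rebuild blcr-X.Y.Z-N.src.rpm 2>&1 | list_written_rpms
```

Note that the filter discards all other build output, so it is best used only once the build is known to succeed.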


For more information

For more information on Checkpoint/Restart for Linux, visit the project home page: http://ftg.lbl.gov/checkpoint

For more information on LAM/MPI, see the LAM/MPI Documentation.