
Berkeley Lab Checkpoint/Restart (BLCR) for LINUX

Future Technologies Group researchers are developing a hybrid kernel/user implementation of checkpoint/restart. Their goal is to provide a robust, production-quality implementation that checkpoints a wide range of applications without requiring changes to application code. This work focuses on checkpointing parallel applications that communicate through MPI, and on compatibility with the software suite produced by the SciDAC Scalable Systems Software ISIC. The work is divided into four main areas:

  • Checkpoint/Restart for Linux (CR)
  • Checkpointable MPI Libraries
  • Resource Management Interface to Checkpoint/Restart
  • Development of Process Management Interfaces

News

January 29, 2013
Version 0.8.5 is now available from the Checkpoint Downloads page.
This version fixes several bugs, and extends support to kernels through 3.7.1.
January 14, 2013
Version 0.8.5 entered Beta testing in mid-December, and will support kernels through 3.7.x.  Testing of the Beta release by the user community is needed prior to a final 0.8.5 release.
October 1, 2012
We are back from a hiatus in funding and active work has resumed on BLCR.
We look forward to bringing you a new release as soon as possible.
October 11, 2011
Version 0.8.4 is now available from the Checkpoint Downloads page.
This version fixes some minor bugs, and extends support to kernels through 2.6.38.
August 16, 2011
Version 0.8.3 is now available from the Checkpoint Downloads page.
This version fixes some minor bugs, and extends support to kernels through 2.6.34.
June 16, 2009
Version 0.8.2 is now available from the Checkpoint Downloads page.
This version fixes some minor bugs, and extends support to kernels through 2.6.30.
March 25, 2009
Version 0.8.1 is now available from the Checkpoint Downloads page.
This version fixes several bugs, and extends support to kernels through 2.6.29.
January 12, 2009
Version 0.8.0 is now available from the Checkpoint Downloads page.
This version adds new features, fixes several bugs, and extends support to kernels through 2.6.28.
August 12, 2008
Version 0.7.3 is now available from the Checkpoint Downloads page.
This version fixes several bugs seen in 0.7.2.
July 28, 2008
Version 0.7.2 is now available from the Checkpoint Downloads page.
This version fixes several bugs seen in 0.7.1.
July 14, 2008
Version 0.7.1 has been verified to work correctly with the final 2.6.26 kernel (released yesterday).
June 25, 2008
Version 0.7.1 is now available from the Checkpoint Downloads page.
This version fixes several bugs and extends support to 2.6.26-rc7 kernels (we don't normally track the -rc kernels, but are hoping 0.7.1 will support the upcoming full 2.6.26 release).
May 30, 2008
Version 0.7.0 is now available from the Checkpoint Downloads page.
This version adds several useful features to the checkpoint and restart utilities; extends the range of supported kernels; and fixes numerous bugs. Support for PPC32 platforms is a new experimental feature in this release. As previously announced, this release drops support for LinuxThreads; NPTL is now the only supported pthreads implementation. For a complete list of changes since 0.6.5, please see the NEWS file.
February 29, 2008
Version 0.6.5 is now available from the Checkpoint Downloads page.
This version fixes two potential kernel panics.
January 28, 2008
Version 0.6.4 is now available from the Checkpoint Downloads page.
This version fixes a potential kernel panic when checkpointing mmap()s of HUGETLBFS files with some kernels.
January 22, 2008
Version 0.6.3 is now available from the Checkpoint Downloads page.
This version fixes a serious floating-point corruption bug on the x86-64 platform, present in all BLCR releases since 0.4.2. Users of BLCR on the x86-64 architecture are strongly encouraged to upgrade.
January 14, 2008
Version 0.6.2 is now available from the Checkpoint Downloads page.
This version fixes significant bugs in 0.6.1 and adds support for 2.6.23 kernels (and some vendors' 2.6.22.x kernels).
September 25, 2007
Version 0.6.1 is now available from the Checkpoint Downloads page.
This version fixes minor bugs in 0.6.0.
September 10, 2007
Version 0.6.0 is now available from the Checkpoint Downloads page.
This version adds support for checkpoint/restart of memory shared via mmap(MAP_SHARED), of open unlinked files, and of pending signals; extends the range of supported kernels; greatly expands the test suite; and fixes numerous bugs. Support for PPC64 and ARM platforms, and for cross-compilation, are new experimental features in this release.
August 28, 2007
Announcing deprecated support for LinuxThreads and for Linux 2.4.X kernels
  • Starting with the 0.6.0 release, new bug reports that one cannot reproduce under NPTL + Linux 2.6.x will receive little or none of our attention. However, we will try to distribute user-contributed fixes for such bugs. Note that the 0.6.0 release is expected to pass the BLCR test-suite under LinuxThreads and/or 2.4.x kernels on the developers' x86 systems.
  • Beginning with the next "full" release (0.7.0) we will begin to remove code in BLCR that exists only to support LinuxThreads and/or Linux 2.4.x.
  • We have not yet decided the fate of support for those 2.4.x kernels which include Red Hat's backport of NPTL support (RHL9.0, RHEL, RHAS, etc.).
  • If anybody cares enough about 2.4.x and/or LinuxThreads to volunteer to take over testing and maintenance of BLCR on such platforms, let us know.
July 11, 2007
Version 0.5.6 is now available from the Checkpoint Downloads page.
This version fixes a bug that could lead to corrupted restores of data buffered in pipes. All BLCR users with kernel versions 2.6.14 or newer are strongly encouraged to upgrade.
April 27, 2007
Version 0.5.5 is now available from the Checkpoint Downloads page.
This version adds support for 2.6.21 kernels.
April 20, 2007
Version 0.5.4 is now available from the Checkpoint Downloads page.
This version fixes some problems reported in 0.5.3.
March 29, 2007
Version 0.5.3 is now available from the Checkpoint Downloads page.
This version fixes minor problems reported in 0.5.2.
March 23, 2007
Version 0.5.2 is now available from the Checkpoint Downloads page.
This version fixes minor problems reported in 0.5.0 and 0.5.1.
March 20, 2007
Version 0.5.1 is now available from the Checkpoint Downloads page.
This version adds support for newer kernel versions, including 2.6.20 and 2.6.17-5mdv, and fixes some minor problems reported in 0.5.0.
March 2, 2007
Version 0.5.0 is now available from the Checkpoint Downloads page.
This version adds support for checkpoint/restart of groups of related processes, extends the range of supported kernels, improves I/O performance, and fixes numerous bugs.
November 23, 2005
Version 0.4.2 is now available from the Checkpoint Downloads page.
This version adds support for x86_64 (Opteron/EM64T) processors and stable support for Linux 2.6, and includes a number of miscellaneous bug fixes.
February 18, 2005
Version 0.4.0 is now available from the Checkpoint Downloads page.
This version adds experimental support for Linux 2.6.x kernels.

Documentation

Publications

Downloads

Other Resources

Features

  • Fully SMP safe
  • Rebuilds the virtual address space and restores registers
  • Supports the NPTL implementation of POSIX threads (LinuxThreads is no longer supported)
  • Restores file descriptors, and state associated with an open file
  • Restores signal handlers, signal mask, and pending signals.
  • Restores the process ID (PID), thread group ID (TGID), parent process ID (PPID), and process tree to old state.
  • Supports saving and restoring groups of related processes and the pipes that connect them.
  • Should work with nearly any x86 or x86_64 Linux system that uses a 2.6 kernel (see FAQ for most recent info). Verified to work on SuSE Linux 9.x and up; Red Hat 8 and 9; Red Hat Enterprise Linux versions 3, 4, and 5; Fedora Core 5 through 10; and many vanilla Linux kernels (from kernel.org) from 2.6.0 on up (and many more).
  • Experimental support is present for PPC, PPC64 and ARM architectures. We consider this support experimental mainly because of our limited ability to test it.
  • Xen dom0 and domU are both supported with Xen 3.1.2 or newer.
  • Tested with the GNU C library (glibc) versions 2.1 through 2.6
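
To make some of the state named in the list above concrete, here is a minimal Python sketch that inspects a few of those items in a running process: the PID and PPID, the signal mask, a pending signal, and an open file that has already been unlinked. It does not use BLCR or any of its interfaces. The function names `snapshot_process_state` and `demo` are hypothetical and exist only for this illustration, and the sketch assumes a Linux (POSIX) system with Python 3.

```python
import os
import signal
import tempfile
import threading


def snapshot_process_state():
    """Collect a few of the per-process items named in the feature list.

    Illustration only: this reads the state; it does not save or restore it.
    """
    return {
        "pid": os.getpid(),
        "ppid": os.getppid(),
        # Signals currently blocked in the calling thread
        # (SIG_BLOCK with an empty set changes nothing and returns the mask).
        "signal_mask": signal.pthread_sigmask(signal.SIG_BLOCK, []),
        # Signals that were raised while blocked and are awaiting delivery.
        "pending_signals": signal.sigpending(),
    }


def demo():
    """Create a pending signal and an open unlinked file, then inspect them."""
    # Block SIGUSR1 so that raising it leaves it pending instead of delivered.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    fd = None
    try:
        signal.pthread_kill(threading.get_ident(), signal.SIGUSR1)

        # An "open unlinked file": the name is gone, the descriptor still works.
        fd, path = tempfile.mkstemp()
        os.write(fd, b"still here")
        os.unlink(path)

        state = snapshot_process_state()
        state["unlinked_file_on_disk"] = os.path.exists(path)
        os.lseek(fd, 0, os.SEEK_SET)
        state["unlinked_file_data"] = os.read(fd, 64)
        return state
    finally:
        if fd is not None:
            os.close(fd)
        # Consume the pending signal so it is not delivered once unblocked.
        if signal.SIGUSR1 in signal.sigpending():
            signal.sigwait({signal.SIGUSR1})
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    for key, value in demo().items():
        print(f"{key}: {value}")
```

None of this state lives in the program's ordinary variables: the pending signal and the unlinked file's contents are reachable only through the kernel, which is one reason a checkpointer that requires no application changes needs a kernel-side component.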

For more information, check these pages, or send e-mail to checkpoint-NO SPAM@lbl . gov