    Overview of DMTCP (Distributed MultiThreaded Checkpointing)

Concepts:
DMTCP Checkpoint/Restart allows one to transparently checkpoint to disk
a distributed computation.  It works under Linux, with no modifications
to the Linux kernel or to the application binaries.  It can be used by
unprivileged users (no root privilege needed).  One can later restart
from a checkpoint, or even migrate the processes by moving the checkpoint
files to another host prior to restarting.

A DMTCP coordinator process is started on one host.  Application binaries
are started under the dmtcp_checkpoint command, causing them to connect
to the coordinator upon startup.  As threads are spawned, child processes
are forked, remote processes are spawned via ssh, and libraries are
dynamically loaded, DMTCP transparently and automatically tracks them.


To run a program with checkpointing:
  1) Run dmtcp_coordinator in a separate terminal/window

        ./bin/dmtcp_coordinator

  2) In separate terminal(s), replace each command
     with "dmtcp_checkpoint [command]"

        ./bin/dmtcp_checkpoint ./a.out

  3) To checkpoint, type 'c'<return> into dmtcp_coordinator


[ In dmtcp_coordinator window:
    h<return> for help,
    c<return> for checkpoint,
    l<return> for list of processes to be checkpointed,
    k<return> to kill processes to be checkpointed,
    q<return> to kill processes to be checkpointed and quit the coordinator.]


  4) RESTART:
    Creating a checkpoint causes the dmtcp_coordinator to write
    a script, dmtcp_restart_script.sh, along with a
    checkpoint file (file type: .dmtcp) for each client process.
    The simplest way to restart a previously checkpointed computation is:
    [ Edit ./dmtcp_restart_script.sh ]
        ./dmtcp_restart_script.sh
    [ Alternatively, if all processes were on the same host,
        and there were no .dmtcp files prior to this checkpoint: ]
        ./bin/dmtcp_restart ckpt_*.dmtcp

============================================
CONVENIENCE COMMANDS AND DEBUGGING RESTARTED PROCESSES:
  # Automatically start a coordinator in background
  ./bin/dmtcp_checkpoint ./a.out &
  # Checkpoint all processes of the default coordinator
  ./bin/dmtcp_command --checkpoint
  # Kill a.out, and optionally kill coordinator process
  ./bin/dmtcp_command --quit
  # Restart directly from local checkpoint images (.dmtcp files)
  # (Be sure there are no old ckpt_a.out_*.dmtcp files.
  #  Ensure that the restarted process is running, and not suspended.)
  ./bin/dmtcp_restart ckpt_a.out_*.dmtcp &
  # Have gdb attach to a restarted process, and debug
  # NOTE:  You must specify 'mtcp_restart', not 'dmtcp_restart'
  gdb ./a.out `pgrep -n mtcp_restart`
  # To force a.out out of any low-level library code and back to a known
  # location, set a breakpoint on a common function and continue:
  (gdb) break write
  (gdb) continue

============================================
COMMAND-LINE OPTIONS:
    'dmtcp_checkpoint', 'dmtcp_command', and 'dmtcp_restart' print
their options when run with no command-line arguments.  'dmtcp_coordinator'
offers help when run (type 'h<return>' for help).

============================================
OPTIONS THROUGH ENVIRONMENT VARIABLES:
  dmtcp_coordinator:
    DMTCP_CHECKPOINT_INTERVAL=<time in seconds> (default: 0, disabled)
    DMTCP_PORT=<coordinator listener port> (default: 7779)
    DMTCP_CHECKPOINT_DIR=<where restart script is written> (default: ./)
    DMTCP_TMPDIR=<where temporary files are written>
					 (default: env var TMPDIR or /tmp)

  dmtcp_checkpoint / dmtcp_restart:
    DMTCP_HOST=<hostname where coordinator is running> (default: localhost)
    DMTCP_PORT=<coordinator listener port> (default: 7779)
    DMTCP_GZIP=<0: disable compression of checkpoint image>
					 (default: 1, compression enabled)
    DMTCP_CHECKPOINT_DIR=<location to store checkpoints> (default: ./)
    DMTCP_SIGCKPT=<internal signal number> (default: 12 = SIGUSR2)
    DMTCP_TMPDIR=<where temporary files are written>
					 (default: env var TMPDIR or /tmp)

  dmtcp_command:
    DMTCP_HOST=<hostname where coordinator is running> (default: localhost)
    DMTCP_PORT=<coordinator listener port> (default: 7779)

  Application-defined hook functions (called by MTCP/DMTCP if defined):
    User code containing these functions must be compiled with
    -Wl,-export-dynamic under gcc/g++, or they won't be invoked.
   void mtcpHookPreCheckpoint(void);
   void mtcpHookPostCheckpoint(void);
   void mtcpHookRestart(void);
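
  As an illustration, the following is a minimal sketch of application code
  that defines all three hooks.  The counters are hypothetical names chosen
  for this example, and the comments about when each hook fires are
  assumptions inferred from the hook names, not a statement of MTCP's exact
  behavior.

```c
/* Hypothetical example state: how many times each hook has been called. */
static int pre_checkpoint_calls = 0;
static int post_checkpoint_calls = 0;
static int restart_calls = 0;

/* Assumed (from the name) to run shortly before a checkpoint is taken. */
void mtcpHookPreCheckpoint(void)
{
    pre_checkpoint_calls++;
}

/* Assumed (from the name) to run when the original process resumes
 * after a checkpoint has been taken. */
void mtcpHookPostCheckpoint(void)
{
    post_checkpoint_calls++;
}

/* Assumed (from the name) to run in a process restarted from a
 * checkpoint image. */
void mtcpHookRestart(void)
{
    restart_calls++;
}
```

  Such code would be linked into the application with the flag named above,
  for example:  gcc -Wl,-export-dynamic app.c -o a.out
  (app.c is a placeholder file name).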

============================================

SHORT NOTES:

    1. A restarted process sees the shared libraries and environment variables
       that existed prior to checkpoint.  These are contained in the .dmtcp
       checkpoint file.
    2. At restart time, one can choose either to use the original
       dmtcp_coordinator or else to start a new coordinator.  Each process
       restarted by the dmtcp_restart command needs to know the host and port
       used by dmtcp_coordinator.  These default to localhost and port 7779.
       The coordinator can be told to use port 0, in which case it
       chooses an arbitrary port and prints it to stdout.
       Setting DMTCP_PORT in the environment seen by the four main commands
       (dmtcp_coordinator, dmtcp_checkpoint, dmtcp_restart and dmtcp_command)
       will override the default port.  Similarly, setting DMTCP_HOST for
       dmtcp_checkpoint and dmtcp_restart is needed if they start on
       a different host than that of the coordinator.
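
       The lookup rule described in note 2 can be sketched as follows.  This
       is not DMTCP's own code: the function names are hypothetical, and
       falling back to the default for an empty or malformed value is an
       assumption made for this sketch.  Only the variable names and the
       defaults (localhost, 7779) come from this document.

```c
#include <stdlib.h>

#define DEFAULT_DMTCP_HOST "localhost"
#define DEFAULT_DMTCP_PORT 7779

/* Return DMTCP_HOST from the environment, or the default if unset/empty. */
const char *coordinator_host(void)
{
    const char *host = getenv("DMTCP_HOST");
    return (host != NULL && host[0] != '\0') ? host : DEFAULT_DMTCP_HOST;
}

/* Return DMTCP_PORT from the environment as an integer, or the default.
 * Assumption of this sketch: values that are not a decimal number in
 * 0..65535 are ignored in favor of the default. */
int coordinator_port(void)
{
    const char *text = getenv("DMTCP_PORT");
    char *end;
    long port;

    if (text == NULL || text[0] == '\0')
        return DEFAULT_DMTCP_PORT;
    port = strtol(text, &end, 10);
    if (*end != '\0' || port < 0 || port > 65535)
        return DEFAULT_DMTCP_PORT;
    return (int)port;
}
```

       With neither variable set, the sketch yields localhost and 7779;
       exporting DMTCP_PORT=9000 before running makes coordinator_port()
       return 9000.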
    3. In order to enable various types of debugging, do:
         A. To enable debug statements for DMTCP only (related to multi-process
            communication), configure with: ./configure --enable-debug
            (or './configure --help', in general)
            --enable-debug both prints to stderr and writes files
            $DMTCP_TMPDIR/jassert.log.XXX,
            where XXX is the pid of a process (checkpoint coordinator or
            application process). In reading this, it's useful to know that
            DMTCP sets up barriers so that all processes proceed to the
            following states together during checkpoint: RUNNING, SUSPENDED,
            LOCKED, DRAINED, CHECKPOINTED, REFILLED.
            - Currently DMTCP will not work with --enable-debug and
              --enable-pid-virtualization. This will be fixed.
         B. To enable debug statements from MTCP (single-process component),
            do: In mtcp/Makefile, uncomment the line:
            CFLAGS = -O0 -g -DDEBUG -DTIMING -Wall
            Also, comment out the line: CFLAGS = -O0 -g
            Then (cd mtcp; make clean; make)
         C. If debugging MTCP from within DMTCP, then:
              a.  uncomment the line:
                    CFLAGS = -DDMTCP=1 -O0 -g -DDEBUG -DTIMING -Wall
              b.  set DUP_STDERR_FD in mtcp/mtcp_printf.c to 826 or 827
                  depending on whether MTCP debug output should go to
                  stdout or $DMTCP_TMPDIR/jassert.log.*
              c.  Then (cd mtcp; make clean; make)
         D. If debugging:  dmtcp_checkpoint a.out
            and you wish to attach to a.out when it starts, then
              a.  Set the environment variable MTCP_INIT_PAUSE;
                  mtcp/mtcp.c:mtcp_init() will then pause for 15 seconds.
              b.  dmtcp_checkpoint a.out &
              c.  gdb a.out `pgrep -n a.out`  [During the 15 second pause.]
              d.  The usual gdb commands should be available to debug
                  a.out/libmtcp.so/dmtcphijack.so
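
       When reading the debug logs, it can help to keep the barrier states
       listed in 3.A in mind as an ordered sequence.  The sketch below models
       that sequence.  The type and function names are hypothetical and are
       not DMTCP identifiers, and the wrap from REFILLED back to RUNNING is
       an assumption of this example.

```c
#include <stddef.h>

/* The checkpoint states from note 3.A, in the order listed there. */
typedef enum {
    CKPT_RUNNING,
    CKPT_SUSPENDED,
    CKPT_LOCKED,
    CKPT_DRAINED,
    CKPT_CHECKPOINTED,
    CKPT_REFILLED,
    CKPT_STATE_COUNT
} ckpt_state;

static const char *const ckpt_state_names[CKPT_STATE_COUNT] = {
    "RUNNING", "SUSPENDED", "LOCKED", "DRAINED", "CHECKPOINTED", "REFILLED"
};

/* Printable name of a state, or "UNKNOWN" if out of range. */
const char *ckpt_state_name(ckpt_state s)
{
    if ((int)s < 0 || (int)s >= CKPT_STATE_COUNT)
        return "UNKNOWN";
    return ckpt_state_names[s];
}

/* State that follows s; assumed to wrap from REFILLED to RUNNING. */
ckpt_state ckpt_next_state(ckpt_state s)
{
    return (ckpt_state)(((int)s + 1) % CKPT_STATE_COUNT);
}

/* A barrier is reached only when every process is in the target state. */
int barrier_reached(const ckpt_state *states, size_t n, ckpt_state target)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (states[i] != target)
            return 0;
    }
    return 1;
}
```

       The point of barrier_reached() is that no process moves on to the
       next state until all processes have arrived at the current one.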
    4. It often works to migrate processes by moving the checkpoint files to
       another host and editing dmtcp_restart_script.sh prior to
       restarting.  Whether it works depends on how much the kernel and
       glibc versions differ between the two hosts.
    5. Checkpoint is implemented by sending a signal to each user thread.
       As with all well-written code, your system calls should be prepared
       for an error return of EINTR (interrupted, due to a simultaneous
       checkpoint invocation or other kernel activity), in which case you
       can call the system call again.
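
       A minimal sketch of the retry pattern from note 5, using the standard
       POSIX read() and write() calls.  The helper names write_all and
       read_retry are chosen for this example only.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes to fd, retrying after EINTR and after short writes.
 * Returns 0 on success, -1 on any other error (errno is left set). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: call again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read up to len bytes from fd, calling read() again if it was interrupted.
 * Returns what read() returns: bytes read, 0 at end of file, -1 on error. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

       Wrapping calls this way keeps a checkpoint signal that arrives during
       a blocking call from being reported to the rest of the program as a
       failure.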
    6. See comment in code before mtcp/mtcp_restart_nolibc.c:readmemoryareas()
       for specific handling of mapping of memory objects via mmap:
         MAP_PRIVATE, MAP_SHARED, MAP_ANONYMOUS
       Heuristically, if a memory area is mapped to a file for which the user
       has only read permission, then a restarted process uses the most recent
       file. If a memory area is mapped to a file with write or execute
       permission, the pre-checkpoint memory contents are copied back into
       the memory area.
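
       The heuristic in note 6 can be written as a small decision function.
       This is only a restatement of the rule above under hypothetical names;
       the actual logic lives in mtcp/mtcp_restart_nolibc.c and may consider
       more cases than this sketch does.

```c
/* Hypothetical names for the two outcomes described in note 6. */
typedef enum {
    USE_CURRENT_FILE,        /* map the file as it exists at restart time */
    RESTORE_SAVED_CONTENTS   /* copy pre-checkpoint contents back in      */
} restore_policy;

/* can_write / can_exec: nonzero if the user has that permission on the
 * file backing the memory area.  A read-only file falls through to
 * USE_CURRENT_FILE. */
restore_policy mapped_file_policy(int can_write, int can_exec)
{
    if (can_write || can_exec)
        return RESTORE_SAVED_CONTENTS;
    return USE_CURRENT_FILE;
}
```

       One possible way to obtain the two arguments is the POSIX access()
       call, e.g. access(path, W_OK) == 0 and access(path, X_OK) == 0.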
    7. If your application has only a single process (single- or
       multi-threaded), then you can also directly use the software in
       the MTCP subdirectory.  You may not need the generality of DMTCP.
    8. For developers, mtcp/readmtcp is useful for debugging checkpoint
       images.  Run it without arguments for a usage message.
    9. dmtcpaware exists for programs that wish to directly talk to the
       dmtcp_coordinator, without the intervention of a human being.
       See the test subdirectory for several example dmtcpaware programs.
    10. bin/gdb-add-symbol-file may be a useful debugging tool.  It computes
	the arguments for the add-symbol-file command of gdb, to import
        symbol information about a dynamic library.  It is most useful in
        combination with *-dbg Linux packages and this prefix to
        dmtcp_checkpoint:
	  env LD_LIBRARY_PATH=/usr/lib/debug dmtcp_checkpoint ...
	followed by 'attach' in gdb.
    11. The particular combination of Ubuntu 8.10 (Intrepid) and
        g++-4.3 with the default -O2 flag appears to have a bug.
        (g++-4.3 decides to inline a certain function, and then fails
        to compile because it decides that the function cannot be inlined.)
        There are two easy workarounds.  First, if make fails, then manually
        re-execute the g++ compilation from the last line output by make,
        but do not include "-O2".  Alternatively, when configuring, use:
          env CXXFLAGS=-O1 ./configure
    12. A. Matlab should be invoked without graphics, and to be extra safe,
	   without the JVM.  The -nodisplay and -nojvm matlab flags suffice:
		bin/dmtcp_checkpoint matlab -nodisplay -nojvm
	B.  Older releases of Matlab (e.g. release 7.4) have several issues:
	    If you see a message about GLIBCXX-3.4 not found, you can either
	    use root privilege to replace matlab's older libstdc++  by your
	    system's newer libstdc++ (back up matlab's older libstdc++),
	    or else just re-configure and re-compile DMTCP as follows:
		env CC=gcc-4.1 CXX=g++-4.1 ./configure
		[ Also, modify mtcp/Makefile to use: CC=gcc-4.1 ]
                make clean; make
	C.  While we believe that with the DMTCP-1.10 release, directly
	    checkpointing the standard matlab script should now be reliable,
	    we are continuing to do stress testing.  If you encounter any
	    unreliability please notify the developers.  As a workaround, you
	    can always revert to the older method of directly running
	    the MATLAB binary.  To run the MATLAB binary, do:
	      bash -x `which matlab` -nodisplay
            On our computer, just before the 'matlab' banner, we see the
            path of MATLAB.  Then, we can invoke it as:
	      dmtcp_checkpoint /opt/matlab/bin/glnxa64/MATLAB -nodisplay -nojvm

    13. How to checkpoint OpenMPI with DMTCP
        Verify that mpirun works.
        Verify dmtcp_{checkpoint,restart} commands are in your path:
          ssh <REMOTE-HOST> dmtcp_checkpoint --help
        If they are not in your path, adjust your shell initialization file
           to extend your path.
        Verify that "ssh <REMOTE-HOST>" works without a password.  Otherwise,
        do the following:
          ssh-keygen -t dsa       [accept default values]
          ssh-keygen -t rsa       [accept default values]
          cat ~/.ssh/id*.pub >> ~/.ssh/authorized_keys

        make clean
        make
        make check

        dmtcp_checkpoint mpirun ./hello_mpi
        dmtcp_command --checkpoint

        ./dmtcp_restart_script.sh

        DMTCP uses SIGUSR2 by default, and so do older versions of OpenMPI.
        If you have an older version (e.g., < 1.3), try choosing a different
        value of SIGNUM for DMTCP as follows:
        dmtcp_checkpoint --mtcp-checkpoint-signal <SIGNUM> mpirun ./hello_mpi

    14. Using DMTCP with X-Windows:
	Note that this method does not work with X extensions like OpenGL.
	If someone wishes to extend this method to OpenGL, we have some
	ideas for an approach that we can share.  Also, this method does
	not currently successfully checkpoint an xterm, for reasons that
	we do not fully understand.  We will look further into this later
	when time and resources permit.

        Install TightVNC (either as a package from your Linux distro,
        or from:  http://www.tightvnc.com/ ),
        with installation instructions at:
	  http://www.tightvnc.com/doc/unix/README.txt

        If the server fails to start, you may need to specify the location of
        fonts on your system. Do this by editing the "vncserver" Perl script
	(which you put in your path above).  Modify the $fontPath variable
	to point to your font directories. For example, I listed all of the
	subdirectories of /usr/share/fonts/ in the fontPath.

        The processes started up automatically by the VNC server are listed in
        the ~/.vnc/xstartup file. Use the following as your ~/.vnc/xstartup,
        where we use the blackbox window manager and an x_app application
	as an example:
           #!/bin/csh
           blackbox &
           x_app

        You should test that you can use the vncserver and vncviewer now.
        This example uses desktop number 1:
           vncserver :1
           vncviewer localhost:1
           # Kill vncviewer window manually, and then:
           vncserver -kill :1

        Make sure the executables dmtcp_checkpoint, dmtcp_coordinator,
        dmtcp_restart, and dmtcp_command are in your path.

        Note that if the VNC server is killed without using "vncserver -kill",
        there will be some temporary files left over that prevent the server
        from restarting.  If this occurs, remove them:
           rm -rf /tmp/.X1-lock /tmp/.X11-unix/X1
        where X1 corresponds to starting the server on desktop number 1.

        Now, start the VNC server under checkpointing control:
           dmtcp_checkpoint vncserver :1

        Use the VNC viewer to view your x_app application in the blackbox
	window manager:
           vncviewer localhost:1

        Before checkpointing, close any xterm windows.  Also, close
	the vncviewer itself.  They can be reopened again after
        the checkpoint has completed.

        [Optional] To verify that vncserver is running under checkpoint control:
           dmtcp_command h
           dmtcp_command s

        To checkpoint the VNC server, x_app, and any other processes running
	under the VNC server, remove any old checkpoint files, and type:
           rm -f ckpt_*.dmtcp
           dmtcp_command --checkpoint

        This creates a file of the form ckpt_*.dmtcp for each process being
        checkpointed.  To kill the VNC server and restart,
        use the restart script:
           vncserver -kill :1
           # This script assumes dmtcp_restart is in your path.  If not,
           #  modify the script to replace dmtcp_restart by a full path to it.
           ./dmtcp_restart_script.sh

        Alternatively, you may prefer to directly use the dmtcp_restart command:
           vncserver -kill :1
           dmtcp_restart ckpt_*.dmtcp

        Note: if checkpointing doesn't fully complete, make sure you're not out
        of disk space, and that there are no other file system problems.

    15. For Mandriva Linux, and some others, you will need to ensure that the
          packages 'patch', 'glibc-static', and 'linux-userspace-headers'
          are installed in order to build DMTCP.
        For Ubuntu/Debian Linux, ensure that the
          packages 'patch' and 'linux-libc-dev' are installed.
        For OpenSuse Linux, ensure that the
          packages 'patch' and 'linux-kernel-headers' are installed.

    16. By default in DMTCP, successive checkpoints of the same process
        write to the same checkpoint image filename.  If you prefer that
        successive checkpoints be written to distinct filenames, then use:
	  ./configure --enable-unique-checkpoint-filenames

    17. DMTCP continues to use the original pid (process id), tid (thread id),
        etc., even on restart from a checkpoint image.  After a restart,
        your program will see the _original pid_, and not the
	current pid.  At this time, a restarted process will appear
	within the "ps" command under the program name "mtcp_restart".
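
        One consequence of note 17 is that a pid cached by the application
        before a checkpoint should remain valid after a restart.  The sketch
        below uses hypothetical helper names to show the kind of check an
        application might make; it relies only on the standard getpid() call.

```c
#include <sys/types.h>
#include <unistd.h>

/* Pid recorded by the application at startup (-1 until recorded). */
static pid_t startup_pid = -1;

/* Record the pid as the program sees it now. */
void remember_pid(void)
{
    startup_pid = getpid();
}

/* Nonzero if getpid() still returns the recorded pid.  Per note 17 this
 * is expected to hold even after a restart under DMTCP, because the
 * program continues to see the original pid. */
int pid_matches_startup(void)
{
    return startup_pid != -1 && getpid() == startup_pid;
}
```

        By contrast, tools outside the computation, such as "ps", report the
        current pid of the restarted process.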

    18. Certain applications, such as some shells, vim, etc., try to
        recognize mouse events from the X11 window system.  While DMTCP
	successfully restarts these applications, it does not currently
	restore the connection to X11.  Mouse events are no longer recognized
	after restart.
