SafeDrive How-To

SafeDrive is a research prototype that detects and recovers from memory-safety errors in Linux drivers, with the help of a custom source-to-source compiler called Deputy and a small kernel patch. This file covers the first steps for trying the SafeDrive prototype for OSDI '06.

If you have any questions, please contact Feng Zhou.

Contents

Setup
Using SafeDrive-enabled drivers
Add SafeDrive support for more drivers

Setup

You need the following files to set up SafeDrive.

  1. The Linux 2.6.15.5 kernel.
  2. Objective Caml 3.09 or above.
  3. safedrive-060912 snapshot.
  4. deputy-060912 snapshot.
  5. (Optional) ctswifi-060912 (our compile-time fault injection tool) snapshot.

Here are the steps to set up SafeDrive and its dependencies.

  1. Install OCaml following its instructions.
  2. Extract the Linux kernel source, safedrive, deputy and ctswifi. Below we assume they are extracted to $BASE/linux, $BASE/safedrive, $BASE/deputy and $BASE/ctswifi respectively.
  3. Apply the ctswifi patch to Deputy if you want to try fault injection:
    cd $BASE/deputy
    patch -p1 < ../ctswifi/ctswifi-deputy.patch
    # make a link to ctswifi code in deputy dir
    ln -s ../ctswifi ctswifi
    
  4. Configure and compile Deputy.
    cd $BASE/deputy
    # configure Deputy with linux kernel support
    # note that the path after 'with-linux' has to be a full absolute path
    ./configure --with-linux=$BASE/linux
    make
    # now add $BASE/deputy/bin to your PATH, e.g. in a Bourne-style shell:
    export PATH=$BASE/deputy/bin:$PATH
    
  5. Apply the SafeDrive kernel patches (including the required KGDB patches):
    cd $BASE/linux
    for f in `cat ../safedrive/patches/series`; do patch -p1 < ../safedrive/patches/$f; done
    
  6. Configure the kernel and build.
    make menuconfig
    ( now choose the config for your machine
      turn on "Device Drivers -> Generic Driver Options -> Failure Recovery for Deputy-enabled drivers" (default on)
      optionally turn on "Kernel hacking -> KGDB: ..." if you want to debug your kernel using KGDB.)
    make
    
    Now install the new kernel.

All necessary files are now set up. You can go on to compile some SafeDrive-enabled drivers!

Using SafeDrive-enabled drivers

The safedrive snapshot comes with a few drivers already patched to use SafeDrive. Let's try e1000 first.
 
cd $BASE/safedrive/e1000
export KERNELDIR=$BASE/linux
make

If everything goes well, the compiled module will be e1000_kr.ko. With the machine running the newly compiled kernel, you can load the module as follows (assuming a Red Hat/Fedora Core system):

# Remove existing driver module
rmmod e1000
insmod e1000_kr.ko
/etc/init.d/network start

The driver should now work normally. There are a couple of ways to trigger failures in the driver so that you can test the recovery functionality of SafeDrive. First, this driver has already been modified so that setting its debug message level to 1000 triggers a fault (by calling kr_trigger_fault()). To try it:

echo "r 1" > /proc/krecover    # turn on safedrive recovery
ethtool -s eth0 msglvl 1000    # triggers recovery
(the driver now triggers a failure and is unloaded by SafeDrive)
tail /var/log/messages         # shows log messages related to this recovery

The current recovery implementation completely unloads the driver after a failure, so you need to insmod the driver again once a failure happens. Recovery is also turned off automatically after a driver fails, so you need to run echo "r 1" > /proc/krecover again before triggering another failure. cat /proc/krecover shows some statistics about SafeDrive.

The method above does not actually do anything bad in the driver. For more realistic experiments (as done in the paper), you can recompile the driver with injected faults, e.g. with certain out-of-bounds accesses.

cd $BASE/safedrive/e1000
make clean
make FAULT=scanoverrun SEED=1

Then you can load the recompiled driver and exercise it to see whether any recovery is triggered. It may or may not fail, depending on whether the injected fault actually causes a memory-safety problem. You can see the list of faults that can be injected with deputy --help (in the help for --faultscan).

Another way to trigger failures is to make the driver fail after a fixed number of Deputy checks:

echo 'r 1' > /proc/krecover    # turn on recovery due to Deputy assertion failures
echo 'q 10000' > /proc/krecover # trigger fault after 10000 checks
cat /proc/krecover             # will show how many checks have been counted

The last fault injection method is to fault on a specific code address. This is done through binary modification (using kprobe).

echo 'r 1' > /proc/krecover
echo 'f _e1000_set_msglevel 0' > /proc/krecover    # fault on entrance to _e1000_set_msglevel()
ethtool -s eth0 msglvl 10      # This should trigger the fault

Add SafeDrive support for more drivers

Roughly speaking, four steps are needed to add SafeDrive support to a driver:
  1. Set up the build process
  2. Add necessary Deputy annotations to the driver and any related kernel header files
  3. Add update tracking wrappers to kernel header files
  4. Add entrance stubs to the driver.

Set up the build process. You can use $BASE/safedrive/e1000/Makefile as a template for the driver makefile. The main changes from a normal driver makefile are:
  1. Use deputy as a substitute for gcc
  2. Use kernel header files from $BASE/deputy/include
  3. Compile with -DKRECOVER to turn on SafeDrive

Add Deputy annotations. After setting up the makefile, run make; there will probably be Deputy compilation errors. Until a detailed Deputy tutorial is available, you can read the OSDI '06 paper for Deputy concepts and refer to the Deputy Quick Reference in $BASE/deputy/doc/quickref.html for a list of Deputy annotations you can use to resolve these errors.

Sometimes kernel headers need to be changed. These are handled by first copying the corresponding header file to $BASE/safedrive/include/... if the file is not already there, and then modifying the header there.

Add update tracking wrappers. Kernel updates that should be undone at recovery time must be tracked. This can often be done by wrapping the kernel API calls in the headers. Many updates are already tracked. If you need more (knowing what is needed requires an understanding of how the kernel works!), copy the related header file to $BASE/safedrive/include/... and modify it there:

  1. Define a new resource type constant at the beginning of $KERNELDIR/include/linux/krecover.h.
  2. Add kr_add_object() and kr_remove_object() calls to the corresponding functions in the headers under $BASE/safedrive/include. A typical wrapper takes the following form:
    inline static int kr_pci_enable_device(struct pci_dev *dev) {
            int r = pci_enable_device(dev);
            if (!r)
                    kr_add_object(KR_PCI_ENABLE_DEVICE, dev, 0);
            return r;
    }
    #undef pci_enable_device
    #define pci_enable_device(x) kr_pci_enable_device(x)
    
    This way, the code calling pci_enable_device() is actually calling kr_pci_enable_device(), which tracks the update when it is successful.
  3. Add the corresponding compensation function to kr_release() in $KERNELDIR/drivers/base/krecover.c. This is used to free the resource at recovery time.
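
The three steps above can be sketched as a small user-space toy model. This is not the SafeDrive kernel code: only the names kr_add_object, kr_remove_object, kr_release, KR_PCI_ENABLE_DEVICE and the shape of kr_pci_enable_device come from this document. The signatures, the fixed-size log, the stand-in pci_* functions, and the helpers kr_pci_disable_device, kr_recover_all and toy_driver_init are assumptions made for illustration.

```c
/* User-space toy model of update tracking; NOT the SafeDrive kernel code.
 * Signatures and bodies are assumptions made for illustration. */

enum { KR_PCI_ENABLE_DEVICE = 1 };   /* step 1: resource type (value assumed) */

struct pci_dev { int enabled; };     /* stand-in for the kernel structure */

/* Stand-ins for the kernel API calls being wrapped. */
int  pci_enable_device(struct pci_dev *dev)  { dev->enabled = 1; return 0; }
void pci_disable_device(struct pci_dev *dev) { dev->enabled = 0; }

/* A fixed-size log of tracked updates (assumed representation). */
#define KR_MAX_OBJECTS 16
struct kr_object { int type; void *obj; long arg; };
struct kr_object kr_log[KR_MAX_OBJECTS];
int kr_count;

void kr_add_object(int type, void *obj, long arg) {
    if (kr_count < KR_MAX_OBJECTS)
        kr_log[kr_count++] = (struct kr_object){ type, obj, arg };
}

void kr_remove_object(int type, void *obj) {
    for (int i = kr_count - 1; i >= 0; i--) {
        if (kr_log[i].type == type && kr_log[i].obj == obj) {
            /* Close the gap but keep the order of the remaining entries. */
            for (int j = i; j < kr_count - 1; j++)
                kr_log[j] = kr_log[j + 1];
            kr_count--;
            return;
        }
    }
}

/* Step 3: compensation, one case per resource type. */
void kr_release(const struct kr_object *o) {
    switch (o->type) {
    case KR_PCI_ENABLE_DEVICE:
        pci_disable_device((struct pci_dev *)o->obj);
        break;
    }
}

/* Hypothetical helper: undo all tracked updates, newest first. */
void kr_recover_all(void) {
    while (kr_count > 0)
        kr_release(&kr_log[--kr_count]);
}

/* Step 2: wrappers that record and forget the update. */
int kr_pci_enable_device(struct pci_dev *dev) {
    int r = pci_enable_device(dev);
    if (!r)
        kr_add_object(KR_PCI_ENABLE_DEVICE, dev, 0);
    return r;
}

void kr_pci_disable_device(struct pci_dev *dev) {
    kr_remove_object(KR_PCI_ENABLE_DEVICE, dev);
    pci_disable_device(dev);
}

/* Redirect existing call sites to the tracking wrapper. */
#define pci_enable_device(x) kr_pci_enable_device(x)

/* Unmodified "driver" code: this call is now tracked. */
int toy_driver_init(struct pci_dev *dev) {
    return pci_enable_device(dev);
}
```

In this model, calling toy_driver_init() and then kr_recover_all() leaves the device disabled and the log empty, while a device that the driver disabled itself through kr_pci_disable_device() is no longer in the log and is not touched at recovery time.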

Add entrance wrappers to the driver. Add kr_enter_driver(failure_return_value) and kr_exit_driver() to the beginning and end of each interface function of the driver, i.e. each function called directly by the kernel. Admittedly this is tedious; hopefully we will have a tool to generate these automatically. The template to use is (note the TRUSTEDBLOCK annotation on the wrapper function):

/* original function */
static int 
_e1000_probe(struct pci_dev *pdev,
            const struct pci_device_id *ent) {
        ...
}
/* entrance wrapper function */
static int 
e1000_probe(struct pci_dev *pdev,
            const struct pci_device_id *ent)
{ TRUSTEDBLOCK
        int r;
        kr_enter_driver_rec(-EIO);
        r = _e1000_probe(pdev, ent);
        kr_exit_driver();
        return r;
}
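
To make the role of failure_return_value concrete, here is a user-space toy model of the wrapper pattern above. How SafeDrive actually unwinds a failed driver call is not described in this document. The sketch assumes that a fault abandons the driver function and makes the wrapper return the failure value, and it models that with setjmp/longjmp. The macro bodies, kr_in_driver, toy_probe and toy_probe_impl are hypothetical; only the names kr_enter_driver, kr_exit_driver and kr_trigger_fault come from the text.

```c
/* User-space toy model of an entrance wrapper; NOT the SafeDrive code.
 * Assumption: a fault unwinds to the wrapper, which then returns the
 * failure value. The unwinding is modelled with setjmp/longjmp. */
#include <errno.h>
#include <setjmp.h>

jmp_buf kr_env;        /* saved context of the current driver entry */
int kr_in_driver;      /* nonzero while executing inside the driver */

/* These must be macros: setjmp() has to be called in the wrapper's own
 * stack frame, and the failure path returns from the wrapper itself. */
#define kr_enter_driver(failure_return_value)   \
    do {                                        \
        if (setjmp(kr_env) != 0) {              \
            kr_in_driver = 0;                   \
            return (failure_return_value);      \
        }                                       \
        kr_in_driver = 1;                       \
    } while (0)

#define kr_exit_driver() do { kr_in_driver = 0; } while (0)

/* Jump back to the wrapper if a driver call is in progress. */
void kr_trigger_fault(void) {
    if (kr_in_driver)
        longjmp(kr_env, 1);
}

/* Original function (plays the role of _e1000_probe above). */
int toy_probe_impl(int should_fail) {
    if (should_fail)
        kr_trigger_fault();    /* does not return while in the driver */
    return 0;
}

/* Entrance wrapper (plays the role of e1000_probe above). */
int toy_probe(int should_fail) {
    int r;
    kr_enter_driver(-EIO);
    r = toy_probe_impl(should_fail);
    kr_exit_driver();
    return r;
}
```

Under these assumptions, toy_probe(0) returns 0 as usual, while toy_probe(1) returns -EIO to its caller without the rest of toy_probe_impl running. This is why each wrapper needs a failure value that its kernel caller already knows how to handle.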


Last modified: $Id: safedrive_howto.html,v 1.2 2006-09-13 05:55:30 zf Exp $