Secure, automated backups

After a fair amount of trial and error, I have finally found a way to back up files from one system to a remote system over the Internet, securely and automatically.

The end result is a cron job that uses rsync to synchronize all files in a directory tree between the source and destination systems. You could, for example, back up / from the source system to /backups/system1/ on the target system, in effect creating a full copy of the source system under a subdirectory on the target system. In fact, you could have multiple systems perform a full backup to the "master," each with its own subdirectory.
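To make the one-subdirectory-per-system layout concrete, here is a minimal sketch in plain POSIX sh (not the csh used for the scripts in this post). The function names, the /backups root, and master.example.com are my own placeholders for illustration. It only builds and prints the command; it does not run rsync.

```shell
#!/bin/sh
# Hypothetical helper: map a source host name to its own subdirectory
# under a shared backup root on the master (default root: /backups).
backup_dest() {
    host=$1
    root=${2:-/backups}
    # Strip one trailing slash from the root so we never print "//"
    printf '%s/%s/\n' "${root%/}" "$host"
}

# Print (without running) the rsync command a given source host would use
# to copy / to its own subdirectory on the master.
show_rsync_cmd() {
    printf 'rsync -e ssh -avz / root@%s:%s\n' "$2" "$(backup_dest "$1")"
}
```

For example, show_rsync_cmd system1 master.example.com prints a command that targets /backups/system1/ on the master, so a second machine named system2 would land in /backups/system2/ without touching the first.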

The nice part is that all files are accessible in their "native" format (vs. being stuffed deep inside an archive of some sort), and retain their ownership, permissions, etc.

OK, let's get started. First, a bit of setup: create a spot for all our keys, scripts and temp files to live in:

mkdir /root/securebackup
cd /root/securebackup

The next step is to create a new SSH key pair. I recommend using these keys for the unattended copy only (don't re-use them for interactive logins) as it's more secure that way:

insurgent:~>ssh-keygen -t rsa -b 2048 -f /root/securebackup/unattended.key
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase): [ ENTER PASSPHRASE ]
Enter same passphrase again: [ CONFIRM PASSPHRASE ]
Your identification has been saved in /root/securebackup/unattended.key.
Your public key has been saved in /root/securebackup/unattended.key.pub.
The key fingerprint is:
f0:ba:77:39:1f:6f:b1:76:a3:e6:cd:8f:d3:ad:18:21 mi[XXXX].ca

Now at this point you don't have to enter a passphrase for your new key if you don't want to. It makes life simpler, but less secure. I've taken the harder, more secure route, and assigned a strong passphrase to the key.

On the target machine, edit ~/.ssh/authorized_keys (older OpenSSH installations may use ~/.ssh/authorized_keys2) for the account you will log in as, and add a line containing the public key from /root/securebackup/unattended.key.pub on the source system. This allows the target system to accept SSH sessions from the source system, using the newly generated key for authentication.
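Since this key exists only for the unattended copy, one option is to add it with a few restrictions in front. The sketch below is plain POSIX sh, and the function name is my own invention. The four options are standard OpenSSH authorized_keys options that switch off terminal allocation and forwarding, none of which rsync should need. Treat it as one possible approach, not a required step.

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file,
# prefixed with options that disable features a backup key never needs.
add_restricted_key() {
    pubkey_file=$1
    authkeys_file=$2
    opts='no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding'

    # Create the file if it is missing, readable by its owner only
    ( umask 077 && touch "$authkeys_file" ) || return 1

    # Read the public key; give up if the file cannot be read
    key=$(cat "$pubkey_file") || return 1

    # Do nothing if this exact key is already listed
    if grep -qF -- "$key" "$authkeys_file"; then
        return 0
    fi

    printf '%s %s\n' "$opts" "$key" >> "$authkeys_file"
}
```

You would copy unattended.key.pub to the target first, then run something like add_restricted_key unattended.key.pub ~/.ssh/authorized_keys there. Running it twice leaves a single entry.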

Now we're ready for the good stuff. In the next step, we create a simple shell script to ensure that the ssh-agent is running, and feed it the newly created key. The agent always runs in the background, and provides keys to any fledgling SSH sessions. This will allow us to have unattended SSH sessions, since the agent will be running even when nobody is logged in. Save this script as /root/securebackup/start-agent:

#!/bin/csh

# Location of temp file to store ssh-agent environment settings
set agentinfo="/root/securebackup/agent-info"

# Location of SSH key file
set key="/root/securebackup/unattended.key"

# Force newly created files to have decent permissions
umask 077

# Is the ssh-agent already running?
set isrunning=`ps -aef | grep ssh-agent | grep -v grep | wc -l`

# If so, just quit
if ( $isrunning > 0 ) then
echo Agent is already running...
exit
endif

# Otherwise, run it, and save the output to a temp script
# (-c forces csh-style output, so "source" below can read it)
/usr/bin/ssh-agent -c > $agentinfo

# Now read in the variables set in the temp script
source $agentinfo

# Add the key to the agent (prompts for the passphrase, if one was set)
/usr/bin/ssh-add $key

Ensure that root owns the script and that its permissions are 700 (chown root:root /root/securebackup/start-agent ; chmod 700 /root/securebackup/start-agent). If you set a passphrase above, you'll be prompted for it when you run the start-agent script. Note that the script will not start a second agent if one is already running. You should run this script after every system reboot. If you have it run on startup, your system boot may hang until you enter your passphrase, so it's best to start it manually.
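Because the agent has to be restarted by hand after a reboot, it can be handy to check whether the agent recorded in agent-info is still around. Here is a rough sketch in POSIX sh, with function names of my own choosing. It reads the PID out of the saved file (in either csh or sh output format) and tests it with kill -0. That only shows that some process with that PID exists, not that it is definitely ssh-agent.

```shell
#!/bin/sh
# Extract the agent PID recorded in an agent-info file. Handles both the
# csh form ("setenv SSH_AGENT_PID 1234;") and the sh form
# ("SSH_AGENT_PID=1234; export SSH_AGENT_PID;").
agent_pid_from_info() {
    sed -n 's/.*SSH_AGENT_PID[= ]\([0-9][0-9]*\);.*/\1/p' "$1" 2>/dev/null | head -n 1
}

# Succeed only if a PID was found and a process with that PID exists.
agent_is_alive() {
    pid=$(agent_pid_from_info "$1")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

A backup script could call agent_is_alive /root/securebackup/agent-info first and bail out with a clear message if it fails, instead of letting rsync stall on an authentication prompt.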

Ok, we're almost done. Now that the keys are generated and distributed, and the agent is started and has been loaded with the key, we need to actually back up the files. Here is the backup script itself, which you'll want to save as /root/securebackup/rsync-backup:

#!/bin/csh

# Set the destination host name
set destination="hostname.com"

# Set the current system's hostname
set hostname=`hostname`

# Set the username used to log in on the target system
set username="root"

# Set the file path to back up on the source system (could be / to back up everything)
set sourcepath="/etc/"

# Where will the files in $sourcepath above be saved on the target system?
set destpath="/sysbackup/$hostname/etc/"

# Location of temp file which contains ssh-agent environment settings (must be same as above)
set agentinfo="/root/securebackup/agent-info"

# File to record last backup date
set backupdate="/root/securebackup/backup-date"

# Load the unattended agent info
source $agentinfo

# Did the user pass in their own hostname? If so, use that instead.
if ( "$1" != "" ) then
        set destination=$1
endif

# Display the last backup timestamp
echo "Last backup date:"
/bin/cat $backupdate

# Now it's time to back up the files
/usr/bin/rsync -e ssh -avz --stats --delete-after --exclude-from=/root/securebackup/exclude-files ${sourcepath} ${username}@${destination}:${destpath}

# Save the current timestamp
/bin/date > $backupdate

exit

Once again, we want to ensure it's owned by root and has secure permissions (chown root:root /root/securebackup/rsync-backup ; chmod 700 /root/securebackup/rsync-backup).
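One thing to be aware of: the script above writes the backup-date file whether or not rsync succeeded. If you would rather have the timestamp reflect the last good run, a wrapper along these lines could help. This is a sketch in POSIX sh rather than csh, and run_and_stamp is a name I made up for illustration.

```shell
#!/bin/sh
# Run a command and record the completion time only if it succeeded,
# so the stamp file always reflects the last successful run.
run_and_stamp() {
    stampfile=$1
    shift
    if "$@"; then
        date > "$stampfile"
    else
        # $? here is still the exit status of the failed command
        status=$?
        echo "backup command failed with status $status; timestamp not updated" >&2
        return "$status"
    fi
}
```

Usage would look something like run_and_stamp /root/securebackup/backup-date /usr/bin/rsync -e ssh -avz ... with the same arguments as in the backup script. A non-zero rsync exit status is passed through, which also makes failures easier to spot in the cron mail.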

Note the "--exclude-from" line in the rsync command. That allows us to skip files that we don't want to copy over, for whatever reason. For example, /root/securebackup/exclude-files may contain something like this, if you're backing up /:

dev/
proc/
tmp/
var/lock/
var/run/
core

See the rsync man page for more details. You'll want to either remove the --exclude-from reference or at least "touch /root/securebackup/exclude-files" if you're not going to use it; otherwise rsync will complain.
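Both ways of handling a missing exclude file can be scripted. The following POSIX sh sketch (function names are mine, purely illustrative) either creates an empty file when none exists, or prints the --exclude-from option only when the file is actually readable.

```shell
#!/bin/sh
# Make sure the exclude file exists. An existing file is left untouched;
# a missing one is created empty, accessible by its owner only.
ensure_exclude_file() {
    file=$1
    [ -e "$file" ] && return 0
    ( umask 077 && : > "$file" )
}

# Print the rsync option for the exclude file, or nothing at all if the
# file is missing or unreadable.
exclude_arg() {
    if [ -r "$1" ]; then
        printf '%s\n' "--exclude-from=$1"
    fi
}
```

In a Bourne-style script, you could then write rsync -avz $(exclude_arg /root/securebackup/exclude-files) followed by the source and destination. The substitution is left unquoted on purpose so that an empty result simply disappears, which assumes the path contains no spaces.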

OK, just one step left now, which is to add a crontab entry to make the rsync-backup script run daily. To have it run at 3:00 AM (presumably when the server is relatively idle), use the following entry:

00 03 * * * /root/securebackup/rsync-backup

That's it, you're done! You should get a daily e-mail from the cron subsystem detailing which files were synchronized. You'll want to test each component, of course; it's a somewhat complicated system, so there are a number of places where things can go wrong. If you hit a snag, just leave a comment and I'll do my best to respond.

Enjoy!

Epilogue

Here are some additional backup notes that didn't fit anywhere else.

Differential Backup

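# Time of the previous backup, as saved by the last run (Bourne shell syntax)
LastBackupTime=`cat /Backup_Dir/Last.txt`
# Archive only files changed since then ($DIRS lists the directories to back up)
tar -zcvf /Backup_Dir/Date.x.tgz --newer "$LastBackupTime" $DIRS
# Record the current time for the next run
date '+%Y-%m-%d %H:%M:%S' > /Backup_Dir/Last.txt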

Partition Backup

dd if=/dev/hda1 of=/dev/hdb1 bs=1024
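A raw copy like this gives no feedback on whether the result matches the source, so a comparison afterwards can be reassuring. Here is a small POSIX sh sketch. The function name is my own, and it assumes the source and destination are the same size and that the source is not being written to during the copy. It works the same way on ordinary files, which is a safe way to try it out.

```shell
#!/bin/sh
# Copy src to dst in 1024-byte blocks, then compare the two byte for byte.
# Returns non-zero if either the copy or the comparison fails.
clone_and_verify() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=1024 2>/dev/null || return 1
    cmp -s "$src" "$dst"
}
```

For the partitions above, that would be clone_and_verify /dev/hda1 /dev/hdb1. As with the dd command itself, whatever was on the destination is overwritten.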