Tuesday, September 15, 2009

Quick, Automated, Incremental backup with rsync and ssh.

Working in the multimedia field, I live and die by data management. It's easy to be a little too quick with the delete key, and even the best data organization strategy is worth nothing when a drive fails. This is why I set out in search of a reliable way to back up my data incrementally.

A few months back I posted a little app I wrote, using rsync and ssh, that backs up all my portable media overnight whenever it is left plugged in. I've since found that data accumulates very quickly without some sort of incremental organization.

Fortunately, my backup server uses a file system (ext3) that supports hard links. Hard links make incremental backups very simple to manage, because a hard-linked copy of a file is just another name for the same data and occupies no additional space. After a little research, I found that with a little rsync and ssh magic, a very simple and efficient backup archive can be built.
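To make the hard link idea concrete, here is a small sketch you can run on any Linux box. The file names and the temporary directory are made up for the demonstration and have nothing to do with the real backup paths.

```shell
#!/bin/sh
# Demonstration only: work in a throwaway directory (hypothetical names).
demo=$(mktemp -d)
echo "some media file" > "$demo/original.txt"

# Create a second name for the same data. No file contents are copied.
ln "$demo/original.txt" "$demo/linked.txt"

# Both names report the same inode number, so they share one copy on disk.
inode_a=$(ls -i "$demo/original.txt" | awk '{print $1}')
inode_b=$(ls -i "$demo/linked.txt" | awk '{print $1}')
echo "inodes: $inode_a $inode_b"

# Removing one name does not remove the data while another name remains.
rm "$demo/original.txt"
remaining=$(cat "$demo/linked.txt")
echo "still readable: $remaining"

# Clean up the throwaway directory.
rm -rf "$demo"
```

The data is only freed once the last name pointing at it is removed, which is exactly the property the backup scheme relies on.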

First, we have to back something up remotely. I place the most recent backup copy in a "current" folder. Notice below that the --delete switch is used in the rsync command; this removes files from the backup that no longer exist in the source. That may seem like a bad thing, but I'll show you how it is actually good and helps in the incremental backup scheme.

#Backup remotely using rsync and ssh
rsync -avx --delete /home/Dana/ server.home:/backup/dir/current -e ssh


Now, if all we did was run this command, at best we would end up with a direct copy of our data, which isn't really much of a backup. But if we then make a hard-linked copy of the 'current' folder, named after the date the backup was made, we quickly get an incremental backup of our data. Because each dated folder keeps its own links, a file that --delete later removes from 'current' still survives in the older dated folders. This copy can be made directly after the rsync job has finished.

#copy and rename the current folder
ssh server.home cp -rl /backup/dir/current /backup/dir/`date +%Y-%m-%d`
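The two commands can be chained so that the snapshot is only taken when the sync succeeds. What follows is a minimal sketch, not a finished tool: the `run` helper and the `DRY_RUN` switch are my own additions for illustration, and with `DRY_RUN=1` the script only prints the commands it would execute.

```shell
#!/bin/sh
# Paths and host taken from the commands above.
SRC="/home/Dana/"
HOST="server.home"
DEST="/backup/dir"
DRY_RUN=1   # hypothetical switch: change to 0 to really run the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

backup() {
    # The date is evaluated on the local machine, as in the original command.
    stamp=$(date +%Y-%m-%d)
    # Only take the snapshot if the sync finished without error.
    run rsync -avx --delete -e ssh "$SRC" "$HOST:$DEST/current" &&
    run ssh "$HOST" cp -rl "$DEST/current" "$DEST/$stamp"
}

plan=$(backup)
echo "$plan"
```

One thing to watch: if this runs twice on the same day, cp finds the dated folder already present and copies 'current' inside it, so a real script would probably want to check for that first.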


The -r switch copies directories recursively, and the -l switch makes hard links instead of copying file data, which means that each new snapshot occupies almost no additional disk space.
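It may not be obvious why a later rsync run doesn't also change the files inside the older snapshots, given that they share data with 'current'. As far as I understand it, rsync by default (that is, without --inplace) writes an updated file to a temporary name and then renames it into place, which gives 'current' a fresh file and leaves the snapshot's link pointing at the old data. The sketch below simulates that locally with made-up file names, using mv and rm to stand in for what rsync and --delete do.

```shell
#!/bin/sh
# Local simulation only: $root stands in for /backup/dir (hypothetical files).
root=$(mktemp -d)
mkdir "$root/current"
echo "version 1" > "$root/current/notes.txt"
echo "scratch" > "$root/current/old.txt"

# Take a snapshot the same way as above: a hard-linked copy.
cp -rl "$root/current" "$root/2009-09-14"

# Stand-in for rsync updating a changed file: write a temp file, rename over.
echo "version 2" > "$root/current/.notes.txt.tmp"
mv "$root/current/.notes.txt.tmp" "$root/current/notes.txt"

# Stand-in for --delete removing a file that no longer exists at the source.
rm "$root/current/old.txt"

# The snapshot still holds the old version and the deleted file.
snap_notes=$(cat "$root/2009-09-14/notes.txt")
cur_notes=$(cat "$root/current/notes.txt")
snap_old=$(cat "$root/2009-09-14/old.txt")
echo "snapshot: $snap_notes / current: $cur_notes / recovered: $snap_old"
```

Only the changed file takes up new space; everything untouched is still shared between 'current' and the snapshot.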

Since bash, python and ssh come installed by default on OS X, this method of backup can also work wonders for the Mac crowd. And for Windows