Offsite Backup with Bacula

Intro

Into the category of “What useful thing can I possibly do with a Raspberry Pi?” falls the idea of organizing a backup server using the Bacula backup software.

Details

I use an old USB disk and a USB hub as external storage for the Bacula volume data and for the catalog (stored in PostgreSQL).

To fulfill the off-site requirement, the jobs are copied with a ‘Migration Job’ to an external FTP server. Encryption is simply done with the ‘openssl’ command line tool.
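
As a rough sketch of the encryption step (the volume name is only an example; the password file path is the one used by the migration script shown below):

openssl enc -aes-256-cbc -salt \
    -pass file:/etc/bacula/private/pwd \
    -in ExternalStorage0001 -out ExternalStorage0001.enc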

So, we go with two pool definitions:

# File Pool definition
Pool {
  Name = File
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 week
  Maximum Volume Bytes = 512M
  Volume Use Duration = 23 hours
  Use Volume Once = yes
  Maximum Volumes = 9999
  LabelFormat = "Vol"
  Storage = File
  Next Pool = External
  Action On Purge = Truncate
}

# External Pool on FTP server
Pool {
  Name = External
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 2 months
  Maximum Volume Bytes = 512M
  Use Volume Once = yes
  Maximum Volumes = 9999  
  LabelFormat = "ExternalStorage"
  Storage = External 
  Action On Purge = Truncate
}
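
Whether the volumes end up in the right pool can be checked from bconsole; assuming bconsole is configured on the same machine, these one-liners list the contents of both pools:

echo "list volumes pool=File" | bconsole
echo "list volumes pool=External" | bconsole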

The storage daemon stores the files in two locations:

Device {
  Name = File
  Media Type = File
  Archive Device = /data/work/bacula/files
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}

Device {
  Name = External
  Media Type = File
  Archive Device = /data/work/bacula/spool
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}
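
Both archive directories have to exist and be writable by the storage daemon; assuming the daemon runs as the usual ‘bacula’ user, something like this prepares them:

mkdir -p /data/work/bacula/files /data/work/bacula/spool
chown bacula:bacula /data/work/bacula/files /data/work/bacula/spool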

All jobs write to the File pool. At the end of every day a migration job picks up those volumes and runs a script that handles encryption and the FTP transfer:

Job {
  Name = "MigrationJob"
  Type = Migrate
  Level = Full
  Client = myserver-fd
  Schedule = "MigrationAfterBackup"
  FileSet = "Full Set"
  Messages = Standard
  Pool = File
  Maximum Concurrent Jobs = 1
  Selection Type = Volume
  Selection Pattern = "Vol.*"
  RunScript {
    Command = "/etc/bacula/scripts/ExternationMigration.sh"
    RunsWhen = After
    RunsOnClient = no
    RunsOnSuccess = yes
    RunsOnFailure = no
  }
}
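
For testing, the migration can also be kicked off manually from bconsole (again assuming a default bconsole setup):

echo "run job=MigrationJob yes" | bconsole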

The script itself looks as follows:

#!/bin/sh

case "$0" in
	/*)
		base=`dirname $0`
		;;

	*)
		base=`pwd`/`dirname $0`
		;;
esac

. $base/ftp.inc

FTP_SERVER=backupserver.somewhere.net
FTP_USER=useruser
FTP_PASS=passpass
BACULADIR=/data/work/bacula
SPOOLDIR=${BACULADIR}/spool
STATEDIR=${BACULADIR}/External
STATUSFILE_BACKUP=${STATEDIR}/state.backup
STATUSFILE=${STATEDIR}/state

if test "$(ls -A $SPOOLDIR 2>/dev/null)" != ""; then
	for file in `ls ${SPOOLDIR}/ExternalStorage* | grep -v .enc`; do
		if test ! -f $file.enc; then
			cat $file | \
				openssl enc -aes-256-cbc -salt \
				-pass file:/etc/bacula/private/pwd > \
				$file.enc
			if test $? -ne 0; then
				echo "--- ERROR: Error while encrypting volume '$file' (Check manually!)" 1>&2
				exit 1
			fi
		fi
	done
	
	# Serialize access to the FTP server across concurrent runs.
	global_lock

	# Upload every encrypted volume, then remove both the encrypted
	# copy and the original spool file.
	for file in ${SPOOLDIR}/ExternalStorage*.enc; do
		upload_file "$file"
		rm -f "$file"
		origfile=`echo "$file" | sed 's/\.enc$//'`
		rm -f "$origfile"
	done

	global_unlock
	
else
	echo "--- WARN: Nothing found to transfer? Probably ok.." 1>&2
fi

exit 0

The FTP transfer script itself is left out (too long); basically it sets the password in a .netrc file, writes FTP job files, and executes ftp. It also performs some checks after each transfer to handle transfer errors and out-of-disk-space situations.
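
To give an idea of its shape, a hypothetical, heavily simplified upload_file could look like this; the real ftp.inc additionally writes FTP job files, handles locking, and verifies the transfers:

# Hypothetical sketch only; not the real ftp.inc
upload_file() {
	# Put the credentials into ~/.netrc so ftp can log in unattended
	cat > "$HOME/.netrc" <<EOF
machine $FTP_SERVER
login $FTP_USER
password $FTP_PASS
EOF
	chmod 600 "$HOME/.netrc"

	# Non-interactive session: upload one file in binary mode
	ftp "$FTP_SERVER" <<EOF
binary
put $1 `basename $1`
quit
EOF
}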

Conclusion

This backup works reliably and fast; even a Raspberry Pi B+ is fast enough to deal with the encryption of some gigabytes of data per day. The only drawback is restoring the data: you have to transfer the files back manually via FTP, call ‘openssl’ to decrypt them, leave them in the /data/work/bacula/spool directory, and wait for the bacula-sd daemon to pick them up.
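
A manual restore therefore looks roughly like this (the volume name is again just an example):

cd /data/work/bacula/spool
# 1. fetch ExternalStorage0001.enc back from the FTP server,
#    e.g. interactively with: ftp backupserver.somewhere.net
# 2. decrypt with the same password file used for the backup
openssl enc -d -aes-256-cbc \
	-pass file:/etc/bacula/private/pwd \
	-in ExternalStorage0001.enc -out ExternalStorage0001
# 3. leave the volume in place; bacula-sd finds it when the
#    restore job asks for it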

This setup has now been running reliably for two years; before that it ran on different hardware for another three to four years.

Theory

Let’s see if we follow good practice:

  • have a backup: tick
  • have a restore: a backup only counts as a backup once a restore has been done and the restored data has been checked for differences against the original (see the sketch after this list). tick
  • follow the 3-2-1 rule: 3 copies, 2 different types of media, 1 remote location (offline and/or offsite). Since I keep several copies of the home directories, one on the NAS and one off-site, the ‘3’ and the ‘1’ parts are fulfilled. What about the ‘2’? Different media types are perhaps no longer meaningful nowadays. For things like git repositories I follow a somewhat modified version of the ‘2’: keeping two different backup formats (in this case a raw workspace with a local .git directory and an export from the server). tick
  • have a fallback: occasionally I don’t fully trust my current strategy, so I also keep a manual backup in a tar file on a CD-ROM. Just in case. :-) tick
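
The restore check mentioned above can be as simple as a recursive diff between the restored tree and the live data (the paths are only examples):

diff -r /home/user /data/restore-test/home/user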