Offsite Backup with Bacula

Intro
Into the category of “what useful things can I possibly do with a Raspberry Pi?” falls the idea of organizing a backup server using the Bacula backup software.
Details
I use an old USB disk and a USB hub as external storage for the Bacula volume data and for the catalog (stored in PostgreSQL).
To fulfill the off-site requirement, the volumes are migrated to an external FTP server with a ‘Migration Job’. Encryption is simply done with the ‘openssl’ command line tool.
So, we go with two pool definitions:
# File Pool definition
Pool {
  Name = File
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 week
  Maximum Volume Bytes = 512M
  Volume Use Duration = 23 hours
  Use Volume Once = yes
  Maximum Volumes = 9999
  LabelFormat = "Vol"
  Storage = File
  Next Pool = External
  Action On Purge = Truncate
}
# External Pool on FTP server
Pool {
  Name = External
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 2 months
  Maximum Volume Bytes = 512M
  Use Volume Once = yes
  Maximum Volumes = 9999
  LabelFormat = "ExternalStorage"
  Storage = External
  Action On Purge = Truncate
}
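As a quick sanity check (a small aside, not part of the original configuration), the volumes each pool holds can be listed from bconsole:
# Show which volumes Bacula has created in each pool
echo "list volumes pool=File" | bconsole
echo "list volumes pool=External" | bconsole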
The storage daemon stores the files in two locations:
Device {
  Name = File
  Media Type = File
  Archive Device = /data/work/bacula/files
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}
Device {
  Name = External
  Media Type = File
  Archive Device = /data/work/bacula/spool
  LabelMedia = yes
  Random Access = yes
  Removable Media = no
  AlwaysOpen = no
}
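Whether the director can actually reach both devices is easy to verify, assuming it defines Storage resources named File and External as the pool definitions above suggest:
# Query the storage daemon for the state of each device
echo "status storage=File" | bconsole
echo "status storage=External" | bconsole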
All jobs write to the File pool. At the end of every day a migration job picks up those volumes and executes a script dealing with encryption and FTP transfer:
Job {
  Name = "MigrationJob"
  Type = Migrate
  Level = Full
  Client = myserver-fd
  Schedule = "MigrationAfterBackup"
  FileSet = "Full Set"
  Messages = Standard
  Pool = File
  Maximum Concurrent Jobs = 1
  Selection Type = Volume
  Selection Pattern = "Vol.*"
  RunScript {
    Command = "/etc/bacula/scripts/ExternationMigration.sh"
    RunsWhen = After
    RunsOnClient = no
    RunsOnSuccess = yes
    RunsOnFailure = no
  }
}
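For testing it helps not to wait for the schedule; the migration can also be triggered by hand from bconsole:
# Run the migration job manually
echo "run job=MigrationJob yes" | bconsole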
The script itself looks as follows:
#!/bin/sh
# Figure out the directory this script lives in
case "$0" in
/*)
    base=`dirname $0`
    ;;
*)
    base=`pwd`/`dirname $0`
    ;;
esac
. $base/ftp.inc

FTP_SERVER=backupserver.somewhere.net
FTP_USER=useruser
FTP_PASS=passpass

BACULADIR=/data/work/bacula
SPOOLDIR=${BACULADIR}/spool
STATEDIR=${BACULADIR}/External
STATUSFILE_BACKUP=${STATEDIR}/state.backup
STATUSFILE=${STATEDIR}/state

if test "$(ls -A $SPOOLDIR 2>/dev/null)" != ""; then
    # Encrypt every volume that has no encrypted copy yet
    for file in `ls ${SPOOLDIR}/ExternalStorage* | grep -v '\.enc$'`; do
        if test ! -f $file.enc; then
            openssl enc -aes-256-cbc -salt \
                -pass file:/etc/bacula/private/pwd \
                < $file > $file.enc
            if test $? -ne 0; then
                echo "--- ERROR: Error while encrypting volume '$file' (Check manually!)" 1>&2
                exit 1
            fi
        fi
    done
    # Upload the encrypted volumes, then remove both copies
    global_lock
    for file in ${SPOOLDIR}/ExternalStorage*.enc; do
        upload_file $file
        rm -f $file
        origfile=`echo $file | sed 's/\.enc$//'`
        rm -f $origfile
    done
    global_unlock
else
    echo "--- WARN: Nothing found to transfer? Probably ok.." 1>&2
fi
exit 0
The whole FTP transfer logic script is left out (too long), but basically it deals with setting the password in a .netrc file, writes FTP job files and executes ftp. It also performs some checking after transfers to handle transfer errors or out-of-disk-space situations.
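Since the real script is omitted, here is only a minimal, hypothetical sketch of what ftp.inc could look like; the function names (global_lock, upload_file, global_unlock) come from the migration script above, everything else is an assumption:
#!/bin/sh
# Hypothetical sketch of ftp.inc -- not the original helper script.
global_lock() {
    # Serialize transfers; mkdir is atomic, so it serves as a simple lock
    while ! mkdir ${STATEDIR}/lock 2>/dev/null; do
        sleep 5
    done
}

global_unlock() {
    rmdir ${STATEDIR}/lock
}

upload_file() {
    # Write the credentials to .netrc so ftp logs in non-interactively
    printf 'machine %s login %s password %s\n' \
        $FTP_SERVER $FTP_USER $FTP_PASS > $HOME/.netrc
    chmod 600 $HOME/.netrc
    # Feed an FTP job to the ftp client on stdin
    ftp $FTP_SERVER <<EOF
binary
put $1 `basename $1`
bye
EOF
}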
Conclusion
This backup works reliably and fast; even a Raspberry Pi B+ is fast enough to handle the encryption of a few gigabytes of data per day. The only drawback is restoring the data: you have to transfer the files back manually via FTP, call ‘openssl’ to decrypt them, leave them in the /data/work/bacula/spool directory, and wait for the bacula-sd daemon to pick them up.
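A hedged sketch of that decryption step (it mirrors the encryption parameters used in the migration script; the downloaded *.enc files are assumed to already sit in the spool directory):
#!/bin/sh
# Hypothetical restore helper: decrypt the downloaded volumes in place
# so bacula-sd can pick them up again.
SPOOLDIR=/data/work/bacula/spool
for file in ${SPOOLDIR}/ExternalStorage*.enc; do
    openssl enc -d -aes-256-cbc \
        -pass file:/etc/bacula/private/pwd \
        -in $file -out `echo $file | sed 's/\.enc$//'`
done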
This setup has now been running reliably for two years; before that it ran on different hardware for another three to four years.
Theory
Let’s see if we follow good practice:
- have a backup: tick
- have a restore: a backup is only a backup once a restore has been done and the restored data has been checked for differences against the original (see the sketch after this list). tick
- follow the 3-2-1 rule: 3 backups, 2 different types of media, 1 remote location (offline and/or offsite). As I also keep copies in several home directories, one location on the NAS, and one location off-site, the ‘3’ and the ‘1’ parts are fulfilled. What about the ‘2’? Different media is maybe no longer valid nowadays. For things like git repositories I follow a somewhat modified version of the ‘2’: keeping two different backup formats (in this case a raw workspace with a local .git directory and an export from the server). tick
- have a fallback: occasionally I don’t fully trust my current strategy, so I also keep a manual backup in a tar file on a CD-ROM. Just in case. :-) tick
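As a sketch of the restore check mentioned in the list (the client name and paths are assumptions, and the non-interactive bconsole form may vary between Bacula versions):
#!/bin/sh
# Restore into a scratch directory, then compare against the live data
echo "restore client=myserver-fd where=/tmp/restore-test select current all done yes" | bconsole
diff -r /home /tmp/restore-test/home && echo "restore verified"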