With Archlinux32 reaching some terabytes of data to back up, I needed something “modern”, like a tape. Big tapes like LTO-8 are close to unaffordable, but LTO-4 drives and tapes can be acquired on the cheap: they are being thrown out of server rooms at the moment.
An LTO-4 tape holds 800 GB of uncompressed data, and a drive can be bought on eBay for 200 to 300 CHF. Media is affordable at ca. 40 CHF per tape.
The first drive I ordered was advertised as working, but turned out to be the kind of drive that only produces squealing noises and is really hungry for tapes (and kills them). Well, my plan was to stay under 1000 CHF for a backup solution, so I simply ordered a second one and kept the first for spare parts. Both drives are HP Ultrium 1840s.
The second drive turned out to work just fine. Connecting it, however, showed some issues. The tape drive comes in a noisy black box, which I definitely don’t want to run 24 hours a day. So I decided to remove the drive and squeeze it (quite literally) into a machine.
After first ordering quite the wrong tape (LTO-4 WORM, which costs more and can be written only once, but has “smartness” built in to be tamper-proof, oh well), I got boxes and boxes of old tapes via Ricardo from somebody desperately trying to get rid of them. Which is cool with me. The price per tape dropped to around 20 CHF this way and I have more tapes than I could ever have wished for.
I tried several SCSI cards to connect to the drive. The drive uses the last generation of parallel SCSI, for which cards are quite a nuisance to find. SCSI cards are either server-grade (PCI-X) or not fast enough. Some cards (like dedicated backup SCA host adapters) work fine in some machines, but not in others. The SCSI cables are prone to transmission errors; in particular, a 4 meter long external SCSI cable is not really reliable at Ultra320 speeds (320 MB/s over external or internal 68-pin LVD).
I went with a short shielded internal SCSI cable and put the drive as close to the SCA host adapter as possible. This gave the best results.
The result looks like this:
Manual Backups and Tools
Most people nowadays no longer know what the ‘t’ in ‘tar’ stands for - you guessed it: ‘tape archive’. :-)
There are other formats, but the “rule of least surprise” usually applies here: the simpler the command line parameters and the more widespread the format, the more likely somebody else (or even you yourself) will be able to actually read and restore the data.
The old magnetic tape tool is no longer available as a binary package on Archlinux, but there is an AUR package ‘mt-st-git’ providing the ‘mt-st’ binary.
You need this tool for basic tape operations like positioning, ejecting, setting compression levels, etc.
Some use cases
Rewind and eject
mt-st -f /dev/nst0 rewoffl
mt-st -f /dev/nst0 defcompression 0
mt-st -f /dev/nst0 compression 0
mt-st -f /dev/nst0 rewind
tar -cvf /dev/nst0 /dev/null
mt-st -f /dev/nst0 rewind
I’m disabling compression on the tapes for several reasons:
- with compression on I’m not able to deliver data fast enough, resulting in shoe-shining
- the remaining size of a tape is so much more predictable
- I have enough tapes anyway. :-)
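With compression off, the number of tapes needed for a data set comes down to simple arithmetic. A minimal sketch, assuming the native LTO-4 capacity of 800 GB (decimal units) and ignoring filemark and tar overhead; the function name and the example path are made up for illustration:

```shell
#!/bin/sh
# Native (uncompressed) LTO-4 capacity in bytes: 800 GB, decimal units.
LTO4_BYTES=800000000000

# tapes_needed BYTES: print how many tapes BYTES of data will occupy.
# Hypothetical helper, not part of any tool mentioned in this post.
tapes_needed() {
    # integer ceiling division
    echo $(( ($1 + LTO4_BYTES - 1) / LTO4_BYTES ))
}

# Example with a hypothetical path, using GNU du to get the size in bytes:
# tapes_needed "$(du -sb /srv/archive | cut -f1)"
```

In practice I would leave some headroom per tape, as the real usable capacity is a little below the nominal figure.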
Append to end of data
mt-st -f /dev/nst0 eom
tar zcvf /dev/nst0 *
Status of the drive, current position of the tape
mt-st -f /dev/nst0 status
sg_logs (part of sg3_utils) can give you all kinds of internal information like temperature, I/O errors of the drive, and media information.
The first page serves as a sort of index of what the drive can report:
shell> sg_logs /dev/nst0 -p 0
    HP Ultrium 4-SCSI B32D
Supported log pages [0x0]:
    0x00 Supported log pages [sp]
    0x02 Write error [we]
    0x03 Read error [re]
    0x0c Sequential access device [sad]
    0x0d Temperature [temp]
    0x11 DT Device status [dtds]
    0x12 Tape alert response [tar]
    0x13 Requested recovery [rr]
    0x18 Protocol specific port [psp]
    0x2e Tape alert [ta]
    0x30 Tape usage (lto-5, 6) [tu_]
    0x31 Tape capacity (lto-5, 6) [tc_]
    0x32 Data compression (lto-5) [dc_]
    0x33 Write errors (lto-5) [we_]
    0x34 Read forward errors (lto-5) [rfe_]
    0x35 DT Device Error (lto-5, 6) [dtde_]
    0x3e Device Status (lto-5, 6) [ds_]
For instance I can get the temperature of the drive with:
shell> sg_logs /dev/nst0 -p 13
    HP Ultrium 4-SCSI B32D
Temperature page [0xd]
    Current temperature = 47 C
    Reference temperature = <not available>
This could be built into a Nagios check script that monitors the health of the drive, but then I would have to manually unmount the tape pool in bacula-sd before each check.
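A rough sketch of what such a check could look like. It only parses the temperature page output shown above; the thresholds and function names are my own assumptions, not values from the drive's documentation, and the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN):

```shell
#!/bin/sh
# Hypothetical thresholds in degrees Celsius; adjust for your drive.
WARN=${WARN:-50}
CRIT=${CRIT:-60}

# parse_temp: read sg_logs temperature page output on stdin and
# print the current temperature as an integer (nothing if not found).
parse_temp() {
    sed -n 's/^[[:space:]]*Current temperature = \([0-9][0-9]*\) C.*$/\1/p' | head -n 1
}

# check_temp TEMP: print a Nagios-style status line and
# return the matching plugin exit code.
check_temp() {
    temp=$1
    if [ -z "$temp" ]; then
        echo "UNKNOWN - could not read drive temperature"
        return 3
    elif [ "$temp" -ge "$CRIT" ]; then
        echo "CRITICAL - drive temperature ${temp} C"
        return 2
    elif [ "$temp" -ge "$WARN" ]; then
        echo "WARNING - drive temperature ${temp} C"
        return 1
    fi
    echo "OK - drive temperature ${temp} C"
    return 0
}

# Real use, only while bacula-sd is not holding the drive:
# check_temp "$(sg_logs /dev/nst0 -p 13 | parse_temp)"
```

Splitting parsing from the threshold logic means the check can be tried against saved sg_logs output without touching the drive at all.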
socat is like netcat and more. It lets you build tunnels between machines, so that the ‘tar’ command can pack files on one machine and send them to another machine, where the tape write command is attached to a listening socat.
# on the machine with the files to backup
tar cvf - * | socat - TCP4:<server_with_tape>:8080
# on the machine where the tape is
socat TCP4-LISTEN:8080 - | dd of=/dev/nst0 bs=10240 status=progress
When using dd I set the block size manually to 20*512=10240 bytes, which is the default record size of ‘tar’ on Linux (a blocking factor of 20 blocks of 512 bytes each).
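The arithmetic behind that number, as a small sketch. tar counts its record size in 512-byte blocks; the helper name below is made up for illustration:

```shell
#!/bin/sh
# tar_record_bytes FACTOR: print the tar record size in bytes
# for a given blocking factor (number of 512-byte blocks).
tar_record_bytes() {
    echo $(( $1 * 512 ))
}

tar_record_bytes 20    # blocking factor 20 is the GNU tar default
```

If you pass a different blocking factor to tar with `-b`, the `bs=` value given to dd should be changed to match, for example `bs=$(tar_record_bytes 128)` together with `tar -b 128`.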
Writing directly to the tape has some drawbacks, as the tape drive is very fast and you cannot deliver data fast enough over a 1 Gbit/s network. So here ‘mbuffer’ helps to at least buffer data for some time and then flush it in one burst to the tape drive. This avoids the dreadful “shoe-shining”, which not only drives you crazy (the sounds of it), but also reduces the lifetime of the components (or at least of the mechanics of the tape drive):
tar cvf - * | mbuffer -m 2G -P100% | \
    socat - TCP4:<server_with_tape>:8080

socat TCP4-LISTEN:8080 - | mbuffer -m 2G -P100% | \
    dd of=/dev/nst0 bs=10240 status=progress
Buffering on either side is possible; I’m not sure whether having a buffer on both sides improves anything.
I did a full backup of everything onto 10 tapes with the ‘tar/socat/mbuffer/dd’ method.
This is data which is quite stable and never changes, so I’ll just keep it on some tapes with the write protection label on. It doesn’t make much sense to put them into a bacula job, as the retention period is basically 30 years or so - or till the tape dies.
The index of the tape is a simple text file, noting the kind of data, the size, the tape number, the file number (offset on tape) and the date of the backup:
doc            946M  1  0  17.4.2021
Attic           13G  1  1  17.4.2021
bilder          19G  1  2  17.4.2021
projects        29G  1  3  18.4.2021
ARCHIVE         16G  1  5  18.4.2021
BACKUPS        122G  1  6  18.4.2021
...
music          154G  4  0  19.4.2021
movies part1   547G  5  0  19.4.2021
movies part2   785G  6  0  20.4.2021
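To find an archive again, the index can be searched for the tape and file number and the tape then positioned with ‘mt-st’. A minimal sketch, assuming each index entry was written as exactly one tape file and that the index has the five columns described above; the function names and the index file name are hypothetical. The second function only prints the commands, it does not touch the drive:

```shell
#!/bin/sh
# lookup_archive INDEX NAME: print "TAPE FILE" for the entry called NAME.
# Fields are counted from the right because a name may contain spaces.
lookup_archive() {
    awk -v want="$2" '
        NF >= 5 {
            name = $1
            for (i = 2; i <= NF - 4; i++) name = name " " $i
            if (name == want) { print $(NF-2), $(NF-1); found = 1; exit }
        }
        END { exit !found }
    ' "$1"
}

# show_restore_steps INDEX NAME: print (not run) the commands
# needed to reach the archive on its tape.
show_restore_steps() {
    pos=$(lookup_archive "$1" "$2") || { echo "not in index: $2" >&2; return 1; }
    tape=${pos% *}
    file=${pos#* }
    echo "# insert tape $tape"
    echo "mt-st -f /dev/nst0 rewind"
    # file 0 starts right after rewind; otherwise skip $file filemarks
    if [ "$file" -gt 0 ]; then
        echo "mt-st -f /dev/nst0 fsf $file"
    fi
    echo "tar tvf /dev/nst0"
}

# Example with a hypothetical index file name:
# show_restore_steps tapes.idx projects
```

Printing the commands first is deliberate: I would rather read them once before letting anything move the tape.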
I now use bacula for the daily incremental and full backups, both to tape and to offline cloud storage.
bacula-sd just works fine and integrates with the rest of my backup system (the master bacula-dir is still living on an old Raspberry Pi). The only thing I’m missing is the ability to copy a bacula job to two different media, one being the remote cloud storage and the other one the tape. Sort of a bacula ‘tee’ would be nice to have.
- http://cdrtools.sourceforge.net/private/portability-of-tar-features.html: on tar formats and compatibility
- https://copyconstruct.medium.com/socat-29453e9fc8a6: blog about socat