Posts tagged "backup":
How to set up rsnapshot instead of Time Machine
(This blog post was changed since my initial strategy of disabling the lockfile didn't work. Turns out, the lockfile is required, and backups have to be stacked.)
Yesterday, I wrote about how Time Machine has failed me. Time Machine keeps regular backups, going back as far as your hard drive space permits. In theory. In practice, every year or so it messes up somehow and has to start over, thereby deleting all your older backups. A backup that is not reliable is not a backup.
Luckily, there are alternatives. Probably the easiest is rsync[1], a very cool tool that copies files and directories from one place to another. You could simply run this once a day, and have a new backup every day. You can even configure rsync so it doesn't need to copy unchanged files, and instead hard-links them from an older backup. rsnapshot automates this process to keep a number of tiered copies, for example ten hourly backups, seven daily backups, four weekly backups, and a hundred monthly backups. Each backup is then simply a directory that contains your files. No fancy starfield-GUI, but utterly reliable and trivial to understand [2].
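To make the hard-linking idea concrete, here is a minimal sketch of what a single snapshot run could look like with plain rsync. The function name snapshot, the "latest" symlink, and the DRY_RUN switch are assumptions for illustration, not something rsnapshot itself uses. The only rsync feature it relies on is --link-dest, which hard-links unchanged files from an older backup instead of copying them.

```shell
# Sketch: create one snapshot directory per run, hard-linking unchanged
# files from the previous snapshot via rsync's --link-dest option.
# Usage: snapshot SRC DEST_ROOT NAME
# Set DRY_RUN=1 to print the rsync command instead of running it.
snapshot() {
    src="$1"; root="$2"; name="$3"
    latest="$root/latest"            # hypothetical symlink to the newest snapshot
    set -- rsync -a --delete
    if [ -d "$latest" ]; then
        # unchanged files become hard links into the previous snapshot
        set -- "$@" --link-dest="$latest"
    fi
    set -- "$@" "$src/" "$root/$name/"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
        return 0
    fi
    # only move the "latest" pointer if rsync succeeded
    "$@" && ln -sfn "$name" "$latest"
}

# Example (not executed here):
# snapshot /Users/bb/Documents /Volumes/BBackup/manual "$(date +%Y-%m-%d)"
```

rsnapshot does essentially this bookkeeping for you, plus the rotation between tiers, which is why I would not maintain such a script by hand.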
Setting up rsnapshot on macOS is not quite as straightforward as I'd like, and I couldn't find a great guide online. So, without further ado, here's how to configure rsnapshot on macOS:
Install rsnapshot
brew install rsnapshot
Write the config file
You can copy a template from homebrew:
cp /usr/local/Cellar/rsnapshot/1.4.2/etc/rsnapshot.conf.default /usr/local/etc/rsnapshot.conf
And then configure the new configuration file to your liking (preserve the tabs!):
config_version	1.2 # default
verbose	2 # default
loglevel	3 # default

# this is where your backups are stored:
snapshot_root	/Volumes/BBackup/Backups.rsnapshot/ # make sure this is writeable

# prevent accidental backup corruption:
lockfile	/Users/bb/.rsnapshot.pid

# use this if you back up to an external drive:
no_create_root	1 # don't back up if the external drive is not connected

# configure how many tiers of backups are created:
retain	hourly	10
retain	daily	7 # dailies will only be created once 10 hourlies exist
retain	weekly	4 # weeklies will only be created once 7 dailies exist
retain	monthly	100 # monthlies will only be created once 4 weeklies exist

# the list of directories you want to back up:
backup	/Users/bb/Documents	localhost/
backup	/Users/bb/eBooks	localhost/
backup	/Users/bb/Movies	localhost/
backup	/Users/bb/Music	localhost/
backup	/Users/bb/Pictures	localhost/
backup	/Users/bb/Projects	localhost/
backup	/Users/bb/Projects-Archive	localhost/
Instead of localhost, you can use remote machines as well. Check man rsync for details.
Make sure it works and create initial backup
rsnapshot -c /usr/local/etc/rsnapshot.conf hourly
The first backup will take a while, but subsequent backups will be fast. A normal backup on my machine takes about two minutes and runs unnoticeably in the background.
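If that first run complains about the config file, spaces where tabs belong are a likely culprit. rsnapshot has a built-in check (rsnapshot -c /usr/local/etc/rsnapshot.conf configtest), and the following rough sketch flags the most common mistake directly. The function name check_tabs is made up for this example, and it only catches a space right after the parameter name, not spaces between later fields.

```shell
# Print (with line numbers) every config line where a parameter name is
# followed by a space instead of a tab. Comment lines are ignored because
# they start with '#'.
# Returns 0 if nothing suspicious was found, 1 otherwise.
check_tabs() {
    if grep -nE '^[a-z_]+ ' "$1"; then
        return 1
    fi
    return 0
}

# Example (not executed here):
# check_tabs /usr/local/etc/rsnapshot.conf && echo "tabs look fine"
```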
Write launchd Agent
Next, we have to tell macOS to run the backups at regular intervals. Conceptually, you do this by writing a launchd agent script[3], which tells launchd when and how to run your backups. In my case, I create four files in /Users/bb/Library/LaunchAgents/, called rsnapshot.{hourly,daily,weekly,monthly}.plist. Apple's documentation for these files is only mildly useful (as usual), but man launchd.plist and man plist should give you an idea of how this works.
Here is my hourly launchd agent (I'll explain the bash/sleep thing later):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>rsnapshot.hourly</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>-c</string>
        <string>sleep 0 &amp;&amp; /usr/local/bin/rsnapshot -c /usr/local/etc/rsnapshot.conf hourly</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
        <key>Minute</key>
        <integer>0</integer>
    </dict>
</dict>
</plist>
For the other three scripts, change the two occurrences of hourly to daily, weekly, or monthly, and change the contents of the <dict> portion at the end to:
daily:
<key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>
weekly:
<key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>
<key>Weekday</key>
<integer>1</integer>
monthly:
<key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>
<key>Day</key>
<integer>1</integer>
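Since the four files differ only in the tier name, the sleep delay, and the calendar keys, they could also be generated instead of edited by hand. The following is a sketch under that assumption. The function names calendar_keys and make_plist are hypothetical, the paths are the ones from my hourly agent, and the && is written as &amp;&amp; because a bare ampersand is not valid inside an XML file.

```shell
# Print the StartCalendarInterval keys for one tier.
# Returns 1 for an unknown tier name.
calendar_keys() {
    case "$1" in
        hourly|daily|weekly|monthly) ;;
        *) return 1 ;;
    esac
    printf '        <key>Minute</key>\n        <integer>0</integer>\n'
    case "$1" in
        daily|weekly|monthly)
            printf '        <key>Hour</key>\n        <integer>0</integer>\n' ;;
    esac
    case "$1" in
        weekly)  printf '        <key>Weekday</key>\n        <integer>1</integer>\n' ;;
        monthly) printf '        <key>Day</key>\n        <integer>1</integer>\n' ;;
    esac
}

# Print a complete launchd agent for one tier.
# Usage: make_plist TIER DELAY_SECONDS
make_plist() {
    keys=$(calendar_keys "$1") || return 1
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>rsnapshot.$1</string>
    <key>ProgramArguments</key>
    <array>
        <string>/bin/bash</string>
        <string>-c</string>
        <string>sleep $2 &amp;&amp; /usr/local/bin/rsnapshot -c /usr/local/etc/rsnapshot.conf $1</string>
    </array>
    <key>StartCalendarInterval</key>
    <dict>
$keys
    </dict>
</dict>
</plist>
EOF
}

# Example (not executed here): write one tier's agent file.
# make_plist daily 900 > ~/Library/LaunchAgents/rsnapshot.daily.plist
```

The second argument is the number of seconds passed to sleep; the hourly agent uses 0.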
However, rsnapshot can only ever run one backup at a time without stepping on its own toes. This is a problem when the computer wakes up and more than one backup was scheduled during its sleep, since launchd will then happily launch all missed backups at the same time. But only one of them will succeed.
To fix this, I delay the later backup tiers using the sleep directive: hourly keeps sleep 0, and I use sleep 900 (15 minutes) for daily, sleep 1800 (30 minutes) for weekly, and sleep 2700 (45 minutes) for monthly[4]. It seems that there should be a more elegant solution than this, but I haven't found one.
From the documentation, you might think that <key>Program</key> would be more succinct than supplying the binary as the first argument of <key>ProgramArguments</key>, but this apparently uses a different syntax and does not in fact work as expected.
Load launchd agents
launchctl load ~/Library/LaunchAgents/rsnapshot.*
Test launchd agent
launchctl start rsnapshot.hourly
If it doesn't work, Console.app might show a relevant error message.
Remove backup directory from Spotlight
Go to System Preferences → Spotlight → Privacy → Add your snapshot_root directory from earlier
Disable Time Machine and delete your existing backup (if you want)
Start Time Machine, right-click any directory you want to delete, and select "delete all backups of $dir"
[1] rsync is one of those reliable tools I talked about. It is rock solid, incredibly versatile, and unapologetically single-minded. A true gem!
[2] This works great for local backups. If you need encrypted backups or compressed backups (maybe on an untrusted remote machine), this post recommends Borg instead of rsnapshot, but you will lose the simplicity of simple directories.
[3] I use launchd instead of cron since launchd will re-schedule missed backups if the computer was asleep.
[4] This will fail if the hourly backup takes longer than 15 minutes. This is rather unlikely, though, or at least should not happen often enough to be of concern.
Caveats
The configuration file of rsnapshot says that you might experience data corruption if you run several copies of rsnapshot at the same time (and you can use the lockfile to prevent this). This is a problem if your computer is asleep while rsnapshot is scheduled to run, since launchd will then re-schedule all missed tasks at once when the computer wakes up. If you enable the lockfile, only one of them will run.
On the other hand, only the hourly task will actually create a new backup. All higher-level backup tiers merely copy existing backups around, so in theory, they shouldn't step on each other's toes when run concurrently. I have opened an issue asking about this.
There are other possible solutions: ① You could modify the launchd entry such that backups only trigger after a few minutes or, better yet, only once all other instances of rsnapshot have finished. I am not sure if launchd supports this, though. ② You could schedule the hourly task using cron instead of launchd, since cron will not reschedule missed tasks. This would only work for two tiers of backups, though. ③ You could just ignore the issue and hope for the best. After all, if a daily or hourly backup gets corrupted every now and then, you still have enough working backups…
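Option ① could be approximated without any special launchd support by letting each agent wait for the lockfile to clear before starting. This is only a sketch: the function name wait_for_lock is made up, and it assumes that the lockfile contains the PID of the running rsnapshot, which is what the .pid name suggests. It does not remove the race between two waiting tiers, so the lockfile itself remains the real safeguard.

```shell
# Wait until no live process holds the lockfile.
# Usage: wait_for_lock LOCKFILE MAX_WAIT_SECONDS [POLL_SECONDS]
# Returns 0 once the lock is free, 1 if MAX_WAIT_SECONDS have passed.
wait_for_lock() {
    lockfile="$1"; max_wait="$2"; step="${3:-5}"
    waited=0
    # the lock counts as held while the file exists and its PID is alive
    while [ -f "$lockfile" ] && kill -0 "$(cat "$lockfile")" 2>/dev/null; do
        if [ "$waited" -ge "$max_wait" ]; then
            return 1
        fi
        sleep "$step"
        waited=$((waited + step))
    done
    return 0
}

# Example (not executed here), as the command string of the daily agent:
# wait_for_lock /Users/bb/.rsnapshot.pid 3600 && rsnapshot -c /usr/local/etc/rsnapshot.conf daily
```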
Dropbox deleted my pictures and Time Machine didn't backup
Dropbox deleted some of my favorite photos. Have you looked at all your old pictures lately and checked if they are still there? I have, and they were not. Of course Dropbox denies it is their fault, but no other program routinely accessed my pictures. I am not alone with this problem. It must have happened some time between the summer of 2015, when I put my pictures on Dropbox, and the summer of 2016, when Time Machine last corrupted its backups and had to start over, thereby deleting my last chance of recovering my pictures. The pictures are gone for good.
So, what have I learned? Dropbox loses your data, and Time Machine can't restore it. These programs are obviously no good for backups. Let me repeat this: Dropbox and Time Machine are not a backup! A true backup needs to be reliable, keep an infinite history, and never, never, never accidentally delete files.
From now on, I will use rsnapshot for backups. Here's a tutorial on how to set it up on a Mac. I have used rsnapshot for years at work, and it has never let me down. For synchronizing things between computers, I now use syncthing. Neither of these programs is as user-friendly as Dropbox or Time Machine, but that is a small price to pay for a working backup.
A few years ago, I had high hopes that Apple and Dropbox and Google and Amazon would lead us to a bright future of computers that "just work", and could solve our daily chores ever more conveniently and reliably. But I was proven wrong. So. Many. Times. It seems that for-profit software inevitably becomes less dependable as it adds ever more features to attract ever more users. In contrast, free software can focus on incremental improvements and steadily increasing reliability.
Using a Raspberry Pi as a Time Capsule for Mountain Lion
A while ago, I bought a Time Capsule to take care of my backups. I can't say it has been smooth sailing. Every now and then, the Time Capsule would claim that the backup had failed. Sometimes a reboot would help, sometimes not. Sometimes hdiutil would be able to salvage the backups, sometimes not. Sometimes, the backup disk image would simply be corrupted and the only option would be to delete it and start over.
This might be bad luck or it might be due to a defective Time Capsule or it might be due to my computer. I have no idea. But the thing is, if I have to hack on my backup system anyway, let's do it in style, at least. So here goes:
Ingredients: A Raspberry Pi, an external hard drive, some patience
Format an SD card as described in the wiki. I just installed the version of Debian that is provided on the official website. Now just boot up.
Next, I was stumped because I only have an Apple LED display and no convenient way of connecting the Raspberry Pi's HDMI output to the LED display's Mini DisplayPort. After some searching and a combination of three adapter cables, I finally got it connected and could see it boot. Really, I have no use whatsoever for the HDMI port on the Raspberry Pi. So the first thing I did was to enable SSH, which luckily is available right there in the configuration utility that starts when you boot the thing for the first time.
After that, I disconnected the display and immediately was stumped because I now had no way of finding the Pi's IP address. Actually, I did not even have a network to connect it to. So I strung an ethernet cable from my laptop to the Pi and enabled Internet Sharing in order to (1) start the DHCP server and (2) give the Pi internet access. The IP address was then easily found using arp -a.
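On a busier network, the arp -a output can be long. A small filter like the following sketch picks out the Pi, assuming it is one of the models whose ethernet MAC address starts with b8:27:eb (newer boards use other prefixes). The function name pi_ips is made up for this example, and it expects the usual "? (IP) at MAC on ..." line format.

```shell
# Read `arp -a` output on stdin and print the IP address of every entry
# whose MAC address starts with the Raspberry Pi prefix b8:27:eb.
pi_ips() {
    grep -i ' at b8:27:eb' | sed -E 's/^[^(]*\(([0-9.]+)\).*$/\1/'
}

# Example (not executed here):
# arp -a | pi_ips
```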
Setting up the hard drive
First off, I needed to format and mount my external hard drive to be usable as a Time Machine volume. ls /dev showed the hard drive as /dev/sda. Thus, I installed parted using sudo apt-get install parted and started it with sudo parted. In parted, select /dev/sda sets it up to modify the external hard drive, rm 1 deletes its main partition, and q quits parted. Next, I created a new partition: sudo fdisk /dev/sda, then in there n with p and 1 to create a new primary partition, then w to apply the changes and exit. Then, I created the file system with sudo mkfs -t ext4 /dev/sda1, spanning the whole partition. Lastly, I created a mount point for it using mkdir ~/TimeMachine (don't use sudo!) and auto-mounted it by appending this to /etc/fstab:
/dev/sda1 /home/pi/TimeMachine ext4 rw,auto,user,exec,sync 0 0
Note: sync specifies that all file system changes have to be written to disk immediately, without caching. This might be bad for performance, but on the other hand, this behavior is probably a good idea for a backup system. I once read something somewhere that Apple is enforcing a similar behavior on their Time Capsules and that this is the reason why they won't allow any other network drive as Time Capsules.
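When appending lines to system files such as /etc/fstab, it is easy to add the same line twice while experimenting. A tiny helper like this sketch avoids that. The name append_once is hypothetical, and it has to run as root to be able to write to /etc/fstab.

```shell
# Append LINE to FILE unless an identical line is already present.
# Usage: append_once LINE FILE
append_once() {
    line="$1"; file="$2"
    # -x: match whole lines only, -F: treat LINE as a fixed string
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example (not executed here, run as root):
# append_once '/dev/sda1 /home/pi/TimeMachine ext4 rw,auto,user,exec,sync 0 0' /etc/fstab
```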
Setting up the shared folder
First up, this requires netatalk, so I did sudo apt-get update and sudo apt-get install netatalk to install it. Next, netatalk has to be configured to actually share the drive on the network. This is accomplished by appending this line to /etc/netatalk/AppleVolumes.default:
/home/pi/TimeMachine TimeMachine allow:pi cnidscheme:dbd options:upriv,usedot,tm
Also, the afp daemon should be configured to use the proper authentication schemes. Thus, add this to /etc/netatalk/afpd.conf:
- -transall -uamlist uams_randnum.so,uams_dhx.so,uams_dhx2.so -nosavepassword -advertise_ssh
(maybe append mdns to the hosts in /etc/nsswitch.conf? Probably not necessary.)
I am also not quite sure whether I actually had to create a new file /etc/avahi/services/afpd.service and write into it:
<?xml version="1.0" standalone='no'?><!--*-nxml-*-->
<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
<service-group>
  <name replace-wildcards="yes">%h</name>
  <service>
    <type>_afpovertcp._tcp</type>
    <port>548</port>
  </service>
  <service>
    <type>_device-info._tcp</type>
    <port>0</port>
    <txt-record>model=Xserve</txt-record>
  </service>
</service-group>
And maybe, you have to create an empty file that signifies the drive as Time Machine compatible using touch ~/TimeMachine/.com.apple.timemachine.supported.
Edit: Turns out, all these were not necessary. Thank you, reader Philipp, for trying them out!
I certainly did all that, but I am not quite sure which of these steps were strictly necessary. If you know, please let me know, too.
Anyway, with all that done, restart both the netatalk and the Bonjour daemon using sudo /etc/init.d/netatalk restart and sudo /etc/init.d/avahi-daemon restart.
Setting up the Time Machine
Now, back to the Mac. In order to make Time Machine accept the new network share, run
defaults write com.apple.systempreferences TMShowUnsupportedNetworkVolumes 1
Edit: Turns out, this setting is not necessary. OSX just picks the Raspberry Pi as a usable Time Machine drive by default.
Finally, the TimeMachine folder on the Raspberry Pi was available as one of the backup drives. Hallelujah!
Now transfer speeds for the initial backup are not exactly what I would call fast, but this might not be the Pi's fault. For one thing, the Pi reports that it is running at only half load. For another thing, the external hard drive and its USB connection are probably not very speedy. And lastly, I seem to remember that initial backups always were slow. But really, only time will tell how well this thing can do the job of a Time Capsule.
Further testing shows that transfer speeds are very comparable to the Time Capsule. Thus, I declare this a raging success!
This article heavily steals from these fine folks on the internet: