29 Dec 2017

# How to set up rsnapshot instead of Time Machine

(This blog post was changed since my initial strategy of disabling the lockfile didn't work. Turns out, the lockfile is required, and backups have to be stacked.)

Yesterday, I wrote about how Time Machine has failed me. Time Machine keeps regular backups, going back as far as your hard drive space permits. In theory. In practice, every year or so it messes up somehow and has to start over, thereby deleting all your older backups. A backup that is not reliable is not a backup.

Luckily, there are alternatives. Probably the easiest is rsync[1], a very cool tool that copies files and directories from one place to another. You could simply run this once a day, and have a new backup every day. You can even configure rsync so it doesn't need to copy unchanged files, and instead hard-links them from an older backup. rsnapshot automates this process to keep a number of tiered copies, for example ten hourly backups, seven daily backups, four weekly backups, and a hundred monthly backups. Each backup is then simply a directory that contains your files. No fancy starfield-GUI, but utterly reliable and trivial to understand [2].

Setting up rsnapshot on macOS is not quite as straight-forward as I'd like, and I couldn't find a great guide online. So, without further ado, here's how to configure rsnapshot on macOS:

• Install rsnapshot

brew install rsnapshot


• Write the config file

You can copy a template from homebrew:

cp /usr/local/Cellar/rsnapshot/1.4.2/etc/rsnapshot.conf.default /usr/local/etc/rsnapshot.conf



And then configure the new configuration file to your liking (preserve the tabs!):

  config_version	1.2 # default
verbose		2   # default
loglevel	3   # default

# this is where your backups are stored:
snapshot_root	/Volumes/BBackup/Backups.rsnapshot/ # make sure this is writeable
# prevent accidental backup corruption:
lockfile	/Users/bb/.rsnapshot.pid
# use this if you back up to an external drive:
no_create_root	1   # don't back up if the external drive is not connected

# configure how many tiers of backups are created:
retain	hourly	10
retain	daily	7   # dailies will only be created once 10 hourlies exist
retain	weekly	4   # weeklies will only be created once 7 dailies exist
retain	monthly	100 # monthlies will only be created once 4 weeklies exist

# the list of directories you want to back up:
backup	/Users/bb/Documents		localhost/
backup	/Users/bb/eBooks		localhost/
backup	/Users/bb/Movies		localhost/
backup	/Users/bb/Music		localhost/
backup	/Users/bb/Pictures		localhost/
backup	/Users/bb/Projects		localhost/
backup	/Users/bb/Projects-Archive		localhost/


Instead of localhost, you can use remote machines as well. Check man rsync for details

• Make sure it works and create initial backup

rsnapshot -c /usr/local/etc/rsnapshot.conf hourly



The first backup will take a while, but subsequent backups will be fast. A normal backup on my machine takes about two minutes and runs unnoticeably in the background.

• Write launchd Agent

Next, we have to tell macOS to run the backups in regular intervals. Conceptually, you do this by writing a launchd agent script[3], which tells launchd when and how to run your backups. In my case, I create four files in /Users/bb/Library/LaunchAgents/, called rsnapshot.{hourly,daily,weekly,monthly}.plist. Apple's documentation for these files is only mildly useful (as usual), but man launchd.plist and man plist should give you an idea how this works.

Here is my hourly launchd agent (I'll explain the bash/sleep thing later):

  <?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>rsnapshot.hourly</string>
<key>ProgramArguments</key>
<array>
<string>/bin/bash</string>
<string>-c</string>
<string>sleep 0 && /usr/local/bin/rsnapshot -c /usr/local/etc/rsnapshot.conf hourly</string>
</array>
<key>StartCalendarInterval</key>
<dict>
<key>Minute</key>
<integer>0</integer>
</dict>
</dict>
</plist>


For the other four scripts, change the two occurrences of hourly to {daily,weekly,monthly} and change the <dict> portion at the end to

• daily:

    <key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>

• weekly:

    <key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>
<key>Weekday</key>
<integer>1</integer>

• monthly:

    <key>Minute</key>
<integer>0</integer>
<key>Hour</key>
<integer>0</integer>
<key>Day</key>
<integer>1</integer>


However, rsnapshot can only ever run one backup at a time without stepping on its own toes. This is a problem when the computer wakes up, and more than one backup was scheduled during its sleep, since launchd will then happily launch all missed backups at the same time. But only one of them will succeed.

To fix this, I delay the later backup tiers using the sleep 0 directive. I use sleep 900 (15 minutes later) for daily, sleep 1800 (30 minutes), and sleep 2700 (45 minutes) for the lower tiers[4]. It seems that there should be a more elegant solution than this, but I haven't found one.

From the documentation, you might think that <key>Program</key> would be more succinct than supplying the binary as the first argument of <key>ProgramArguments</key>, but this apparently uses a different syntax and does not in fact work as expected.

launchctl load ~/Library/LaunchAgents/rsnapshot.*


• Test launchd agent

launchctl start rsnapshot.hourly



If it doesn't work, Console.app might show a relevant error message.

• Remove backup directory from Spotlight

Go to System Preferences → Spotlight → Privacy → Add your snapshot_root directory from earlier

• Disable TimeMachine and delete your existing backup (if you want)

Start Time Machine, right-click any directory you want to delete, and select "delete all backups of \$dir"

[1] rsync is one of those reliable tools I talked about. It is rock solid, incredibly versatile, and unapologetically single-minded. A true gem!

[2] This works great for local backups. If you need encrypted backups or compressed backups (maybe on an untrusted remote machine), this post recommends Borg instead of rsnapshot, but you will lose the simplicity of simple directories.

[3] I use launchd instead of cron since launchd will re-schedule missed backups if the computer was asleep.

[4] This will fail if the hourly backup takes longer than 15 minutes. This is rather unlikely, though, or at least should not happen often enough to be of concern.

## Caveats

The configuration file of rsnapshot says that you might experience data corruption if you run several copies of rsnapshot at the same time (and you can use the lockfile to prevent this). This is a problem if your computer is asleep while rsnapshot is scheduled to run, since launchd will then re-schedule all missed tasks at once when the computer wakes up. If you enable the lockfile, only one of them will run.

On the other hand, only the hourly task will actually create a new backup. All higher-level backup tiers merely copy existing backups around, so in theory, they shouldn't step on each other's toes when run concurrently. I have opened an issue asking about this.

There are other possible solutions: ① You could modify the launchd entry such that backups only trigger after a few minutes or, better yet, only once all other instances of rsnapshot have finished. I am not sure if launchd supports this, though. ② You could schedule the hourly task using cron instead of launchd, since cron will not reschedule missed tasks. This would only work for two tiers of backups, though. ③ You could just ignore the issue and hope for the best. After all, if a daily or hourly backup gets corrupted every now and then, you still have enough working backups…