Terminal Tip: Scheduling an RSYNC-based backup over SSH with a CRON job

Over the last couple of weeks I’ve been struggling with our web server. You may have noticed extremely slow load times every now and again. The majority of this has been caused by an insane traffic boost over the last two months, but some of it has to do with permission problems.

I’ve always been pretty paranoid about my backup strategy locally, but when it came to the server I was a lot more cavalier about my approach. I figured with all the server load problems it would be a good time to double and triple check my backup strategy.

Then it dawned on me: RSYNC over SSH, initiated by a CRON job on my Mac Pro, would be an extremely geeky solution. Now, if those are foreign terms for you, you may want to abort this post and move on. But if you know what those things are in theory, then carry on.

So, OS X is built on a Unix core, and that means that we have access to all kinds of cool Unix commands. CRON is one of those things that comes built into OS X. You can use CRON to schedule tasks on your system. In my case, I’m using it to remotely connect to the web server and back up our website, as well as other important files, to my Mac Pro at home. My ISP probably hates me, but the load isn’t that bad the second and third time I run RSYNC. In essence, RSYNC syncs your backups. My home computer is literally a mirror of the server; whatever happens to a file on the server happens to the copy at home. Subsequent backups take less bandwidth and time because the only things being transferred after the first run are files that have changed since the last backup. If you’re looking for versioned backups that keep older copies of your files, this step by step won’t help you. You’ll need some additional steps.

Setting up RSYNC to work over SSH

The first thing we need to do is make sure that our local machine can connect to the server without a password. Normally when you connect to a server through SSH, you’re prompted for a password. We can bypass that by generating a pair of keys: the private key stays on your local machine, and the public key resides on the server. You can check out a post on Tech Talk Point that will walk you through those steps.
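For reference, the key setup can be sketched roughly like this. It’s a minimal sketch, not the exact steps from the linked post: the key name `backup_key`, the empty passphrase, the scratch folder, and the `jschnell@123.123.123.123` login are all assumptions for illustration. It generates a key pair and prints the one-time command that would install the public half on the server.

```shell
#!/bin/sh
# Sketch: create a key pair for unattended backups.
# Assumptions: key name "backup_key", empty passphrase, placeholder login.
REMOTE="jschnell@123.123.123.123"   # hypothetical user@server
KEY_DIR="$(mktemp -d)"              # scratch folder; in practice you'd use ~/.ssh
KEY="$KEY_DIR/backup_key"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -N "" sets an empty passphrase so a scheduled job can use the key unattended.
    ssh-keygen -q -t rsa -b 4096 -N "" -f "$KEY"
fi

# The private key ($KEY) stays on the local machine. Only the public
# key ($KEY.pub) gets appended to authorized_keys on the server.
INSTALL_CMD="cat $KEY.pub | ssh $REMOTE 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'"
echo "Run this once (it asks for your password one last time):"
echo "$INSTALL_CMD"
```

Because the key has no passphrase, anyone who gets hold of the private key file can log in as you, so keep it somewhere only your user account can read.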

Once you’re able to log into your server without entering a password, you’re a third of the way there. The first time I set up SSH keys for the server it took me a couple of tries to get it right.

Now for the geeky stuff

Here comes the backup command to get RSYNC to connect to your Unix-based server over SSH. It looks scary, but we’ll break it down.

rsync --delete -ave ssh username@servername:/location/onserver /location/on/localmachine/

  1. ‘rsync --delete -ave’, this section of the code is essentially telling RSYNC to copy the files in archive mode (-a), which keeps permissions, timestamps, and folder structure intact; to show you on the screen what’s being copied (-v); and to use the remote shell named right after it (-e), which in our case is ssh. The ‘--delete’ section of the command is telling RSYNC to delete any file that is on the local machine, but not on the remote machine. This is how the syncing occurs. For instance, if I delete a file directly on the server, RSYNC will make sure that the same file is deleted on my local machine. Note: I wanted exact copies in two locations; that’s why I’m using the --delete option.
  2. ‘ssh username@servername:/location/onserver /location/on/localmachine/’, the ‘ssh’ here belongs to the -e flag and tells RSYNC to connect to your remote server using SSH. ‘username@servername’ would be translated into something like ‘jschnell@123.123.123.123’, so make sure that you have that information handy. Then make sure you have the colon, followed by the absolute path to the folder you want to back up. Once you’ve made it this far, you want to hit the spacebar, then enter the absolute location on your local computer where you want the files to be backed up.
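If you want to see what --delete does before pointing it at a real server, the same flags can be tried on two throwaway local folders. This is only a sketch, assuming RSYNC is installed; the folder and file names are made up for the demo, and there is no SSH involved because both paths are local.

```shell
#!/bin/sh
# Demo of -a, -v and --delete on two scratch folders (names are made up).
WORK="$(mktemp -d)"
mkdir -p "$WORK/server" "$WORK/local"
echo "kept"  > "$WORK/server/index.html"
echo "stale" > "$WORK/local/old-file.txt"   # exists only in the backup copy

if command -v rsync >/dev/null 2>&1; then
    # Trailing slashes: copy the *contents* of server/ into local/.
    # --delete then removes old-file.txt because server/ does not have it.
    rsync --delete -av "$WORK/server/" "$WORK/local/"
    ls "$WORK/local"
fi
```

Note the trailing slash on the source folder. Without it, RSYNC creates a folder named after the source inside the destination instead of copying the contents straight in.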

Here’s an example of the command in its entirety (it should be all on one line):

rsync --delete -ave ssh jschnell@123.123.123.123:/var/www/vhosts/website.com /Users/jschnell/website.com
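If the bundled ‘-ave’ flags are hard to read, the same command can be spelled out with RSYNC’s long option names. The sketch below only builds and prints the command; the variable names are my own, and the login and paths are the example values from above.

```shell
#!/bin/sh
# Same command with long option names, assembled from variables.
# --archive = -a, --verbose = -v, --rsh=ssh = -e ssh
REMOTE_USER="jschnell"
REMOTE_HOST="123.123.123.123"
REMOTE_PATH="/var/www/vhosts/website.com"
LOCAL_PATH="/Users/jschnell/website.com"

BACKUP_CMD="rsync --delete --archive --verbose --rsh=ssh ${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_PATH} ${LOCAL_PATH}"
echo "$BACKUP_CMD"
```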

We’re now two-thirds through the process.

Scheduling a CRON job to carry out the task

Once you’ve got the command all worked out, and you’re able to remotely connect to your server without having to enter a password, you’re able to automate the process with CRON and CRONTAB. Before doing this, we’d recommend running your RSYNC command in the terminal to test it out. Depending on the size of your backup, it may take a while to complete.
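One cautious way to do that test run is RSYNC’s --dry-run flag, which lists what would be copied or deleted without changing anything. A minimal sketch, assuming the example login and paths from above; `preview_backup` is a name I made up for the helper.

```shell
#!/bin/sh
# Preview a backup without changing any files.
SRC="jschnell@123.123.123.123:/var/www/vhosts/website.com"   # example login from above
DEST="/Users/jschnell/website.com"

preview_backup() {
    # --dry-run (short form: -n) reports what would happen, but does nothing.
    rsync --dry-run --delete -ave ssh "$SRC" "$DEST"
}

# Nothing runs until you call the function yourself:
#   preview_backup
```

If the list of files marked for deletion looks wrong, fix the paths before running the real command, since --delete removes local files for good.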

  1. Okay, now comes the scheduling. For simplicity’s sake, go to a Crontab code generator website (follow the link) and paste your command into the command box. Now pick what criteria you want. In this example we have the site backing up once a day. We’ve chosen 45 in the minute column, 4pm in the hour column, and then left the rest set to every day.
  2. Now click the create crontab button. Copy the results (CMD+C) into a text file so you can grab it later.
  3. Open the terminal on your local machine.
  4. Type “crontab -e”.
  5. This will open your scheduled CRON jobs in the terminal’s text editor (if you have no jobs yet, it should be an empty page, with a bunch of ‘~’ running down the side.)
  6. Push the ‘i’ key. This will put you into “insert” mode so you can paste your line of code from the crontab generator.
  7. Now paste the code from the generator (CMD+V).
  8. Hit the ESC key. This will end the insert mode.
  9. Now type ‘:x’ and press Return. This will save the file and exit the terminal editor.
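The line you paste in step 7 should look roughly like the one this sketch prints. The five leading fields are minute, hour, day of month, month, and day of week, and an asterisk means “every”. `cron_line` is a hypothetical helper written only for this illustration; it isn’t part of CRON.

```shell
#!/bin/sh
# cron_line MINUTE HOUR COMMAND -> a crontab entry that runs every day.
# (Hypothetical helper for illustration; it is not part of cron.)
cron_line() {
    printf '%s %s * * * %s\n' "$1" "$2" "$3"
}

BACKUP_CMD="rsync --delete -ave ssh jschnell@123.123.123.123:/var/www/vhosts/website.com /Users/jschnell/website.com"

# Minute 45, hour 16 (cron uses a 24-hour clock, so 16 is 4pm).
ENTRY="$(cron_line 45 16 "$BACKUP_CMD")"
echo "$ENTRY"
```

Once the file is saved, typing “crontab -l” in the terminal lists your scheduled jobs, which is a quick way to confirm the entry took.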

You’ve now officially scheduled your RSYNC backup to occur automatically every day at 4:45PM.

BOOM. Backups scheduled.

Some Notes

If you’re not comfortable with the terminal, then you may want to find someone to assist you. We’re only providing this tutorial for your information. If you break something, you’re on your own. Also, you should always back up your system before messing with the terminal in any capacity, especially if you’re new to this line of work.

Also, the Macgasm server backup is running every 30 minutes so that we have a relatively fresh backup of the files on the server. You can increase or decrease the frequency of your backups as you see fit. Just change it in the Crontab Code Generator.
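For the curious, a 30 minute schedule would look something like the entry this sketch prints. It’s an illustration using the example command from earlier, not a copy of our actual crontab.

```shell
#!/bin/sh
# A crontab entry for a backup every 30 minutes (example command from above).
BACKUP_CMD="rsync --delete -ave ssh jschnell@123.123.123.123:/var/www/vhosts/website.com /Users/jschnell/website.com"

# "*/30" in the minute field means every 30th minute: at :00 and :30.
ENTRY="*/30 * * * * $BACKUP_CMD"
echo "$ENTRY"
```

Writing the minute field as “0,30” does the same thing, if you find that easier to read.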

Some Additional Reading

  • Read up on SSH
  • Read up on RSYNC
  • Read up on CRONTAB
Joshua is the Content Marketing Manager at BuySellAds. He’s also the founder of Macgasm.net. And since all that doesn’t quite give him enough content to wrangle, he’s also a technology journalist in his spare time, with bylines at PCWorld, Macworld…