The Many Uses of Rsync
First released in 1996, rsync (short for “remote sync”) is a versatile file-copying utility for Unix-based systems. For many admins, it is a bread-and-butter tool.
Rsync can be used to quickly move large amounts of data to both local and remote destinations. For this reason, rsync is often used to copy data, make backups, migrate hosts, and bridge the gap between site staging and production environments.
We’ll cover a few basic uses of rsync and walk through a few examples.
But first, why rsync?
You may be wondering why you would want to use rsync instead of the simpler ‘cp’ command. Most of the time, cp is perfectly adequate. But there are a few reasons to consider using rsync instead. One is that rsync skips files that are already up to date at the destination and, when copying over a network, sends only the delta (difference) for files that have changed, which can save a great deal of time and bandwidth. Another is that it can compress the data as it’s being transferred. These differences are particularly meaningful when making regular backups, copying large sites or applications, and especially when sending that data over a network.
Here are the tools:
- Our OS will be Ubuntu 16.04, though most Unix-like environments (including macOS) support rsync and often ship with it preinstalled. On Windows, you can use one of several high-quality wrappers.
- You’ll also need to be at least a little familiar with the Linux CLI and (very) basic server administration.
And now our examples:
We’ll start with the most basic rsync format for copying (syncing) files. To get started, open a terminal and execute the following command, replacing the file path info with yours:
rsync -av /path/to/directory1/ /path/to/directory2/
This will copy the contents of directory1 into directory2. An important detail here is the trailing slash (/) on the source path. In the example above, the contents of directory1 are copied into directory2, but a directory1 folder is not created inside directory2. To copy directory1 itself into directory2, drop the slash after directory1:
rsync -av /path/to/directory1 /path/to/directory2/
The flags:
-a: Archive mode. Copies files recursively and preserves owners, groups, symbolic links, file permissions, and timestamps.
-v: As with many other commands, this option asks for verbose output. This is especially useful when copying large amounts of data.
--delete: This flag isn’t used here, but it is a commonly used rsync option. It deletes any files or folders in the destination that aren’t at the source. Use with extreme caution!
--help: Prints a help page with useful information about using rsync. (-h does the same only when it is the sole option; combined with other options it asks for human-readable numbers instead.)
As you might imagine, rsync has many other useful options, and it’s worth checking them out when you have time.
Combine rsync with cron to create scheduled backups
Cron is a useful tool that could (and should) have its own article, but for now we’ll just cover a couple of basic functions so that you can automate your backups (or just about anything else, if you’re feeling adventurous). If you’d like to learn more about cron, check out this Media Temple community article.
To get started, open the crontab so that you can create a new job.
crontab -e
You may be asked to select an editor by pressing either 1 or 2. Make your selection and scroll to the end of the file. Cron’s syntax can seem confusing at first, but it’s actually fairly straightforward once you get your mind around it.
The scheduling works like this:
* * * * * command
The five asterisks (*) correspond to specific blocks of time:
Minute (0-59) Hour (0-23) Day of month (1-31) Month (1-12) Weekday (0-6, where 0 is Sunday) command
Use numbers in place of asterisks to dictate when the specified command will run. For instance, let’s assume that you want to schedule your backup to run every Monday at 11:15pm:
15 23 * * 1 rsync -av /path/to/directory1 /path/to/directory2/
That’s the 15th minute of the 23rd hour on weekday 1 (Monday, since Sunday is 0) of each week.
Or perhaps you need it to run each evening at 8:30pm:
30 20 * * * rsync -av /path/to/directory1 /path/to/directory2/
That’s the 30th minute of the 20th hour of each day.
To help you get the hang of it, there’s a nifty tool located here that breaks the syntax down nicely.
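If you’d rather check an entry from the terminal, here is a rough sketch of a helper that labels each field of a crontab line. The describe_cron function is hypothetical (it is not part of cron) and it only splits the line; it does not validate the values.

```shell
#!/bin/sh
# describe_cron is a hypothetical helper, not part of cron itself.
describe_cron() {
    # read splits the line on whitespace; the last variable (cmd)
    # receives everything left over, i.e. the command to run.
    printf '%s\n' "$1" | {
        read -r minute hour day_of_month month weekday cmd
        printf 'minute=%s hour=%s day_of_month=%s month=%s weekday=%s\n' \
            "$minute" "$hour" "$day_of_month" "$month" "$weekday"
        printf 'command=%s\n' "$cmd"
    }
}

describe_cron '15 23 * * 1 rsync -av /path/to/directory1 /path/to/directory2/'
```

For the Monday example, this prints minute=15, hour=23, and weekday=1, with asterisks for the day of the month and the month, followed by the rsync command on its own line.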
Sending data to a different host
Now that we know how to move data around locally with rsync and create scheduled backups with cron, let’s look at what rsync does really well: moving data across networks. This is useful for a variety of reasons. Perhaps you need to migrate hosts, or you’d like to create remote backups of files on your computer, or move a website from a staging/testing environment to a public-facing web server.
In any case where you need to move large amounts of data across a network, rsync is perfect.
- Remote backups – It’s always a good idea to have external backups of data. This helps protect you from events like hardware failure.
- Migrating Hosts – Have you outgrown your current web hosting configuration and are ready to move on to something bigger? A common way to move your site is to download its data and then upload it to the new host using FTP or SFTP. But with a single rsync command you can move that data directly to the new host, possibly saving a large amount of time.
- Staging > Production – A common practice in web development is to create a staging site where changes are tested before being pushed to the production server. This helps you avoid introducing errors into your website, and can save you a lot of downtime. Rsync is great for pushing this data. Of course, this setup requires two separate server environments, which may not be practical for some. For those WordPress users out there we’ve made this process incredibly easy in our Managed WordPress Hosting service.
To move data from one host to another, use the structure in the command below. Don’t forget to replace ‘user’ with your remote user and ‘remote_host’ with the IP address or hostname of the remote host that you’ll be transferring data to:
rsync -avz path/to/local/directory1 user@remote_host:/path/to/remote/directory2/
To download from a remote directory, simply reverse the order:
rsync -avz user@remote_host:/path/to/remote/directory1 path/to/local/directory2/
- Unless you’re using ssh keys to connect to the remote server, you’ll be prompted for the remote user’s password. That remote user must also have write permissions for the target directory.
- You may want to add the --delete option when using rsync to make updates to a site. This will remove any files or folders from the destination that are not at the source. Use caution! A misstep could erase needed files.
- The -z option specifies compression during the transfer process. This is useful when sending large amounts of data across the network.
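- In the event of a failed transfer, such as a dropped connection, rerun the command with the --append option. This resumes a partially transferred file by appending the missing data instead of starting over, on the assumption that the part already at the destination matches the source. Because rsync normally discards partial files when a transfer is interrupted, add --partial to the original transfer so that there is something to resume. This is especially useful for very large transfers, which are much more likely to fail partway through.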
If you’re using a port other than the default 22 for ssh (a good security practice), specify it in the command with the -e option:
rsync -avz -e 'ssh -p 1234' path/to/directory1 user@remote_host:/path/to/directory2/
Rsync works very well for migrating data from one host to another, but if you’re creating remote backups that you won’t need to access on a regular basis, you may want to consider archiving the files using the ‘tar’ command or a similar utility prior to sending them out. This can significantly reduce the amount of storage space and bandwidth used:
tar -zcvf backup1.tar.gz path/to/files/
rsync -avz --remove-source-files backup1.tar.gz user@remote_host:/path/to/backups/
- The ‘--remove-source-files’ option deletes the local tar file once it has been transferred successfully, so that you don’t end up with several unneeded local backups.
If you’re feeling a little fancy, you can also easily create dated backups. The tar command below names the compressed file based on the current date and rsync grabs the file and syncs it with your backup server.
tar -zcvf "$(date '+%y-%m-%d').tar.gz" path/to/files/
rsync -avz --remove-source-files "$(date '+%y-%m-%d').tar.gz" user@remote_host:/path/to/backups/
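One subtlety with dated backups is that calling date twice can produce two different names if the job happens to run across midnight. A minimal sketch that avoids this by computing the date stamp once and reusing it is shown below. The workdir, stamp, and archive names are illustrative, and the upload step is left commented out because it needs a real remote host.

```shell
#!/bin/sh
# Throwaway files for the demo; all names here are hypothetical.
workdir=$(mktemp -d)
mkdir -p "$workdir/files"
echo "hello" > "$workdir/files/index.html"

# Compute the date stamp once so tar and rsync always agree on the file name,
# even if the backup happens to run across midnight.
stamp=$(date '+%y-%m-%d')
archive="$workdir/$stamp.tar.gz"

# -C switches into workdir first so the archive stores relative paths.
tar -zcf "$archive" -C "$workdir" files

# List the archive contents to confirm the backup is readable.
tar -ztf "$archive"

# The upload step would then reuse the same variable (not run here, since it
# needs a real remote host; user@remote_host is a placeholder):
# rsync -avz --remove-source-files "$archive" user@remote_host:/path/to/backups/
```

Keeping the file name in one variable also makes it easy to change the naming scheme later without editing two commands.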
It’s also possible to create automated remote backups over ssh. To do that, generate an ssh key pair that doesn’t require a passphrase to use. If you need help doing this, Media Temple has a helpful article that can have you set up in less than 20 minutes. Once you’ve created your keys, simply add the command to your crontab:
crontab -e
* * * * * rsync -avz path/to/directory1/ user@remote_host:/path/to/directory2/
Replace the asterisks with your own schedule; as written, this job would run every minute.
In Conclusion
These examples should get you started with rsync. It’s one of the more flexible commands (just check out the man page) and is easily adapted to more specific uses. To that point, if you’re an old hand at rsync and have a few tricks of the trade, let us know and we’ll get them added. Happy rsyncing.