Copy and synchronize backup data with RSYNC and public key authentication


Rsync is a fast and versatile command line utility that synchronizes files and folders between two locations over a remote shell, or from/to a remote Rsync daemon. It provides fast incremental file transfer by transferring only the differences between the source and the destination.

In this guide we will explain how to set up and automate an Rsync script that pulls data from the source to the destination.

 

1. Generate Keypair

Generate a public and private key pair on the destination server, that is, the server that pulls the backup data and stores it. We recommend this pull method; the push method has a number of security implications that need to be considered.

ssh-keygen

Enter passphrase (empty for no passphrase):
Enter same passphrase again:
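
For unattended setups the same key pair can probably be generated without the interactive prompts. The following is only a sketch under assumptions: the scratch directory (key_dir, a name chosen here for illustration) is used so that an existing key in ~/.ssh is not overwritten, and the 4096-bit RSA key with an empty passphrase is one possible choice, not a requirement of this guide.

```shell
#!/bin/sh
# Scratch directory for illustration only; in practice the key lives in ~/.ssh.
key_dir=$(mktemp -d)

# -q             : suppress the progress output
# -t rsa -b 4096 : RSA key of 4096 bits (matches the id_rsa file name used in this guide)
# -N ""          : empty passphrase, so a scheduled job can use the key unattended
# -f             : private key file; the public key is written to the same path plus .pub
ssh-keygen -q -t rsa -b 4096 -N "" -f "$key_dir/id_rsa"

# Print the fingerprint of the new public key.
ssh-keygen -l -f "$key_dir/id_rsa.pub"
```

An empty passphrase means that anyone who can read the private key file can use it, so keep the file readable by its owner only.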

 

2. Copy the public key to the server that is the source of your data

ssh-copy-id -i ~/.ssh/id_rsa.pub ip.of.data.source

Alternatively, manually copy the contents of /root/.ssh/id_rsa.pub that you just generated on the destination server and paste them into the /root/.ssh/authorized_keys file on the source server. This comes in handy if password authentication for SSH is disabled on the source server, because ssh-copy-id needs to log in there.
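
The manual route could look roughly like the sketch below. It is an illustration under assumptions: home_dir is a scratch directory standing in for root's home on the source server, pub_key holds a made-up placeholder rather than a real key, and add_key is a hypothetical helper name.

```shell
#!/bin/sh
# Stand-in for the home directory on the source server (illustration only).
home_dir=$(mktemp -d)
# Placeholder key line; paste the real contents of id_rsa.pub here.
pub_key='ssh-rsa AAAAB3NzaC1yc2EXAMPLEONLY root@destination'

# With sshd's default StrictModes setting, keys are refused when these
# paths are writable by anyone other than the owner.
mkdir -p "$home_dir/.ssh"
chmod 700 "$home_dir/.ssh"
touch "$home_dir/.ssh/authorized_keys"
chmod 600 "$home_dir/.ssh/authorized_keys"

# Append the key only if that exact line is not present yet.
add_key() {
    grep -qxF "$pub_key" "$home_dir/.ssh/authorized_keys" ||
        printf '%s\n' "$pub_key" >> "$home_dir/.ssh/authorized_keys"
}

add_key
add_key   # the second call changes nothing
```

Checking for the line before appending keeps authorized_keys free of duplicates when the step is repeated.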

 

3. Set up the rsync script

The example below sets up Rsync to pull data from a remote server and synchronize it with the configured local folder. In addition, bandwidth is limited to 50000 KB/s, any folder on the source whose name starts with "backup" is excluded from synchronization, and a dated backup log file is written.

rsync -av --delete --bwlimit=50000 --exclude 'backup*/' --log-file=/home/rsync-backup-log-$(date +"%Y-%m-%d").log -e 'ssh -p 22' root@ip.of.data.source:/source/folder/ /destination/folder/

 

4. Set up a cron job

Navigate to the /etc/cron.d/ folder, create a file named backup-cron, and paste in the rsync command that we created earlier, prefixed with the cron schedule and the user that runs it.

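00 10,20 * * * root rsync -av --delete --bwlimit=50000 --exclude 'exclude/folder/of/choice*/' --log-file=/home/rsync-backup-log-$(date +"\%Y-\%m-\%d").log -e 'ssh -p 22' root@ip.of.data.source:/source/folder/ /destination/folder/

Make sure to adjust the schedule to meet your requirements. In the above example the command runs twice a day, at 10:00 and at 20:00, as the root user. Note that cron treats an unescaped % as a line break, so each % in the date format is written as \% here. The trailing slash on /source/folder/ makes rsync copy the contents of the folder, as in step 3, instead of creating a nested "folder" directory inside the destination.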

 

Commonly used rsync flags

-a archive mode; equals -rlptgoD (no -H, -A, -X). Essential for backups: it recurses into folders and preserves symlinks, permissions, modification times, group, owner, and device and special files.
-c skip based on checksum, not mod-time & size. More reliable, but slower. Omit this flag if you want faster backups, but note that a file that changed without a change in modification time or size will then not be detected and not be included in the backup.
-h output numbers in a human-readable format.
-v increase verbosity for logging.
-n or --dry-run perform a trial run that makes no changes, which lets you double-check your arguments before executing an rsync command for real. The -v flag (for verbose) is also necessary to get the appropriate output: rsync -anv dir1/ dir2
-R or --relative use relative path names, which creates the same folder structure on the server.
-P combines the flags --progress and --partial. The first shows the progress of the transfers and the second keeps partially transferred files so that interrupted transfers can be resumed.
-z compress file data during the transfer. Less data is transmitted, but more CPU time is needed. Omit this flag when the backup target is a local device or a machine in the local network (or when you have high bandwidth to a remote machine).
--progress show progress per file during transfer. Only for interactive usage.
--timeout set I/O timeout in seconds. If no data is transferred for the specified time, the backup will be aborted.
--delete delete extraneous files from destination directories. Required for master-slave backups, where the destination must mirror the source exactly.
--link-dest hard-link to files in the specified directory when they are unchanged, which reduces the storage used by files duplicated between backups.
--log-file log what rsync is doing to the specified file. Example: --log-file=$HOME/public_html/rsynclogs/rsync-backup-log-$(date +"%Y-%m-%d").log
--chmod affect file and/or directory permissions.
--exclude exclude files matching pattern.
--exclude-from same as --exclude, but reads the patterns from the specified file.
--bwlimit limits I/O bandwidth. The value is given in KBytes per second. For example, to limit I/O bandwidth to 10000 KB/s (roughly 9.8 MB/s), enter: rsync --delete --numeric-ids --relative --delete-excluded --bwlimit=10000 followed by your source and destination.

 

Used only for remote backups

--no-W ensures that rsync's delta-transfer algorithm is used, so only the changed parts of a file are transferred when the file is already present at the target. Omit it only when you have high bandwidth to the target, in which case the backup may be faster.
--partial-dir put a partially transferred file into the specified directory instead of leaving it in the original path of the transferred file. Required if you want to allow partial transfers without mistaking incomplete or corrupt files for finished ones.

 

Used only for local backups

-W disables rsync's delta-transfer algorithm, so it always transfers whole files. When you have high bandwidth to the target (local filesystem or LAN), the backup may be faster.

 

Used only for system backups

-A preserve ACLs (implies -p).

 

Used only for log sending

-r recurse into directories
--remove-source-files sender removes synchronized files (non-dir).
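
How these two flags combine for log sending can be tried locally. This is a sketch under assumptions: the scratch directory and the app/logs and collector folder names are hypothetical stand-ins for an application's log folder and a log collection target.

```shell
#!/bin/sh
# Scratch folders standing in for an application's logs and a collection target.
work=$(mktemp -d)
mkdir -p "$work/app/logs/archive" "$work/collector"
echo "line 1" > "$work/app/logs/app.log"
echo "line 2" > "$work/app/logs/archive/app.log.1"

# -r recurses into subdirectories; --remove-source-files deletes each source
# file after it has been transferred. Directories on the sender stay in place.
rsync -r --remove-source-files "$work/app/logs/" "$work/collector/"

# Show what is left on both sides.
find "$work" -print
```

Afterwards the log files exist only under collector, while the empty logs and logs/archive folders remain on the sending side. This is best applied to files that are no longer being written, such as rotated logs.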