How to set up a simple backup system using Duply/Duplicity

This guide will show how to set up a simple backup system using Duply that creates encrypted incremental backups of client data and stores them both on a central backup server and on a secondary offsite server. All backups are encrypted by the clients using GPG, so no data can be compromised even if someone gains access to the backup servers. The guide assumes that Debian is used, but the instructions should work on any Linux distribution with only minor changes.

[Figure: Duply backup architecture]

The guide provides step-by-step instructions for setting up a backup server, a secondary server, and a single client. The following names are used to identify them.

  • Client: client.example.com
  • Backup server: server.example.com
  • Offsite server: offsite.example.com

Duply/Duplicity

Duply is a front end to the Duplicity backup application that gives you a more user-friendly interface and simplifies common tasks. Duplicity isn’t the most powerful backup application available, but it is a very good choice when you don’t need the functionality of, for example, Amanda or Bacula. The strength of Duplicity is its simplicity and ease of use, particularly when combined with Duply. You can, with a few commands and two simple configuration files, create a complete backup schedule with client side encryption.

Another advantage of using Duplicity is that everything is handled by the client. All you need from the server is the ability to store files on it. Other, more advanced, backup solutions typically use some form of backup and scheduling daemon on the server. With Duplicity it is enough if you have remote SSH access to an account on the server; everything else is handled client side. This is particularly useful when backing up desktops and laptops that are not always online.

There are of course disadvantages to using Duplicity. Since Duplicity doesn’t use a central server for managing backups it will not scale well. Each client is responsible for what to back up and when to back it up, and must be configured and managed separately. This makes Duplicity an excellent choice for small networks where there are only a handful of machines to back up, but for larger networks you may want to use something else.

Step 1: Install Duply (client.example.com)

First install Duply and Duplicity on the client.

# apt-get install duply

It is possible to do everything in this guide with only Duplicity. Duply simply provides an interface for Duplicity that is easier to work with. If, for some reason, you don’t want to use Duply you could instead use Duplicity directly.

Step 2: Add a backup folder on the server (server.example.com)

You need an account on the server to store backups on. You could use a single account for backing up several client machines, but I would recommend that you create separate accounts for each client and only give them a restricted shell. This will limit the potential damage that could be done by compromised or misbehaving clients.

Step 2.1: Install RSSH

If you have not done so already, first install RSSH. RSSH provides a restricted shell that only allows a user to perform the actions needed for SFTP, SCP and rsync.

# apt-get install rssh

By default RSSH will not allow rsync to be used. We will need this later, so edit ‘/etc/rssh.conf’ and enable rsync access.

# set the log facility.  "LOG_USER" and "user" are equivalent.
logfacility = LOG_USER
 
allowscp
allowsftp
allowrsync
 
# set the default umask
umask = 027

Step 2.2: Create a client user account

Next you need to create a user account and a backup folder for the client. How you do this is up to you and your particular needs (and paranoia). I will give two examples of how you could set this up: a simple one that stores backups in the home directory of the user, and a slightly more complex one that should work better if you are backing up several clients.

Simple solution

A simple solution is to give each client a user account on the server and let each client store their backups in their home directories. This is easy to set up, but if you wish to copy all backups to a secondary offsite server you will have to either do this as root, or give the offsite server access to each client account separately.

When creating client user accounts on the server I recommend disabling password based login so that only key based SSH authentication is allowed, and to only give clients restricted shell access to the server.

# adduser backup-client --disabled-password --shell /usr/bin/rssh

For this example I’m using the ‘~/backup’ folder for storing backups.

# mkdir -p /home/backup-client/backup/
# chown backup-client:backup-client /home/backup-client/backup/

A word of warning: if you have more than one user on the server and you are using the default umask, other users will (probably) have read access to your backups. This is not a problem as long as the backups are encrypted; otherwise you should at the very least change the umask, or use the advanced solution.
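A simple alternative to changing the umask is to remove access for “other” users on the home directory itself. The sketch below demonstrates this on a scratch directory; on the server you would run the chmod on /home/backup-client instead.

```shell
# Sketch: remove access for "other" users so backups stored below the
# directory cannot be read by everyone. A scratch directory stands in
# for /home/backup-client here.
home=$(mktemp -d)        # replace with /home/backup-client on the server
chmod 750 "$home"
stat -c '%A' "$home"     # prints drwxr-x---
```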

Advanced solution

A somewhat more advanced solution is to give each client a separate user account on the server, but store all backups in a central backup folder and use ACLs to give a single user (in this case backup-duply) read access to all backups. This requires a few more steps, but makes it possible to copy all backups to a secondary server using only the one backup account.

First create the backup account.

# adduser backup-duply --disabled-password --shell /usr/bin/rssh

Create a backup folder.

# mkdir /var/backup

Use ACLs to give backup-duply read access to the backup folder and all subfolders. This will be inherited by new folders and backups when they are created.

# setfacl -R -d -m user:backup-duply:r-X /var/backup
# setfacl -R -m user:backup-duply:r-X /var/backup

Optionally, use ACLs to override the default umask and set more restrictive read permissions.

# setfacl -R -d -m other::--- /var/backup
# setfacl -R -m other::--- /var/backup/*

Create a user account for the client.

# adduser backup-client --disabled-password --shell /usr/bin/rssh

Then create a folder in the backup directory where the client can store backups.

# mkdir -p /var/backup/backup-client/
# chown backup-client:backup-client /var/backup/backup-client/

Step 3: Set up SSH access

You will need to give the client SSH access to the new user account on the server.

If the client does not already have an SSH key, you must first create one.

$ ssh-keygen

Save it in the default location (~/.ssh/id_rsa). You could also create a separate set of keys for Duply, but that will not be covered by this guide.  Since the key will be used non-interactively, you should give it an empty passphrase. This will create two keys. The private key id_rsa, and the public key id_rsa.pub. You must always keep the private key secret!

The next step is to add the public key to the list of authorized keys on the client’s user account on the server.

On the server, create the file ‘/home/backup-client/.ssh/authorized_keys’ if it does not already exist.

# mkdir /home/backup-client/.ssh/
# touch /home/backup-client/.ssh/authorized_keys
# chown -R backup-client:backup-client /home/backup-client/.ssh/

Next, copy the content of the client’s public key (id_rsa.pub) to the authorized_keys file on the server.
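Optionally, you can harden the entry with standard OpenSSH authorized_keys options, for example restricting where the key may be used from and disabling forwarding. The hostname pattern below is an example; the key material is the content of the client’s id_rsa.pub.

```
from="client.example.com",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-rsa AAAA... backup-client@client.example.com
```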

Finally, you must add the server’s public key (/etc/ssh/ssh_host_rsa_key.pub) to the client’s list of known hosts (~/.ssh/known_hosts). You can do this manually or, alternatively, let SSH offer to do it for you when you first connect to the server. Adding the key manually is more secure and is the recommended way if you do not trust the network between the client and the server. Until the keys are in place and the server and client can authenticate each other, a man-in-the-middle attack is possible.
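A sketch of the manual check: print the host key’s fingerprint on the server, fetch the key on the client with ssh-keyscan, and only trust it once the two fingerprints match. The commands against server.example.com are shown as comments since they need the real server; the throwaway key at the end just demonstrates the fingerprint command locally.

```shell
# On the server, print the fingerprint of the RSA host key:
#   ssh-keygen -lf /etc/ssh/ssh_host_rsa_key.pub
# On the client, fetch the key, append it to known_hosts, and show its
# fingerprint so you can compare the two before trusting the entry:
#   ssh-keyscan -t rsa server.example.com | tee -a ~/.ssh/known_hosts | ssh-keygen -lf -
# Local demonstration of the fingerprint command with a throwaway key:
d=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$d/key"
ssh-keygen -lf "$d/key.pub"
```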

To test that everything is working, try to connect to the server.

$ ssh backup-client@server.example.com

It should give you the following message.

This account is restricted by rssh.
Allowed commands: scp sftp rsync

Step 4: Create GPG keys (client.example.com)

If you want to encrypt or sign your backups you will need a GPG key. If you do not already have a key the next step will be to create one.

To create a GPG key use the command:

$ gpg --gen-key

Then choose:

(1) RSA and RSA (default)

You can use the defaults for the rest if you wish, but choose a strong passphrase. I recommend you use some form of password generator to create it. Make a copy of the passphrase and store it somewhere safe. You should also create a backup of ‘~/.gnupg/’. You will not be able to decrypt your backups if you lose them.

In the end you should get something like:

$ gpg --list-keys
/root/.gnupg/pubring.gpg
------------------------
pub   2048R/C8B4OI9S 2014-05-16
uid                  Your Name <Yourname@example.com>
sub   2048R/JF5PWSHP 2014-05-16

Here, C8B4OI9S is the name of the main signing key, and we have one subkey JF5PWSHP for encryption. These keys are enough to create encrypted and signed backups.
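One way to make that backup of the key material is to export the secret key and the ownertrust database to files you can store offline. This is a sketch using GnuPG 2.1+ options; a throwaway key in a scratch GNUPGHOME stands in for your real key so the example is self-contained.

```shell
# Sketch: export the secret key and ownertrust to files for offline storage.
# A throwaway key in a scratch GNUPGHOME stands in for your real key.
export GNUPGHOME=$(mktemp -d); chmod 700 "$GNUPGHOME"; cd "$GNUPGHOME"
gpg --batch --pinentry-mode loopback --passphrase '' \
    --quick-generate-key "Your Name <Yourname@example.com>" rsa2048
gpg --batch --pinentry-mode loopback --passphrase '' \
    --export-secret-keys --armor Yourname@example.com > secret-key.asc
gpg --export-ownertrust > ownertrust.txt
head -n 1 secret-key.asc    # -----BEGIN PGP PRIVATE KEY BLOCK-----
```

To restore on a new machine you would import secret-key.asc with ‘gpg --import’ and the trust database with ‘gpg --import-ownertrust’.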

Step 5: Create a backup profile (client.example.com)

It is now time to set up a Duply backup profile. The following command will create a new profile named “backup_all”.

$ duply backup_all create

This will create a folder ‘~/.duply/backup_all/’, where all relevant configuration files are stored. The backup schedule and other options are listed in ‘~/.duply/backup_all/conf’. The following is a sample configuration file using the GPG keys created earlier.

#GPG encryption key
GPG_KEYS_ENC='JF5PWSHP'
 
#GPG signing key
GPG_KEY_SIGN='C8B4OI9S'
 
#GPG passphrase
GPG_PW='YOUR_PASSPHRASE'
GPG_PW_SIGN='YOUR_PASSPHRASE'
 
#Compress backups using bzip2
GPG_OPTS="--compress-algo=bzip2 --bzip2-compress-level=9"
 
#Send backups to server.example.com using sftp over SSH. Uncomment the one matching your setup.
 
#Simple solution. Store backups in ~/backup
#TARGET='sftp://backup-client@server.example.com/backup/'
 
#Advanced solution. Store backups in /var/backup/
#TARGET='sftp://backup-client@server.example.com//var/backup/backup-client/'
 
# base directory to back up
SOURCE='/'
 
# Time frame for old backups to keep. Used for the "purge" command.
# (see duplicity man page, chapter TIME_FORMATS)
# defaults to 1M, if not set
MAX_AGE=6M
 
# Number of full backups to keep. Used for the "purge-full" command.
# See duplicity man page, action "remove-all-but-n-full".
# defaults to 1, if not set
MAX_FULL_BACKUPS=3
 
# activates duplicity --full-if-older-than option (since duplicity v0.4.4.RC3)
# forces a full backup if last full backup reaches a specified age, for the
# format of MAX_FULLBKP_AGE see duplicity man page, chapter TIME_FORMATS
# The following two lines enable this setting; comment them out to disable it.
MAX_FULLBKP_AGE=1M
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE "
 
# verbosity of output (error 0, warning 1-2, notice 3-4, info 5-8, debug 9)
# default is 4, if not set
VERBOSITY=5
 
# temporary file space. at least the size of the biggest file in backup
# for a successful restoration process. (default is '/tmp', if not set)
TEMP_DIR=/tmp
 
# more duplicity command line options can be added in the following way
# don't forget to leave a separating space char at the end
DUPL_PARAMS="$DUPL_PARAMS --extra-clean --num-retries 1"

Next, you will have to create an exclude file ‘~/.duply/backup_all/exclude’. This file tells Duplicity which files to back up, and which to ignore.

The following is a sample exclude file that will back up all files in ‘/etc’, ‘/home’ and ‘/var/www’, except for temporary files.

- **/*~
+ /etc
+ /home
+ /var/www
- **

The format is very simple. Duplicity will check each file against the rules in this file, starting from the top, until it finds a rule that matches. If the rule is preceded by a ‘+’ the file will be backed up, and if it is preceded by a ‘-‘ it will be ignored. For example, the file ‘/etc/resolv.conf’ will match the rule ‘+ /etc’ and be backed up, while ‘/etc/resolv.conf~’ will match the rule ‘- **/*~’ and be ignored. The last rule ‘- **’ tells duplicity to ignore all files that didn’t match an earlier rule.
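The first-match logic can be sketched in plain shell, where a case pattern plays the role of each rule. In a case pattern ‘*’ matches any characters including ‘/’, so it roughly corresponds to duplicity’s ‘**’, and the ‘/etc/*’ alternates approximate the fact that ‘+ /etc’ also includes everything below /etc. This is an illustration of the matching order only, not duplicity’s actual glob engine.

```shell
# Illustration only: evaluate the sample rules top to bottom, first match wins.
classify() {
  case "$1" in
    */*~ | *~)              echo ignore ;;   # - **/*~
    /etc | /etc/*)          echo backup ;;   # + /etc
    /home | /home/*)        echo backup ;;   # + /home
    /var/www | /var/www/*)  echo backup ;;   # + /var/www
    *)                      echo ignore ;;   # - **
  esac
}
classify /etc/resolv.conf     # backup
classify /etc/resolv.conf~    # ignore
classify /usr/bin/duply       # ignore
```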

To test that everything is working, run:

$ duply backup_all backup

If this finishes without errors then all that is left to do is to create a cron job. You should also make a backup of ‘~/.duply/’. You will need these configuration files when restoring backups.

Run ‘crontab -e’ and add the line:

@daily /usr/bin/duply backup_all backup_cleanup_purge --force > /dev/null
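The batch name chains duply commands with underscores, so the line above runs backup, then cleanup, then purge in a single invocation. If you prefer, the steps can be scheduled separately; the crontab fragment below is one possible split (adjust the schedule to taste).

```
@daily  /usr/bin/duply backup_all backup > /dev/null
@weekly /usr/bin/duply backup_all cleanup --force > /dev/null
@weekly /usr/bin/duply backup_all purge --force > /dev/null
```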

Step 6: Copy backups to an offsite server (offsite.example.com)

You should always have at least one additional copy of all backups stored offsite. A simple way to do this is to use rsync to copy all backup data to a secondary server. This can be done in several ways depending on which solution you chose in step 2, and on your own requirements. I will give two examples where the offsite server connects to and pulls the backup data from the primary server. You can also do the opposite and let the primary server connect to the offsite server and push data to it.

Simple solution

If you followed the earlier steps for the simple solution you need to give the offsite server SSH access to each client account on the primary server. Assuming that the previous steps have been taken to give the client SSH access to the server, all you need to do is copy the contents of ‘~/.ssh/id_rsa.pub’ from the offsite server to the authorized_keys file in each client account on the primary server.

Next, add the following cron job to the offsite server for each client account.

@daily /usr/bin/rsync -a --delete backup-client@server.example.com:/home/backup-client/backup/ /your/backup/folder/backup-client/

Before adding this line to crontab you should try the command with ‘--dry-run -v’ enabled to see that it does what you intended. You may also want to leave out ‘--delete’ if you don’t want old backups to be removed from the offsite server once they have been deleted from the primary server.

This solution is quite simple to implement when you only have one or two machines backed up to separate user accounts on the server, but it quickly becomes impractical since you will have to remember to update crontab each time you add or remove a client. This could also create a potential security risk since the offsite server will have write permissions for the backups on the primary server. If the offsite server were to be compromised it could potentially delete backups from the primary server.

Advanced solution

If you followed the earlier steps for the advanced solution you need to give the offsite server SSH access to the backup-duply account. This is done in the same way as in the simple solution described above.

Next, add the following cron job to the offsite server.

@daily /usr/bin/rsync -a --delete backup-duply@server.example.com:/var/backup/ /your/backup/folder/

Before adding this line to crontab you should try the command with ‘--dry-run -v’ enabled to see that it does what you intended. You may also want to leave out ‘--delete’ if you don’t want old backups to be removed from the offsite server once they have been deleted from the primary server.

This solution will create a copy of all backups for all clients with a single command. It also only gives the offsite server read permission for the backups on the primary server. Even if the offsite server were to be compromised the backups on the primary server would be safe.
