Scheduled Backups on Ubuntu using Duplicacy and Azure
This page is a guide on how to back up a folder on an Ubuntu machine to an Azure storage account using Duplicacy. The data backed up to Azure will be encrypted with an RSA key, and the config encrypted with a password. The backups are scheduled via a cron job and send you an email on every run (this requires an SMTP server; SendGrid is a decent option). This assumes you already have a machine set up.
1. Create a Storage Account
- Go to the Azure Portal and create a new storage account. The name, region, resource group, etc. don’t matter. Feel free to use Azure’s redundancy features for extra backup security (assuming you trust Azure not to go down, which is a fairly safe assumption).
- Create a container within the storage account, form the URL
azure://STORAGEACCOUNT/CONTAINER
and save it for later.
- Get one of the access keys and save it for later.
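As a small illustration of the URL format from this step, here is a shell sketch that builds the storage URL and rejects names Azure would not accept. The function name azure_storage_url and the example names mybackups and documents are made up for this example. The checks follow Azure's naming rules (storage accounts: 3 to 24 lowercase letters and digits; containers: 3 to 63 lowercase letters, digits, and single hyphens).

```shell
#!/bin/sh
# Hypothetical helper: print the Duplicacy storage URL for an Azure
# storage account and container, after basic name validation.
# Usage: azure_storage_url STORAGEACCOUNT CONTAINER
azure_storage_url() {
    account=$1
    container=$2

    # Storage account names: 3-24 characters, lowercase letters and digits only.
    if ! printf '%s' "$account" | grep -Eq '^[a-z0-9]{3,24}$'; then
        echo "invalid storage account name: $account" >&2
        return 1
    fi

    # Container names: 3-63 characters, lowercase letters, digits and hyphens,
    # with no leading, trailing, or consecutive hyphens.
    if [ "${#container}" -lt 3 ] || [ "${#container}" -gt 63 ] ||
       ! printf '%s' "$container" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*$'; then
        echo "invalid container name: $container" >&2
        return 1
    fi

    printf 'azure://%s/%s\n' "$account" "$container"
}

# Example with made-up names.
azure_storage_url mybackups documents
```

The printed value is what you pass as storage_url when initializing the repository.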
2. Set up duplicacy
- Grab the latest release of duplicacy, make it executable with
chmod +x
and then move it into /usr/local/bin.
- Go into the folder you want to back up and run
duplicacy init -e -key public.pem repository_id storage_url
  - public.pem can be any public RSA key.
  - repository_id is just the name of the backup repository. Choose whatever you like but remember it, as you’ll need it if you ever want to pull the backup down onto another machine.
  - storage_url is the URL formed in step 1.2.
  - When you run this command, it will ask for a password because of the -e flag. This password encrypts the config, so don’t lose it. It will also ask for the access key to the storage account.
- Now you can run
duplicacy backup -stats
to do a backup. It will ask for the storage account access key and config password again. To avoid entering these every time, edit .duplicacy/preferences in the current folder and change the "keys" property to look like this:
"keys": {
"azure_key": "STORAGE ACCOUNT KEY HERE",
"password": "CONFIG PASSWORD HERE"
},
- Now you can run duplicacy backup or duplicacy restore from this machine without entering the passwords each time. Note that the passwords are stored here in plain text, so only do this if you trust that no one will compromise the machine.
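Two housekeeping steps from this section can be sketched in shell: creating an RSA key pair to use with the -key option, and tightening permissions on the preferences file that now holds plain-text secrets. The function names make_keypair and lock_preferences are made up for this example, and the 2048-bit key size and unencrypted private key are assumptions chosen for brevity. Keep private.pem somewhere safe off the machine, since restoring RSA-encrypted backups needs it.

```shell
#!/bin/sh
# Hypothetical helpers for step 2; names and defaults are illustrative only.

# make_keypair DIR
# Writes DIR/private.pem and DIR/public.pem (2048-bit RSA).
# The private key is left without a passphrase here for brevity; adding
# -aes256 to the genrsa call makes openssl prompt for one.
make_keypair() {
    dir=$1
    openssl genrsa -out "$dir/private.pem" 2048 2>/dev/null || return 1
    openssl rsa -in "$dir/private.pem" -pubout -out "$dir/public.pem" 2>/dev/null || return 1
    # Only the owner should be able to read the private key.
    chmod 600 "$dir/private.pem"
}

# lock_preferences REPO_DIR
# Restricts REPO_DIR/.duplicacy and its preferences file to the owner,
# because the file contains the storage key and config password.
lock_preferences() {
    repo=$1
    chmod 700 "$repo/.duplicacy" && chmod 600 "$repo/.duplicacy/preferences"
}
```

For example, make_keypair "$HOME/keys" followed by lock_preferences /path/to/folder, run after the init step has created the .duplicacy folder. This limits who can read the secrets but does not protect them from anyone with root access.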
3. Set up duplicacy-util
duplicacy-util is another executable that helps schedule backups.
- As before, grab the latest release of duplicacy-util, run chmod +x on it, and move it to /usr/local/bin.
- Make a .duplicacy-util folder somewhere, e.g. ~/.duplicacy-util. Go in there and make two files: duplicacy-util.yaml and repository_id.yaml, where repository_id is the name you used in step 2.2. This name doesn’t really matter, but it helps to keep things consistent.
- In duplicacy-util.yaml, enter the following:
notifications:
  onStart: []
  onSkip: ['email']
  onSuccess: ['email']
  onFailure: ['email']
email:
  fromAddress: "Duplicacy Backup <[email protected]>"
  toAddress: "Firstname Lastname <[email protected]>"
  serverHostname: smtp.sendgrid.net
  serverPort: 587
  authUsername: apikey
  authPassword: XXX
This assumes you’re using SendGrid; change the email config options as necessary.
- In repository_id.yaml, enter the following:
repository: /path/to/folder
storage:
  - name: default
    threads: 1
prune:
  - storage: default
    keep: "0:365 30:180 7:30 1:7"
    threads: 1
check:
  - storage: default
repository is the folder in which you initialized duplicacy in step 2.2. See the duplicacy-util docs for the other settings. Here’s what the current keep settings mean:
1:7    # Keep one revision per day for revisions older than 7 days
7:30   # Keep one revision every 7 days for revisions older than 30 days
30:180 # Keep one revision every 30 days for revisions older than 180 days
0:365  # Keep no revisions older than 365 days
If your machine has spare CPU threads, you can increase the thread count to improve performance.
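To make the keep rules concrete, here is a rough shell sketch that maps a revision's age in days to the rule that applies under "0:365 30:180 7:30 1:7". The function name retention_for_age is made up for this example, and the exact handling of boundary days (for instance a revision that is exactly 7 days old) is an assumption; duplicacy's own prune logic is the authority.

```shell
#!/bin/sh
# Hypothetical helper: describe how the policy "0:365 30:180 7:30 1:7"
# treats a revision of the given age (in whole days).
retention_for_age() {
    age=$1
    if [ "$age" -gt 365 ]; then
        echo "deleted"                          # 0:365
    elif [ "$age" -gt 180 ]; then
        echo "one revision kept per 30 days"    # 30:180
    elif [ "$age" -gt 30 ]; then
        echo "one revision kept per 7 days"     # 7:30
    elif [ "$age" -gt 7 ]; then
        echo "one revision kept per day"        # 1:7
    else
        echo "all revisions kept"               # newer than every rule
    fi
}

# Walk through a few sample ages.
for age in 3 10 45 200 400; do
    printf '%3d days: %s\n' "$age" "$(retention_for_age "$age")"
done
```

With one backup per day, the 1:7 rule removes nothing by itself; it only matters if you back up more than once a day.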
- Finally, edit your crontab to run duplicacy-util on a schedule. Run
sudo crontab -e
to edit the root crontab and add
0 5 * * * /usr/local/bin/duplicacy-util -sd /path/to/.duplicacy-util -f repository_id -a -m -q
This runs the job at 5 AM every day, performing backup, pruning, and validation of the data (-a) in quiet mode (-q), and sends an email (-m).
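If you want a different schedule, the first two crontab fields are the minute and the hour. The sketch below builds the crontab line for a given daily time so the field order is explicit. The function names is_between and build_cron_line are made up for this example, and the paths in the sample call are the same placeholders used in this guide.

```shell
#!/bin/sh
# Hypothetical helpers for composing the daily crontab entry.

# is_between VALUE MIN MAX: succeed if VALUE is an integer within [MIN, MAX].
is_between() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# build_cron_line MINUTE HOUR STORAGE_DIR CONFIG_NAME
# Prints a crontab line that runs duplicacy-util once a day at HOUR:MINUTE.
build_cron_line() {
    minute=$1
    hour=$2
    storage_dir=$3
    config=$4
    if ! is_between "$minute" 0 59 || ! is_between "$hour" 0 23; then
        echo "minute must be 0-59 and hour must be 0-23" >&2
        return 1
    fi
    # Crontab field order: minute hour day-of-month month day-of-week command
    printf '%s %s * * * /usr/local/bin/duplicacy-util -sd %s -f %s -a -m -q\n' \
        "$minute" "$hour" "$storage_dir" "$config"
}

# Same schedule as in this guide: every day at 05:00.
build_cron_line 0 5 /path/to/.duplicacy-util repository_id
```

Paste the printed line into the editor opened by sudo crontab -e. It is worth running the duplicacy-util command once by hand first, so any configuration problems show up before the first scheduled run.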