Self-hosting lets us take back control over our data, but you might end up shooting yourself in the foot if you’re not careful. Things happen: hardware failures, house fires, floods, tornadoes, earthquakes, you name it. We don’t like to think about the bad stuff, but people who are serious about self-hosting should really try to think like sysadmins, and that means data redundancy should be one of your biggest concerns.
Storing data on a single device is risky. The simplest way to mitigate the risk of hardware failure is to buy an additional “backup” HDD or SSD, depending on the amount of data you need to back up and the amount of money in your wallet. A more advanced user can set up a local NAS and automate the routine with a simple script. Such a scheme protects you from the failure of a single storage device, but some events, such as a house fire or a tornado, can destroy both of your storage devices at once. That’s why we also need “location redundancy”.
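A minimal sketch of what such a “simple script” could look like, assuming the backup drive or NAS share is already mounted as a regular directory. The function name and the paths are made up for illustration, and the script only copies new or changed files; it never deletes anything from the backup.

```python
import shutil
from pathlib import Path


def mirror_new_files(source: Path, backup: Path) -> list[Path]:
    """Copy files that are missing or outdated in `backup`; return what was copied."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = backup / src.relative_to(source)
        # Skip files whose backup copy is at least as new and has the same size.
        if dst.exists():
            s, d = src.stat(), dst.stat()
            if d.st_mtime >= s.st_mtime and d.st_size == s.st_size:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own data directory and backup drive.
    for path in mirror_new_files(Path("/home/me/data"), Path("/mnt/backup/data")):
        print("copied", path)
```

Run it from cron or a systemd timer and the second drive stays reasonably fresh. Comparing size and modification time is a cheap heuristic, not an integrity check, so a tool like rsync is probably a better fit for anything serious.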
Storing your data in a separate location is not hard, but it comes with a different set of concerns. With a remote backup, you’d better have a good internet connection, and you also have to trust the remote side, at least to a certain degree.
So, here comes the need to minimize that trust. Of course, it’s better to choose a reliable storage provider. Most of them won’t lose your data, because they’re professionals and they know how to handle it, but they can snoop on it, or they can get hacked, and your data might end up in the wrong hands. That’s the real risk, and that’s why it’s important not to store your backups in plaintext: anything that leaves your house should be encrypted so that nobody except you can read it.
That’s quite a long introduction. I generally like to be verbose; it helps me to organize my thoughts. So, what we need are remote backups that the storage provider can’t read. Is there a tool for that?
Here comes duplicity. This tool can do incremental backups, and it encrypts the data before uploading it to the remote server. It supports many storage providers, including S3-compatible offerings such as this nice and shiny bunker 25 meters below Paris. I played with this tool for a few days, and I really like how it works.
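To give a rough idea of what a run looks like, here is a small sketch that only builds a duplicity command line and prints it. The helper name, the GPG key ID, the source path and the SFTP target are placeholders I made up; the `full` and `incremental` actions and the `--encrypt-key` option are taken from the duplicity man page, but the CLI has changed between versions, so check the one you have installed.

```python
import shlex


def duplicity_command(source: str, target_url: str, gpg_key: str, full: bool = False) -> list[str]:
    """Build (but do not run) a duplicity invocation for an encrypted backup."""
    # An incremental run needs an earlier full backup to build on.
    action = "full" if full else "incremental"
    # --encrypt-key selects the GPG public key the archives are encrypted to.
    return ["duplicity", action, "--encrypt-key", gpg_key, source, target_url]


if __name__ == "__main__":
    # Placeholder values: swap in your own directory, key ID and target URL.
    cmd = duplicity_command(
        "/home/me/data",
        "sftp://backup@backup.example.com/data",
        "ABCD1234",
    )
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would actually start the backup, provided duplicity is installed and the key is in your GPG keyring. The very first run has to use `full=True`.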