31 March has been chosen as the date of World Backup Day. The choice isn’t random: it plays on the meaning we give to the following day, 1 April. The day serves as a reminder that April Fools’ Day can happen on any day of the year, and that we should be prepared. In this post we will see what a backup is and why we need one. We will also look at how to make a good backup in order to secure our data.

What is a backup?

A backup is a second copy of our important files. Of course, no single definition of “important” fits everyone, so take the following list as an example and draw up your own list of important things.

  • Photos from last holiday
  • The draft of your thesis
  • Emails you still need to keep
  • Work documents
  • Your list of contacts

So a backup is just a copy? Not quite. First of all, this copy must be made on a different device from the one that contains the original data. If you copy the photos of your visit to Pompeii from the folder visit-pompeii to the folder visit-pompeii-copy on the same disk, you have two copies of the photos, but all you have done is take up extra disk space. If your PC breaks down, you won’t be able to access either copy.

Another feature that distinguishes a backup from a copy is that we attach a time label (a timestamp) to the former. So when we back up the photos taken in Pompeii, we will end up with something like visit-pompeii-backup-2020-09, visit-pompeii-backup-2020-10, visit-pompeii-backup-2020-11, etc., so that we can restore our photos as they were at a precise point in time.
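
As a rough illustration of the two properties above (a different device and a time label), here is a minimal Python sketch. The function names backup_name and make_backup are made up for this example, and it simply assumes that backup_root points to a folder on another device, such as an external disk.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_name(source_name: str, when: date) -> str:
    """Build a name such as 'visit-pompeii-backup-2020-09' (year and month)."""
    return f"{source_name}-backup-{when:%Y-%m}"


def make_backup(source: Path, backup_root: Path, when: date) -> Path:
    """Copy the folder `source` into `backup_root` under a time-labelled name.

    Assumption: `backup_root` lives on a different device from `source`.
    """
    target = backup_root / backup_name(source.name, when)
    # copytree raises an error if the target already exists,
    # so an older backup is never silently overwritten.
    shutil.copytree(source, target)
    return target
```

Calling make_backup(Path("visit-pompeii"), Path("/mnt/external"), date.today()) once a month would produce the sequence of folders described above; the path /mnt/external is only a placeholder.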

Why should I make a backup?

We should make backups because sometimes things go wrong, whether we want them to or not. We have to live with that fact and be prepared. Don’t you think? Here are some examples of how things can go wrong, and of why the absence of a backup makes the consequences worse.

  • The media that house our data can break or catch fire.
  • Losing a notebook is never a good experience, but if it contains the only copy of your dissertation it becomes a real drama.
  • Very often data is an intangible asset that has a higher value than tangible assets.

I could continue the list with many more examples, but I am sure I have already convinced you.

But isn’t sync with the cloud enough?

Syncing data with the cloud is certainly a good thing as far as data preservation is concerned [1], but it cannot be considered a real backup. A typical use of cloud sync is to replicate the changes made to the data on one device A to another device B, leaving a further copy in the cloud itself. So, going by what we have said so far, we are indeed making a copy on another device, but we are not associating any time label with that copy.

What would happen if, after a month, we realised we had deleted a file by mistake? Would it still be possible to retrieve it through the cloud? What if we only realised after three months?

Depending on the specific cloud service, it may or may not be possible to recover old versions of a file. The problem is that we don’t decide whether or not to keep the old versions, nor how long to keep them. For this reason a sync service is certainly better than nothing, but it cannot replace a backup.

How often should I make a backup?

Before answering this question, I would like to clarify one aspect: completely preventing data loss is impossible. Our aim is to reduce the probability of data loss as far as possible and, in the event of loss, to minimise the amount of data lost. Having said that, let’s move on to the answer: it depends.

There is no such thing as a one-size-fits-all backup strategy. Everyone has their own needs and will therefore need to perform backups differently. As a rule of thumb, the more often the data changes and the more important it is, the more often it makes sense to back it up. What “often” actually means depends on you: for some it might be every week, for others every hour.

Another parameter that depends very much on personal needs is how long to keep the backups. Again, a blanket recommendation such as “keep every backup for six months” doesn’t make much sense. Depending on the type of data and our needs, those six months could be either a very long or a very short interval. A useful guideline here is to keep backups densely spaced for the recent past and more sparsely spaced the further back we go. In other words, it might make sense to keep one backup for each of the last seven days, one for each of the last four weeks, and one for each of the last six months.
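
To make the guideline concrete, here is a small Python sketch of such a thinning policy. It is a simplified illustration, not the algorithm of any particular backup program: the function name select_backups_to_keep is invented for this post, and “the last seven days” is interpreted as the seven most recent days that actually contain a backup (likewise for weeks and months).

```python
from datetime import date


def select_backups_to_keep(backup_dates, daily=7, weekly=4, monthly=6):
    """Return the set of backup dates to keep under a thinning policy.

    For each rule, the newest backup in each of the most recent N periods
    that contain a backup (days, ISO weeks, calendar months) is kept.
    """
    rules = [
        (daily, lambda d: d.isoformat()),
        (weekly, lambda d: tuple(d.isocalendar()[:2])),  # (ISO year, ISO week)
        (monthly, lambda d: (d.year, d.month)),
    ]
    newest_first = sorted(set(backup_dates), reverse=True)
    keep = set()
    for count, period_of in rules:
        seen_periods = []
        for d in newest_first:
            period = period_of(d)
            if period in seen_periods:
                continue  # a newer backup already represents this period
            if len(seen_periods) == count:
                break  # this rule already covers enough periods
            seen_periods.append(period)
            keep.add(d)
    return keep
```

Every backup whose date is not in the returned set is a candidate for deletion. Because the rules overlap (the newest daily backup is also the newest of its week and month), the number of backups kept is usually smaller than 7 + 4 + 6.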

Sounds very complicated. What if I forget?

Sooner or later you are bound to forget to make a backup, and you’re right: it is easy to get lost when managing this process by hand. Luckily we have computers and can rely on them. There are many good and, very often, free programs (like Borg) that automate both the backup itself and the deletion of old backups. Once these programs are set up, we can forget about their existence and sleep soundly.
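
As an example of what such automation might look like, the sketch below builds (without running them) the two Borg commands for one backup cycle. It assumes the Borg 1.x command-line syntax (newer major versions use a different syntax), an already initialised repository, and a repository dedicated to this one data set. The function name borg_commands and the retention numbers are choices made for this example.

```python
def borg_commands(repo: str, source: str):
    """Build the Borg 1.x commands for one backup cycle (nothing is executed).

    Assumptions: `repo` is an already initialised Borg repository and
    contains only the archives of this backup job.
    """
    # {hostname} and {now} are placeholders expanded by Borg itself,
    # so every archive gets a unique, time-labelled name.
    create = ["borg", "create", f"{repo}::{{hostname}}-{{now}}", source]
    # Keep 7 daily, 4 weekly and 6 monthly archives; delete the rest.
    prune = [
        "borg", "prune",
        "--keep-daily", "7",
        "--keep-weekly", "4",
        "--keep-monthly", "6",
        repo,
    ]
    return create, prune
```

Each list can be passed to subprocess.run(command, check=True), and the script itself can be started on a schedule by the operating system (for instance with cron), so that nobody has to remember anything.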

Is there anything else I should know?

Another useful notion is that of the 3-2-1 rule. This backup rule is based on three principles:

  • have at least three copies of your data
  • keep the data on at least two types of media
  • have at least one copy of the data off-site.

The first principle tells us that in addition to the original data we must have at least two other copies. If each copy lives on its own device, three different devices must fail at the same time for us to lose everything. The second principle tells us that we must copy our data onto at least two media based on different technologies. This is because similar devices tend to have a similar lifespan and therefore tend to break down at around the same time. The last principle is to keep at least one copy of our data in a different place from the other two, for example at your parents’ or grandparents’ house, or in the cloud.
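
The three principles can also be written down as a simple check. The sketch below is only an illustration: the class DataCopy, its fields and the function satisfies_3_2_1 are invented for this post, and the original data counts as one of the copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    device: str    # where the copy lives, e.g. "laptop" or "usb-disk"
    media: str     # underlying technology, e.g. "ssd", "hdd", "cloud"
    offsite: bool  # True if the copy is kept in a different place


def satisfies_3_2_1(copies) -> bool:
    """Check the 3-2-1 rule for a list of copies (original included)."""
    devices = {c.device for c in copies}
    media_types = {c.media for c in copies}
    return (
        len(devices) >= 3                   # three copies on distinct devices
        and len(media_types) >= 2           # at least two types of media
        and any(c.offsite for c in copies)  # at least one copy off-site
    )
```

For instance, a laptop SSD, an external hard disk at home and a copy in the cloud satisfy the rule, while the same laptop with two USB disks in the same drawer does not, because nothing is kept off-site.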

Another important precaution, especially for off-site backups, is to encrypt your backups. You never know.

Finally, it is important to check from time to time that your backups are working properly. When you have the time, try to simulate the failure of your hard disk and try to recover the lost data through a backup. Were you successful? Great, that means your backup works!
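
One way to answer the “were you successful?” question objectively is to compare checksums of the original files with those of the restored ones. The following Python sketch assumes you have restored the backup into a separate folder. The function names are made up for this example, and each file is read fully into memory, which is fine for photos and documents but not ideal for very large files.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict:
    """Map each file path (relative to `root`) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def restore_matches(original: Path, restored: Path) -> bool:
    """Return True if both folders contain the same files with the same content."""
    return file_digests(original) == file_digests(restored)
```

If restore_matches returns False, comparing the two dictionaries returned by file_digests shows which files are missing or different.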


  1. Here we would open a huge parenthesis regarding privacy, but this is not the right time to open it.