Backups on Linux

I started taking backups seriously when I worked for Colorado Memory Systems (later HP Memory Systems) back in the late Pliocene. Colorado sold QIC tape drives and the software to support them. The software ran on MS-DOS and several 80386-based Unixes. I have taken backups seriously ever since, which has saved my data more than once.

Occasionally the question of backups comes up. "I'd like to start making backups. What do you recommend?"

The answer depends on a lot of factors, including:

  • What do you want to back up?

  • How often?

  • How long do you want to keep your data?

  • What is your main concern?

    • Drive reliability

    • Recovering from errors

    • Business continuity

    Which is to say, what happens if something really nasty hits your business? This verges into disaster recovery. You might seek advice from your accountant and your attorney.

    • Convenience

    • Something completely different

At the moment, I use six different backup programs, for different purposes.

And I also use the git version control system, which is a sort of a backup environment.

What do you want to back up?

Some stuff isn't worth the disk space to back up because you can regenerate it or re-install it just as easily. Caches, such as a web proxy's. Executables you can re-install, like your office suite.

It is absolutely essential to back up whatever you generate that no one else has copies of and that you don't want to be without. For example: document source, software source, your company's tax and payroll data, the family photograph collection. And something that isn't all that obvious: configuration files. Restoring configuration files is much easier than trying to rebuild them from scratch.
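The split between "skip it" and "must keep it" can be written down as a pair of pattern lists. What follows is only a minimal sketch in Python: the patterns and the function name worth_backing_up are hypothetical examples, not a recommendation for any particular system, and real backup programs have their own include and exclude syntax.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration only; adjust to your own system.
SKIP_PATTERNS = [
    "*/.cache/*",    # caches that can be regenerated
    "/var/cache/*",
    "/usr/*",        # executables that can be re-installed
]
KEEP_PATTERNS = [
    "/etc/*",        # configuration files
    "/home/*",       # documents, source code, photographs
]


def worth_backing_up(path: str) -> bool:
    """Return True if path matches a keep pattern and no skip pattern."""
    # Skip patterns win, so a cache under /home is still left out.
    if any(fnmatchcase(path, pattern) for pattern in SKIP_PATTERNS):
        return False
    return any(fnmatchcase(path, pattern) for pattern in KEEP_PATTERNS)
```

Checking the skip list first is a deliberate choice here: a cache directory inside a home directory is still a cache.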

And that leads to a whole art within backup planning: bare metal recovery. I've written on that elsewhere. It is essential for good business continuity planning.

When considering business continuity, one question is, how long can I afford to be without my data? In the case of payroll data, most countries set hard limits on how long you can go without paying your employees. Payroll has to be up and running quickly after a disaster.

How often do you want to back it up?

The answer to this depends on the type of data under discussion.

  • Some data can be ignored entirely.

  • Some data should be backed up daily, except possibly on weekends. This group includes anything important enough to copy to your off-site backup. I use Amanda for this.

  • Every few hours I take a snapshot of certain key data, and save that locally, either on the same hard drive or on a local server. This is aimed at easy recovery from mistakes, like accidentally deleting a file. It is useless for disaster recovery because it's on the same hard drive or another machine nearby. I use rsnapshot here.

  • Near-instantaneous backups. This category is data synchronized between two computers used for the same purposes. For example, syncing your working files between your desktop and your laptop. I use syncthing for this.

  • At-will backups. These are unscheduled backups. The best example of this is backing up my laptop to my desktop while I am on the road. As time and available networking permit, I manually run a script that calls unison to back up my projects and emails over ssh from the laptop to a desktop.

    Another at-will backup is a roughly weekly script, based on rsync, that copies key project data and configuration data from the laptop to a USB stick.

  • Off-site backups. These are rsync copies of the Amanda backups and key Amanda metadata to several removable USB hard drives. I swap those to my off-site location weekly, and take one with me when I travel.
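For the rsync-based copies to a USB stick or a removable drive, a small wrapper is usually all it takes. This is a sketch under stated assumptions, not the script described above: the function names, the paths in the usage note, and the choice of --delete (which mirrors deletions to the destination) are all illustrative. It defaults to a dry run so nothing is changed until you ask for it.

```python
import subprocess


def rsync_command(source, dest, excludes=(), dry_run=True):
    """Build an rsync argument list that mirrors source into dest."""
    # -a preserves permissions, times and symlinks; --delete removes files
    # from dest that no longer exist in source (an assumption of this sketch).
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change, copy nothing
    for pattern in excludes:
        cmd.append(f"--exclude={pattern}")
    # A trailing slash on the source copies its contents, not the directory.
    cmd += [source.rstrip("/") + "/", dest]
    return cmd


def run_backup(source, dest, excludes=(), dry_run=True):
    """Run rsync and raise CalledProcessError if it exits non-zero."""
    subprocess.run(rsync_command(source, dest, excludes, dry_run), check=True)
```

A call such as run_backup("/home/me/projects", "/media/usb/projects", excludes=[".cache"]) shows what would be copied; pass dry_run=False once the output looks right. Both paths are hypothetical.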

What is your main concern?

These are not mutually exclusive.

  • Drive reliability

    Both hard drives and solid-state drives fail, as does everything else. To prepare for this, I use Amanda. The hard drive that holds Amanda's virtual tapes can also die, so I keep my off-site backups current.

  • Recovering from errors

    I use rsnapshot here. Syncthing and unison also help here.

  • Business continuity

    I use Amanda, followed by an rsync-based script that copies to my off-site backup drives. I also use the laptop's USB stick and my bare metal scripts for this.

  • Convenience

    Having a syncthing process running in the background to continuously sync between my laptop and my desktop helps enormously when moving between the two.

    Obviously I only switch between the two when I am at home. I don't need the instantaneous synchronization while I am on the road. So I shut syncthing down when I am on the road, and use unison to back up manually to the desktop.

How long do you want to keep your data?

This varies greatly. Most data need be kept for only a few months. The major exception to this is business data. Depending on local business rules and regulations, you may want an annual snapshot of business data to keep for at least seven years. When I was in business, I did this by swapping out my hard drive early in January. Now I simply replace one of my off-site drives each January (with a larger one). I am contemplating an annual snapshot from the rsnapshot archive of 1 January to Blu-ray.
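A retention rule like this is easy to state in code. The sketch below assumes "a few months" means three and treats the 1 January snapshot as the annual one; the function name keep_snapshot and both defaults are illustrative, and your local rules may call for different numbers.

```python
from datetime import date


def keep_snapshot(taken: date, today: date, months: int = 3, years: int = 7) -> bool:
    """Decide whether a snapshot taken on a given date should still be kept."""
    age_days = (today - taken).days
    # Ordinary snapshots: keep for a few months (rounded up to 31-day months).
    if age_days <= months * 31:
        return True
    # Annual snapshots (1 January): keep for the longer period.
    is_annual = taken.month == 1 and taken.day == 1
    return is_annual and age_days <= years * 366
```

Rounding months up to 31 days and years up to 366 days errs on the side of keeping data a little longer than strictly needed.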
