Over 90% of businesses that suffer a critical data loss event will be out of business within a year. This statistic has been repeated so often that it’s become a cliché within the computing world. But even though we all know better, few companies actually take the time to test their backups on a regular basis.
Why is this? Could it be that they don’t feel a sense of urgency? Or could it be that people simply aren’t aware of the many ways backups can fail? To clear up any confusion, I want to highlight some of the key reasons you should test your backups at least once per year.
Physical failure of backups is probably the first thing that most of us think of when it comes to backup tests. Any physical device – whether tape, solid state or disk – has a chance of breaking or failing. And ALL physical storage devices will eventually break, given a long enough time span.
Within this category, you should also consider the possibility that your backups were never successful in the first place. For example, we’ve seen many small businesses that have been backing up their own servers for years… only to suddenly discover that they’ve never actually backed up their databases.
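One way to catch a backup that never really worked is to restore it to a scratch location and compare the result with the live data. The sketch below is only an illustration under simple assumptions: both copies are ordinary directories, the source has not changed since the backup was taken, and the names `sha256_of` and `verify_restore` are made up for this example. A database would need its own check, such as loading the restored dump and querying it.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> dict:
    """Compare every file under source_dir with its restored copy.

    Returns the relative paths that are missing from the restore and
    the paths whose contents differ from the source.
    """
    missing, mismatched = [], []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file():
            missing.append(rel.as_posix())
        elif sha256_of(src) != sha256_of(restored):
            mismatched.append(rel.as_posix())
    return {"missing": missing, "mismatched": mismatched}
```

An empty result for both lists suggests the restore matches the source; anything in `missing` is data the backup job never captured.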
One final point you should consider is how you would react if the facility storing your backups were destroyed. This could be the destruction of a remote storage facility, or even the destruction of your main site. (Many companies keep their backups in a fire safe within the same building as their servers. This is a BAD IDEA, because any disaster that destroys the building can take the servers and the backups with it.)
Technology changes incredibly fast, and the rate of this change is also accelerating. If you don’t test and revise your backup process on a regular basis, you’re working on the assumption that your technology infrastructure will never change.
Within the datacenter, new servers are constantly being added, modified and removed. And thanks to virtualization, it’s now incredibly easy to install and provision new servers within minutes. Forgetting to back up one of these machines could be a serious disaster in the making.
Meanwhile, new laptops, desktops, tablets and mobile devices are constantly being added to the IT infrastructure, and these computers are constantly having new software installed. Forgetting to back up one of these machines, or neglecting to back up an important folder, can have serious consequences.
Finally, the network itself is changing. Thanks to telecommuting and cloud computing, users and machines are constantly moving in and out of private networks. Our idea of what constitutes an “internal IT infrastructure” is changing, and our backup processes need to keep up.
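Part of a backup test can be a simple coverage check: compare the list of machines you own with the list of machines your backup system has actually seen. The sketch below assumes you can export both lists yourself; the function name, the host names in the usage note and the seven-day threshold are all hypothetical.

```python
from datetime import datetime, timedelta


def find_coverage_gaps(inventory, last_backup, now, max_age=timedelta(days=7)):
    """Return (never_backed_up, stale) host lists.

    inventory   -- iterable of host names that should be protected
    last_backup -- dict mapping host name to the time of its last good backup
    now         -- the time the check is run
    max_age     -- oldest acceptable backup age (assumed threshold)
    """
    never = sorted(host for host in inventory if host not in last_backup)
    stale = sorted(
        host for host in inventory
        if host in last_backup and now - last_backup[host] > max_age
    )
    return never, stale
```

For example, with an inventory of `["web01", "db01", "laptop-42"]` and backup records for only the first two, `laptop-42` would be reported as never backed up.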
There are a number of costs associated with data protection:
- The electronic discovery costs associated with locating files.
- The storage costs associated with growing data.
- The costs of maintaining secondary emergency datacenters.
- The costs associated with downtime.
- Maintenance and licensing costs.
- The costs of shipping and storing backup tapes off-site.
If you’re going to be paying for backups anyway, you might as well make sure you’re doing it in the most cost-effective way possible. For example, you can reduce overall storage by using compression and deduplication, and by placing low-value data on cheaper storage.
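To get a feel for how much compression and deduplication can save, here is a rough sketch that estimates stored size when identical chunks are kept only once and each unique chunk is compressed. It assumes fixed-size chunks and `zlib` purely for illustration; commercial backup products typically use their own chunking and compression schemes, and the function name is invented for this example.

```python
import hashlib
import zlib


def deduplicated_size(blobs, chunk_size=4096):
    """Estimate bytes stored after chunk-level deduplication and compression.

    Each blob is split into fixed-size chunks. A chunk is stored only the
    first time its SHA-256 hash is seen, and is compressed with zlib.
    """
    stored = {}  # chunk hash -> compressed size in bytes
    for blob in blobs:
        for offset in range(0, len(blob), chunk_size):
            chunk = blob[offset:offset + chunk_size]
            key = hashlib.sha256(chunk).hexdigest()
            if key not in stored:
                stored[key] = len(zlib.compress(chunk))
    return sum(stored.values())
```

Backing up the same file twice should cost no more than backing it up once, which is exactly what deduplication is meant to achieve.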
Also, you can assign priorities to systems in order to make smarter decisions about their backup processes. Highly critical systems – such as databases – should be replicated to minimize the possibility of downtime, while low-priority systems can be backed up using less costly processes that provide slower recovery times.
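Writing the priorities down in a form a script can read makes them easier to review at each backup test. The following is a minimal sketch, not a recommendation: the tier names, protection methods and recovery targets are hypothetical placeholders, and treating unclassified systems as critical is just one cautious default.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    method: str                 # how the system is protected
    recovery_target_hours: int  # how quickly it should be back online


# Hypothetical tiers and targets; real values depend on the business.
POLICIES = {
    "critical": ProtectionPolicy("replication", 1),
    "standard": ProtectionPolicy("nightly backup to disk", 24),
    "low": ProtectionPolicy("weekly backup to low-cost storage", 72),
}


def policy_for(system, priorities):
    """Look up a system's policy, treating unclassified systems as critical."""
    return POLICIES[priorities.get(system, "critical")]
```

Defaulting unknown systems to the strictest tier means a newly added server is over-protected rather than forgotten until someone classifies it.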
A well-organized backup process is one that ensures that the most important data can be recovered very quickly, with the least amount of data loss. A disaster is no time for improvisation. You want to make sure that all of your backup and recovery procedures are organized and well-practiced.
Let’s suppose your IT manager decided to quit tomorrow. Would you know how to recover your backups in an emergency? Human knowledge can be an important single point of failure that needs to be addressed.
All backup and recovery procedures should be documented and practiced. There should always be at least 2 people who are experts at the emergency recovery of critical systems. Also, you should keep in mind that not all recovery scenarios are the same:
- What if you had to find just one file amongst several terabytes?
- What if you had to recover a file that was generated by a program that is no longer sold?
- What if you had to recover a single server?
- What if you had to recover an entire datacenter?
- What if you had to recover an entire Exchange mailbox?
- What if you had to recover data that had been lost 2 months ago?
There are many possible scenarios, and new ones will pop up with every backup test. You need to be ready for all of them, and each one should be well documented.
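Some of these scenarios can be checked mechanically. Take the case of data lost two months ago: you can only recover it if you still hold a backup taken before the loss. The sketch below assumes you can list the dates of your retained restore points; the function name is hypothetical.

```python
from datetime import date


def restore_point_for(restore_points, lost_on):
    """Return the newest restore point taken before the loss date.

    Returns None when every retained backup was taken on or after the
    loss, which means the data can no longer be recovered.
    """
    candidates = [point for point in restore_points if point < lost_on]
    return max(candidates, default=None)
```

If this returns `None` for a loss date two months back, your retention period is too short for that scenario.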