Backups, including redundant solutions, are increasingly important as we seek to keep our IT services up and running, explains Gary Henderson, ANME member.
CREDIT: This is an edited version of an article which appeared on the ANME website.
Backups keep IT services running for our own internal users and also for external users, clients and customers. This might involve taking backup copies of data to tape, having a redundant firewall or internet connection, or having a cloud-based service available to replicate on-premises services in the event of a disaster.
My concern, however, is that simply having these solutions in place can make us feel better, happy in the knowledge that we are more protected than if we didn’t have them. The issue is that this sense of additional protection can be false: just having a backup solution of one type or another doesn’t mean that it will work when things go wrong.
We also need to be cognisant of the fact that, when things do go wrong, the result is often one of stress and urgency as we seek to restore services while under pressure from users, business leaders and process owners – among others.
Robust programme of testing
We need to adopt a scientific mindset and test our backup solutions to make sure they work as intended. It is much better to test them to a timetabled plan than to have the first test of a solution be a full-blown, real-life incident in which failure of the system could result in difficulties for the organisation. We also need to bear in mind that a solution working on the day it was put in place, or even working today, doesn’t mean it will work in a week’s, a month’s or even a year’s time, when we truly need it!
We need a robust programme of testing for our backup solutions to ensure that they work, that we understand how they work and what the implications are, and that those who need to use them are comfortable doing so. Only by doing this can we be more confident that, when something does go wrong, we have a solution in place and are ready to put it to use.
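As a rough illustration of what a timetabled plan might look like in practice, the short Python sketch below flags any backup or redundancy test that is overdue. The system names, dates and intervals are invented for the example, and a real schedule would more likely live in a ticketing or monitoring tool than in a script.

```python
from datetime import date, timedelta

# Hypothetical timetable: system name -> (date of last test, required interval).
# Every entry here is made up purely for illustration.
TEST_PLAN = {
    "file-server backup": (date(2024, 1, 15), timedelta(days=30)),
    "redundant firewall failover": (date(2023, 9, 1), timedelta(days=90)),
}


def overdue_tests(plan, today):
    """Return the names of systems whose next scheduled test date has passed."""
    return sorted(
        name
        for name, (last_tested, interval) in plan.items()
        if last_tested + interval < today
    )


# Check the plan against an example date and list anything overdue.
overdue = overdue_tests(TEST_PLAN, today=date(2024, 2, 1))
for name in overdue:
    print(f"Test overdue: {name}")
```

Run on a regular basis, a check like this turns 'we should test more often' into a list of specific tests that have slipped.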
Case in point
The perfect example of the above, for me, was a recent test of our own backup solutions. These included a service that promised recovery to a redundant system within four hours, using data from regularly taken backups. Upon testing the solution, we found that the four-hour recovery period was exceeded, due to issues with the backup, and that the recovered data was three days old. We also found that there were implications for other systems when the test failure occurred.
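The two measurements in that test, how long recovery took and how old the recovered data was, are easy to record and compare against targets. The Python sketch below shows one possible way to do so. The four-hour recovery target and the three-day-old data come from the test described above; the six-hour recovery time and the 24-hour limit on data age are assumptions made up for the example, as are all the names used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreTestResult:
    """Timings recorded during one restore test (illustrative fields only)."""
    started: datetime       # when the restore began
    finished: datetime      # when the service was usable again
    backup_taken: datetime  # timestamp of the backup that was restored


def check_restore(result, max_recovery, max_data_age):
    """Return a list of missed targets; an empty list means both were met."""
    problems = []
    recovery_time = result.finished - result.started
    data_age = result.started - result.backup_taken
    if recovery_time > max_recovery:
        problems.append(f"recovery took {recovery_time}, target was {max_recovery}")
    if data_age > max_data_age:
        problems.append(f"restored data was {data_age} old, limit was {max_data_age}")
    return problems


# Example figures: the six-hour recovery and 24-hour data-age limit are assumed.
example = RestoreTestResult(
    started=datetime(2024, 1, 10, 9, 0),
    finished=datetime(2024, 1, 10, 15, 0),    # six hours later (assumed)
    backup_taken=datetime(2024, 1, 7, 9, 0),  # three days before the test
)
problems = check_restore(example, timedelta(hours=4), timedelta(hours=24))
for problem in problems:
    print(problem)
```

Keeping the results of each test in this form also makes it possible to see whether recovery times are drifting from one test to the next.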
It might be tempting to look on the above in a wholly negative fashion, focusing on why the solution didn’t work. However, I want to avoid this and intend to focus more on the positive side of things. We now, at least, know the solution didn’t perform as anticipated and we know more about the implications of the tested failure area – we are now more knowledgeable than we were before the test. We will, therefore, now work internally and with the backup solution vendor to arrive at solutions that better meet our needs and are – hopefully – more robust and reliable.
The moral of the story: nothing works until you test it to confirm. So, test your backup provision – and test it often!