The Difference Between Data Backup and Disaster Recovery – An Overview
Multiple types of data backup exist, from the traditional approach using tapes, to plugging a hard drive into your server, to backing up to the cloud. A data backup duplicates your data so that a copy is available if the data you hold on site is corrupted or infected; in short, no longer accessible. Today, data backup is also commonly used to retrieve files or folders that users have deleted or changed in error.
Disaster recovery is quite different from backup: it is the exercise of ensuring your organisation is able to function in the event a disaster befalls it. That disaster could be a force majeure event or something more prevalent today, such as the loss of infrastructure due to a malicious cyber-attack. To recover from a disaster, you need a fully documented plan with the restoration of your IT services at the centre of it. This plan will include a full asset list, with each asset categorised and the impact that losing it will have on the business detailed. The key part of the plan describes how those assets are recovered and how long it takes to recover them, whilst taking budgetary constraints into account.
RTO and RPO Explained – Why do we need to think about it?
RTO: Recovery Time Objective (also written as Restore Time Objective) is, as its name suggests, the length of time within which your IT systems need to be restored to a fully operational state. Understanding the impact that the loss of each asset has on your business will determine its RTO: the RTO can be lengthy for non-revenue generating applications, but may need to be close to real time for revenue generating applications. There is no single RTO that suits every business, and each business may set different objectives for different systems.
RPO: Recovery Point Objective (also written as Restore Point Objective) defines the point in time to which your business needs to be restored in order for it to continue to function effectively following recovery from a disaster. Many SMEs back up their data overnight, which means the RPO can only ever be the night before, when the backup was taken. If your disaster occurs at 4:30pm, you may have lost a whole day’s work. Can you recover from this? A truly real-time backup of all your data is rarely achievable, and the closer to the moment of the disaster you need to restore, the more your backup strategy will cost. If you need an RPO of 30 minutes, you will need a backup strategy that images your information almost as you add and edit it.
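To make the overnight backup example concrete, the short Python sketch below works out how much work would be lost if a disaster struck at 4:30pm. The 11:00pm backup time, the dates and the function name `data_loss_window` are assumptions chosen purely for illustration; they do not come from any particular backup product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, disaster: datetime) -> timedelta:
    """Return the time between the last good backup and the disaster."""
    if disaster < last_backup:
        raise ValueError("disaster time must not be earlier than the last backup")
    return disaster - last_backup


# Assumed example: the backup finished at 11:00pm and the disaster
# struck at 4:30pm the following day.
last_backup = datetime(2024, 3, 4, 23, 0)
disaster = datetime(2024, 3, 5, 16, 30)

lost = data_loss_window(last_backup, disaster)
print(f"Work lost: {lost}")

# An RPO is met only if the loss window is no longer than the objective.
rpo = timedelta(minutes=30)
print(f"Meets a 30 minute RPO: {lost <= rpo}")
```

With these assumed times the loss window is 17 hours 30 minutes, which covers the whole working day and is far outside a 30 minute RPO.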
Common Assets that May Need Protection – What do I need to protect?
In this section we will look at some of the key assets an organisation may wish to include in their disaster recovery plan. By listing the assets even from a general perspective, it is easy to start to get an idea of how assets could be categorised and whether they are important to the ongoing functioning of your business.
Phone System: Phone systems today are very different from those of the past. VoIP systems are often cloud based, and even if your office phone system is fully disabled, most people have mobile phones. With softphones, as long as you have internet access and a PC or app, your phone can function as though you were using a physical handset in the office. The criticality of your phone system is dictated by your business, and it is often a key component of your business continuity plan.
Email: Let’s face it, not having email would bring most businesses to a standstill. How your email is set up becomes a critical component of your disaster recovery plan. Much like cloud-based phone systems, those on Office 365 just need an internet connection, but if you still have an onsite email solution then email is likely to be high on the list of assets that need to be restored in almost real time should a disaster occur.
Company Website: The type of business you operate dictates the criticality of your company website. If you are an online retail organisation, then your website is your revenue generating engine and without it your organisation will lose revenue and market share. For an IT support company, for example, the website is a way of sharing information about the disaster, but it is still only an information sharing platform. Most IT services companies can function without a website although you may wish to include a website as part of your communications plan.
Line of Business Applications: These are commonly referred to as revenue generating applications, and without them up and running your business cannot function. Manufacturing companies may have logic control systems and ERP systems; professional services companies may have practice management systems. By listing these assets, you can clearly define what is most important to your business and what needs to get the most attention following an incident.
Accounting Applications: For most organisations, accounting applications are support applications rather than revenue generating ones. Granted, if payroll can’t function and payables and receivables cannot be processed, there may be an opportunity cost from not being able to process them in a timely fashion. However, with a good communications plan, you should be able to function without your accounting system for a period of time.
Files and Documents: The first question to ask about files and documents is where they are stored. Are they in the cloud in an application such as OneDrive, or do they still reside on a physical server in the office or a data centre? Cloud based files and folders will just need an internet connection, but office or data centre-based services will be intrinsically linked to your data backup strategy. How your users work also dictates the criticality of these services. For example, if users download, amend and then upload documents, they may unwittingly be adding to your disaster recovery woes, because the backed-up version of the document is not the most recent. The most recent version is on the user’s workstation, which has just been destroyed in a cyber-attack.
Our Service – Why choose Nimoveri over another company?
At Nimoveri, we believe there are ten steps that go into defining a disaster recovery plan, many of which have been loosely identified in the definitions above. These ten steps will allow you to create a detailed, successful disaster recovery plan that clearly identifies what you will do in the event of a disaster. We don’t skip any steps. Instead, we follow a multi-stage approach that starts with a plan quickly documenting all ten points. Then, over time, each point is revisited and fleshed out in more detail as you progressively build a comprehensive disaster recovery plan. We believe that having a draft, loosely sketched out ten step plan is considerably more effective than having no plan at all.
- Map critical business processes: Create a high-level business process map that looks at all the functions of the business labelling them as support or revenue generating. Once categorised you can assign a criticality to each. For example, procurement may be a supporting function but is of a reasonably high priority, unlike HR which is a supporting function but may be low priority. Sales is a revenue generating function and must be high priority but is it a higher priority than manufacturing that is also revenue generating? Nimoveri have created a Disaster Recovery Process Checklist that will help you to define what is critical and what isn’t.
- Define restore times: Step 1 has created a list for you, prioritised for your business and now you need to define a maximum allowable downtime for each of the business processes. This is where you need to start to think about budget as an RTO of five minutes is going to have a more significant cost implication than an RTO of five days.
- List the applications: This should be done in order of Recovery Time Objective (RTO), starting with the shortest.
- Document your data recovery strategy: This is closely linked to your backup strategy, and it is important to understand how you are going to retrieve the data required to get the business back up and running again. Ensure you liaise closely with whoever manages your backup so that you understand how long restoration takes. Different solutions and approaches have different restoration times.
- Conduct a Business Impact Analysis: Taking each process individually, it is now possible to conduct a BIA and to measure the impact of downtime on the affected areas of the business. Be honest: it is too easy when conducting a BIA to assume that everything is critical. Your business impact analysis should be an honest look at how available each process needs to be and what it costs the business when that process cannot be performed. Ensure any legal/compliance requirements are included in your business impact analysis.
- Define Recovery Point Objectives (RPO): With the above information in hand, you are now in a position to define how far back in time you can afford your business to go when restoring. This is your Recovery Point Objective. Can you afford to lose 8 hours of a revenue generating application? Does the RPO need to be only a few minutes ago, or can it be longer? Define an RPO for each of the business processes, critical and non-critical.
- Understand Recovery Time Objective (RTO): For a server rebuild following malware intrusion you could be looking at 2 days to rebuild and redeploy servers and applications. Is this acceptable? Setting a recovery time objective for the critical parts of the business will allow you to estimate how long it takes to get the business back up and running again.
- Mean Time to Restore (MTTR): Closely linked to the above, this is a measure of how long it actually takes to restore a system once it is down, and it is critical because prolonged downtime can begin to threaten the ongoing existence of the business. Let’s say you have an RPO of yesterday and an RTO of 3 hours: you still need to understand how long it is going to take to restore, as this could blow your RTO out of the water. You might not be able to identify MTTR for all possible scenarios, but it is good to get a rough idea for each of your processes.
- Assess Risks: Create a risk assessment chart documenting the points of failure in your process. Some examples may be:
a. Data loss
b. Connectivity loss
c. No access to key applications
d. Unable to connect to SaaS applications
e. No inbound phone calls
f. No outbound phone calls
g. Email service impacted
- Redesign, evaluate and implement new systems, processes and practices: Ensure you have walked through each scenario in detail and properly understand the impact on the business of each system or process not being available. The output of this part of the process will be a step by step guide that details the procedures and processes you need to step through in order to get your business back up and running again.
As mentioned above, it is better to have a draft, high level set of procedures in place than no procedures at all so your first requirement is to quickly run through the ten steps. At that point you can then refine and add detail to your plan.
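A first draft of the plan can be as simple as a small table of processes. The Python sketch below is one hypothetical way to hold that table, list processes in restore order (shortest RTO first) and flag any process whose estimated restore time would exceed its RTO. The process names, the figures and the helper names `restore_order` and `at_risk` are all assumptions for illustration, not part of the Nimoveri checklist; the 48 hour estimate simply mirrors the two day server rebuild mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    category: str             # "revenue" or "support"
    rto_hours: float          # maximum allowable downtime
    rpo_hours: float          # maximum tolerable data loss
    est_restore_hours: float  # rough estimate of how long restoration takes


# Hypothetical entries; replace with the output of your own process map.
plan = [
    Process("Accounting", "support", rto_hours=72, rpo_hours=24, est_restore_hours=8),
    Process("Sales order system", "revenue", rto_hours=3, rpo_hours=0.5, est_restore_hours=48),
    Process("HR", "support", rto_hours=120, rpo_hours=24, est_restore_hours=8),
    Process("Email", "support", rto_hours=4, rpo_hours=1, est_restore_hours=2),
]


def restore_order(processes):
    """Return the processes sorted so the shortest RTO comes first."""
    return sorted(processes, key=lambda p: p.rto_hours)


def at_risk(processes):
    """Return processes whose estimated restore time exceeds their RTO."""
    return [p for p in processes if p.est_restore_hours > p.rto_hours]


for p in restore_order(plan):
    print(f"{p.name} ({p.category}): RTO {p.rto_hours}h, "
          f"RPO {p.rpo_hours}h, estimated restore {p.est_restore_hours}h")

for p in at_risk(plan):
    print(f"WARNING: {p.name} is unlikely to meet its RTO")
```

Even a rough table like this makes the gaps visible early: any process that appears in the warning list needs either a faster recovery approach or a more realistic RTO.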