Disaster Recovery: It's More Than a Plan - It's a Process
Apr 16, 2011 5:00 AM PT
Disaster recovery is a term often used in Information Technology (IT) circles to describe the necessity for backup technology systems to safeguard an organization's data.
While this type of safeguard is absolutely necessary to protect valuable data and to reduce the time your organization needs to recover from an incident, a true disaster plan goes far beyond backup servers and drives.
For example, when disaster strikes, what plan does your organization have in place to communicate, both internally and externally? How will your organization ensure it can get your personnel to your hot site? An IT disaster recovery plan is only a small part of an overall business continuity plan (BCP), continuity of operations plan (COOP), or comprehensive emergency management plan (CEMP).
The most important aspects of effective disaster recovery are planning and training, both of which need to be done far ahead of the event. The planning process is more important than the plan itself.
Key Considerations in Disaster Planning
Every organization, at some point in time, will face a disaster, whether it's a power outage, data center meltdown or a major hurricane.
In order to efficiently respond, there are some key processes that must be implemented throughout an organization.
Develop a set of procedures for each disaster scenario. Some key questions to ask: What is the chain of command and lines of succession? How is evacuation handled, if necessary? What about accountability for our employees, on-site vendors and visitors? What's the fallback procedure? If communications are completely down, what actions can be taken without any authorization from management? What are the automatic triggers to act?
Each set of procedures should include:
- Methods of contact and communication. This should include multiple methods for the entire enterprise.
- The chain of command.
- Designated disaster authorities.
- A list of your alternative workspaces, including the primary base of command/operations.
- Data backup: This should include a detailed description of how data is backed up normally (day-to-day) and where it can be found and "turned on" for the company in the event of an emergency. In addition, there should be emergency backup instructions in case of evacuation or the like.
- Vital equipment and supplies: what is needed to keep your business running, and where it is located. What are the mission-critical functions of the organization?
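For organizations that keep their procedures in a machine-readable form, the chain of command and the "multiple methods of contact" can be captured as simple data. The following is only a minimal sketch under assumed conditions: the names, roles, and contact methods are hypothetical, and a real plan would draw them from its own records.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Contact:
    """One person in the line of succession (all data here is hypothetical)."""
    name: str
    role: str
    # Contact methods in the order they should be tried.
    methods: list[str] = field(default_factory=list)


def acting_authority(succession: list[Contact], unavailable: set[str]) -> Contact | None:
    """Return the first person in the line of succession who is available."""
    for person in succession:
        if person.name not in unavailable:
            return person
    return None


def first_working_method(person: Contact, methods_down: set[str]) -> str | None:
    """Return the first contact method for this person that is still working."""
    for method in person.methods:
        if method not in methods_down:
            return method
    return None


# Illustrative line of succession for a single disaster scenario.
SUCCESSION = [
    Contact("A. Rivera", "Incident commander", ["desk phone", "mobile", "sms"]),
    Contact("B. Chen", "Deputy", ["mobile", "personal email"]),
    Contact("C. Okafor", "Facilities lead", ["mobile", "two-way radio"]),
]

leader = acting_authority(SUCCESSION, unavailable={"A. Rivera"})
method = first_working_method(leader, methods_down={"mobile"})
print(leader.name, "via", method)  # B. Chen via personal email
```

Writing the succession down as an ordered list forces the planning group to answer the questions above explicitly: who is next in line, and how that person is reached when the first method fails.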
Appoint a planning group composed of the stakeholder departments and programs within your organization to take ownership of the plan and its components. There should also be an operational component, with personnel responsible for each aspect of the plan when the plan is activated.
This component also needs redundancy built in. For example, designate several personnel who will oversee employee communication, and several who are responsible for customer communication. When developing your plan, consider your organization's vulnerabilities, which can include location, security threats, etc. Your disaster plan should include:
- Redundancy: This is extremely important to data centers as they contain the lifeblood of many organizations. This means that there are backup systems and an alternative means of communications.
- Evacuation plans: The most important resource to any organization is its people and their knowledge. It's important to get everyone out of harm's way and working in a safe environment. Make certain the plan states whether your key business functions can be moved and, if so, where and how.
- How do people keep working? It is imperative your disaster plan speaks to how people can access information or services in the wake of a disaster.
- Communications: A vital element in any disaster/contingency plan is communications. As with any issue that arises in an enterprise, communicating with the organization, customers, prospects, partners and other audiences about what is happening, why it's happening and when they can expect to be back "online" will help stem general irritation, rumors and the like. A secure portal for employees allows a company to convey expectations efficiently, e.g. whether to report to work, stay home, work remotely or report to a designated offsite location.
Testing and Trying
Having a solid plan in place before disaster strikes is only the first step in proper preparedness. In order for it to work during the event, it needs to be tested prior to an event. Testing must have the participation of the entire organization.
Testing should also be done frequently to address plan, system, personnel and organizational changes. An effective testing program should run at least semiannually. The challenge with this portion of preparedness is not only finding the time to effectively test the plan, but also motivating employees to participate fully on top of their day-to-day work demands. It is recommended that an education and communications program include real-life scenarios. There is no reason why the program can't incorporate some fun as well, so it's seen not as an obligation, but as a break from work!
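One way to keep the semiannual schedule from slipping is to record when each part of the plan was last tested and flag anything that is overdue. The sketch below is illustrative only: the component names and dates are made up, and treating "semiannually" as 183 days is an assumption.

```python
from datetime import date, timedelta

# "At least semiannually" is approximated as 183 days (an assumption).
MAX_INTERVAL = timedelta(days=183)


def overdue_components(last_tested: dict[str, date], today: date) -> list[str]:
    """Return plan components whose last test is older than MAX_INTERVAL."""
    return sorted(
        name
        for name, tested_on in last_tested.items()
        if today - tested_on > MAX_INTERVAL
    )


# Illustrative test log; component names and dates are hypothetical.
LAST_TESTED = {
    "data restore": date(2011, 1, 10),
    "employee notification": date(2010, 8, 2),
    "hot site relocation": date(2010, 9, 20),
}

overdue = overdue_components(LAST_TESTED, today=date(2011, 4, 16))
print(overdue)  # ['employee notification', 'hot site relocation']
```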
Aside from testing systems, rehearsals or role-playing should be part of the program as well. Some rehearsals are announced in advance. Others should be unannounced exercises or drills, and these should always be promoted as a test of the plan. You want to find gaps and shortfalls in your plan so that when a real disaster occurs you are better able to deal with the incident, even if the exact scenario does not unfold. If the exercise is announced, care should be taken to keep the scenario from the players, and to have some unplanned events occur during the exercise (e.g. physical, emotional or psychological unavailability of key personnel/leadership).
Disaster planning is really about the process and less about the technology for disaster recovery. The plan should serve as the framework and general direction to follow but, as previously noted, cannot take into account every scenario or contingency. There is no way to train for all disasters.
However, having a strong, all-hazards-based core plan with policies and standard operating procedures (SOPs) to guide management and employees during a disaster will ensure your organization's survival. The disaster plan does not tell you how to do your job, but rather how to do your job in a compressed timeframe, under stress, and possibly without all your organization's resources in place.