The Business Continuity Lessons from September 11

Business continuity is about the enterprise's ability to respond effectively to the unexpected - whatever it is.

Within minutes of the first plane crashing into the World Trade Centre in New York City's financial district on September 11, 2001, more than 200 organisations began declaring disasters with their business continuity and disaster recovery service providers. Hundreds more put their recovery plans in motion using internal resources.

But recovery for some capabilities, facilities and people is still under way eight months after the attacks. Some businesses still operate in cramped, temporary quarters and will continue to do so until permanent space is ready. Some individuals still relive the experience and suffer from the physical and psychological effects.

The research group of Gartner's CIO Community, Executive Programs (EXP), worked with seven organisations in New York and Washington that were affected by the attacks. In addition, we analysed data from the 230 organisations that participated in the joint GartnerEXP and Society for Information Management (SIM) Business Continuity Readiness Survey conducted after September 11. Combined with our research from other sources, these results provide a global picture of how organisations responded and the lessons learned.

Examine all aspects of your business continuity plan. The first finding is that all aspects of your business continuity plans, from the assumptions upon which they are based to their rehearsal frequency, need to be re-examined in light of the events of September 11.

Start with planning to protect your human assets. You need to always know where your people are. Then you need to get people back to work as soon as they are ready, while seeking to minimise staff burnout.

Manage your productivity losses. A predictable environment is a key requirement for high productivity, and predictability is the first thing to go in a disaster. One of the lessons learned is that disruptions continue long after a disaster has occurred and initial recovery is completed.

For example, some organisations had to relocate several times during recovery before they got to a new permanent location. Relocating, cramped temporary quarters and poor system performance disrupt productivity.

Rehearse, document, and build resilience. In the September 11 crisis, many organisations experienced problems with their Plan A procedures and did not have a Plan B. Options need to be rehearsed with alternative participants to protect the enterprise in the unfortunate event of the loss of key leadership.

Do not rely on people alone to maintain your organisation's knowledge of critical processes and people. Spread the knowledge and document it.

Extend your recovery-planning window. When very bad things happen, it may take a lot longer than the three to five days you planned for restoring your critical business functions and returning to your primary location. Your primary location may no longer exist or access to it may be denied for weeks.

Closing off most of Lower Manhattan as a restricted zone, while achieving control, hampered some business recovery efforts. People couldn't return to their offices to salvage critical equipment, data, or documents. We heard numerous stories about police being begged to turn their backs to allow someone to rescue a critical server or backup tapes.

Of the EXP/SIM survey participants who declared disasters and occupied recovery sites, the average stay was 28 days and the maximum was 70 days.

Build a resilient IT infrastructure that includes the network, the distributed systems, and processing capacity.

Increase network survivability. Restoring telecommunications services took longer than expected due to the massive scale of destruction. As one interviewee put it, "We built a self-healing network. But the infrastructure under the network wasn't self-healing."

Ensure that your telecommunications providers and other key suppliers have adequate business continuity plans. Ask them about their key points of failure, which scenarios they've considered, and what steps they've taken to prepare for those scenarios.

Eliminate single points of failure in your network design. Diversify the carriers, technologies, and circuit paths you rely on. Organisations in Lower Manhattan learned the hard way that multiple carriers shared one hub. When it was destroyed, 30 per cent of telecommunications traffic was lost with it.
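
One way to look for this kind of hidden dependency is to list the facilities each circuit passes through and check which ones appear on every path. The following is a minimal Python sketch, not a network audit tool. The circuit records, carrier names, and facility labels are hypothetical, and real path data would have to come from your carriers.

```python
def shared_facilities(circuits):
    """Return the facilities that every circuit passes through.

    Each circuit is a dict with a hypothetical 'carrier' name and a 'path'
    listing the intermediate facilities (hubs, points of presence) it uses.
    Any facility returned here is a single point of failure for the set.
    """
    paths = [set(circuit["path"]) for circuit in circuits]
    if not paths:
        return set()
    return set.intersection(*paths)


# Two carriers that look diverse on paper but share one hub (illustrative data).
undiversified = [
    {"carrier": "Carrier A", "path": ["hub-1", "pop-a"]},
    {"carrier": "Carrier B", "path": ["hub-1", "pop-b"]},
]

# The same carriers routed through physically separate hubs.
diversified = [
    {"carrier": "Carrier A", "path": ["hub-1", "pop-a"]},
    {"carrier": "Carrier B", "path": ["hub-2", "pop-b"]},
]

print(sorted(shared_facilities(undiversified)))  # ['hub-1']
print(sorted(shared_facilities(diversified)))    # []
```

An empty result only means that no facility is common to all of the listed circuits. It says nothing about dependencies that the path data omits, such as shared conduits or power feeds.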

Improve distributed computing backup. Many of the backup processes and techniques developed for mainframes are now available for distributed systems.

Rethink capacity requirements. Many organisations approached disaster recovery planning with the idea that minimum configurations would suffice until the recovery was well under way. But our interviewees experienced surges of Web site hits, e-mails, phone calls, and customer enquiries and transactions right after the event occurred. They had to handle peak loads using degraded capacity.
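
The gap between a post-event surge and a minimum recovery configuration can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the parameters, and every figure in the example are assumptions chosen for the illustration, not data from the survey or the interviews.

```python
def capacity_shortfall(production_capacity, recovery_fraction,
                       normal_peak_load, surge_factor):
    """Estimate the load a recovery site cannot serve during a surge.

    production_capacity: what the primary site can handle (e.g. requests/min)
    recovery_fraction:   share of that capacity provisioned at the recovery site
    normal_peak_load:    the usual peak load, in the same units
    surge_factor:        multiplier applied to the peak right after an event
    Returns 0.0 when the recovery site can absorb the surge.
    """
    recovery_capacity = production_capacity * recovery_fraction
    surge_load = normal_peak_load * surge_factor
    return max(0.0, surge_load - recovery_capacity)


# Hypothetical figures: a recovery site sized at half of production,
# facing a surge of 1.5 times the normal peak.
shortfall = capacity_shortfall(
    production_capacity=1000,
    recovery_fraction=0.5,
    normal_peak_load=800,
    surge_factor=1.5,
)
print(shortfall)  # 700.0
```

With these assumed numbers, the recovery site would fall 700 units short of demand, which is the situation the interviewees described: peak loads arriving while capacity is degraded.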

Understand how to manage a crisis

Two critical crisis management roles are coordinating recovery efforts, and communicating internally and externally.

Provide central coordination, not micromanagement. The CIOs we interviewed concurred that their people made the right decisions about what to do with systems resources during the crisis much faster than if the CIO had tried to micromanage every decision. One CIO said she learned quickly that her role was to negotiate for her people with senior management to get the resources they needed fast, not to tell them what to do.

Micromanagement is a symptom of managerial stress. When the world is out of control, people want to control anything they can. CIOs, who often must be control-oriented to manage their high-volume production shops, should therefore be alert to any tendencies to over-control decisions during recovery.

Keep your employees well informed and address the rumour mill. Rumours flourish when real information is in short supply. The simplest antidote, therefore, is frequent communication of useful facts. Be proactive; don't wait until you're asked the status of your people, the company, and the recovery. Be alert to questions employees ask, and get them the answers using all available channels. Make sure that reliable sources of external information are available as well. To keep his staff informed, one CIO set up TV monitors running Cable News Network (CNN) 24x7 in his company's offsite data centre and in the main office cafeteria.

Maintain consistent communications with stakeholders outside the company too. Policies need to specify who is authorised to speak for the company during a crisis. Spokespeople should be given media training and practice delivering their messages during business continuity plan rehearsals. Others should be instructed to defer to these selected spokespeople.

If there's any doubt about how to conduct crisis communications to the public, planning, and perhaps execution, should be handled by a public relations firm with expertise and experience in crisis communications.

Strengthen business continuity planning governance

One of the more difficult challenges cited was restoring business processes, personnel and facilities. The message is clear: demonstrate board and executive support and funding for business continuity planning improvements now, while it is still "front of mind". Five steps can strengthen the governance of your business continuity planning process.

The first is to increase senior management participation. The executive in charge of business continuity planning should be at the CXO level because CXO-level executives will ultimately be responsible for the consequences of planning failures - even if they're not involved in the planning. In many organisations, senior departmental managers were responsible for business continuity planning for their departments. The result was a disjointed effort that lacked the enterprisewide view.

Step two is to appoint a business continuity manager. The tragedy of the "commons" (town square) is that everyone is responsible for maintaining the commons, and therefore no one is. The result is that the "commons" inevitably degrades. Business continuity, one such commons, needs a business continuity manager as the single point of responsibility and coordination so it gets done.

Step three is to make business continuity planning a joint business/IS effort. Business continuity is more than recovering IT assets, which has been the traditional focus of disaster recovery planning. It's about recovering business processes and the people that keep the processes running. Only the business people understand their processes and their resource requirements well enough to make the required planning trade-offs and decisions.

Step four is to ensure formal business continuity status reporting. Regular reporting ensures that issues surface and are resolved. In the absence of regular reporting, everyone will assume that someone else has handled any problems. It also keeps the pressure on the staff to make progress and keeps business continuity planning from falling to the bottom of the priority queue.

Finally, integrate business continuity planning with other internal processes. To safeguard and restore business processes effectively, business continuity plans need to be coordinated with the IT system development life cycle, the enterprise's real estate strategy, insurance and risk management, and operations strategies and plans.

The next disaster will be different, but there will be one . . .

Generals, so it's said, are always fighting the last war. All of us tend to think of the next threat in the same terms as the last one. But the next business continuity crisis is less likely to involve planes used as bombs than something new, different, and unexpected. The source may be natural, like an earthquake or a hurricane. Or it may be man-made. It may be unintentional, like an industrial accident or a derailed train, or malicious, like a biological or cyber attack. The next attack may be with weapons of mass disruption, rather than mass destruction.

Business continuity is ultimately about the enterprise's ability to respond effectively to the unexpected. The organisations that anticipated and rehearsed disaster recovery and business continuity scenarios were able to manage a scenario on September 11 that was well beyond their worst expectations.

Dr Marianne Broadbent is group vice president and global head of research for Gartner's Executive Programs