Business Continuity Plan

Information Technology Risk Management

  • Reading(s), from Gibson

    • Chapter 12: Mitigating Risk with a Business Impact Analysis

    • Chapter 13: Mitigating Risk with a Business Continuity Plan

  • Review Lecture 1

  • Review Lecture 2

Book:

  • Information systems security & assurance series
  • Jones & Bartlett Learning
  • Managing Risk in Information Systems, Darril Gibson, second edition

“Business Continuity Plan” Please respond to the following:

  • Read the Business Continuity Plan for MIT. Then, recommend two (2) additional components to improve the plan. Justify your recommendations. Please see the MIT Business Continuity Plan below.

 

Please:

  • List all references.
  • Cover page is not needed.
  • Number of pages needed: 4, plus a 1-page summary (5 pages total).
  • Please use additional info (slides script) on this attachment and MIT BUSINESS CONTINUITY PLAN when elaborating on this subject.

 

 

MIT BUSINESS CONTINUITY PLAN

This is an external release of the MIT Business Continuity Plan.

For information on the plan or Business Continuity Planning at MIT, call Jerry Isaacson, MIT Information Security Office, at (617) 253-1440 or send e-mail to [email protected]

Copyright 1995 Massachusetts Institute of Technology

To Page the BCMT Duty Person:

Duty Person    To leave just a call-back phone number, dial:    To leave an 80-character message, call and give PIN #:

1

2

For recorded disaster recovery status reports and announcements during the emergency, call:

Table of Contents

Part I. Introduction 1
  Introduction to This Document 1
Part II. Design of the Plan 3
  Overview of the Business Continuity Plan 3
  Purpose 3
  Assumptions 3
  Development 4
  Maintenance 4
  Testing 4
  Organization of Disaster Response and Recovery 4
  Administrative Computing Steering Committee 4
  Business Continuity Management Team 5
  Business Continuity Management Team 5
  Institute Support Teams 6
  Disaster Response 7
  Disaster Detection and Determination 7
  Disaster Notification 8
  Initiation of the Institute’s Business Continuity Plan 8
  Activation of a Designated Hot Site 8
  Dissemination of Public Information 9
  Disaster Recovery Strategy 9
  Scope of the Business Continuity Plan 11
  Category I – Critical Functions 11
  Category II – Essential Functions 11
  Category III – Necessary Functions 11
  Category IV – Desirable Functions 11
Part III. Team Descriptions 12
  Institute Support Teams 14
  Business Continuity Management Team 14
  Damage Assessment/Salvage 15
  Campus Police 16
  MIT News Office – Public Information 17
  Insurance 19
  Telecommunications 20
Part IV. Recovery Procedures 21
  Notification List 21
  To reach the BCMT Duty Person: 22
  Business Continuity Management Team Coordinator 25
  Damage Assessment/Salvage 26
  Salvage Operations 27
  Campus Police 28
  MIT News Office – Public Information 29
  Insurance Team 31
  Telecommunications 32
Appendix A – Recovery Facilities 33
  Emergency Operations Centers 33
Appendix B – Category I, II & III functions 34
Appendix C – Plan Distribution List 35
Business Continuity Management Team 37
BCMT Duty Person Procedures 38
GUIDE TO BCMT ACTIVATION 39

Part I. Introduction

Part I contains information about this document, which provides the written record of the Massachusetts Institute of Technology Business Continuity Plan.

Introduction to This Document

Planning for the business continuity of MIT in the aftermath of a disaster is a complex task. Preparation for, response to, and recovery from a disaster affecting the administrative functions of the Institute requires the cooperative efforts of many support organizations in partnership with the functional areas supporting the “business” of MIT. This document records the Plan that outlines and coordinates these efforts, reflecting the analyses by representatives from these organizations and by the MIT Information Security Officer, Gerald I. Isaacson.

For use in the event of a disaster, this document identifies the computer recovery facilities (hot sites and shell sites – see Page 33) that have been designated as backups if the functional areas are disabled.

How To Use This Document

Use this document to learn about the issues involved in planning for the continuity of the critical and essential business functions at MIT, as a checklist of preparation tasks, for training personnel, and for recovering from a disaster. This document is divided into four parts, as the table below describes.

Part    Contents

  I. Information about the document itself.
  II. Design of the Plan that this document records, including information about the overall structure of business continuity planning at MIT.
  III. General responsibilities of the individual Institute Support Teams that together form the Business Continuity Management Team, emphasizing the function of each team and its preparation responsibilities.
  IV. Recovery actions for the Institute Support Teams and important checklists such as the notification list for a disaster and an inventory of resources required for the environment. [Note: If a “disaster” situation arises, Part IV of the Plan is the only part that needs to be referenced. It contains all of the procedures and support information for recovery.]

Audience

This document addresses several groups within the MIT central administration with differing levels and types of responsibilities for business continuity, as follows:

    • Administrative Computing Steering Committee
    • Business Continuity Management Team
    • Institute Support Teams
    • Functional Area Recovery Management (FARM) Teams

It should be emphasized that this document is addressed particularly to the members of the Business Continuity Management Team, since they have the responsibility of preparing for, responding to, and recovering from any disaster that impacts MIT. Part III of this document describes the composition of the Business Continuity Management Team in detail.

Distribution

As the written record of the Institute’s Business Continuity Plan, this document is distributed to each member of the Business Continuity Management Team, including members of the Institute Support Teams (see Appendix C – Distribution List, Page 33).

It is also distributed to members of the Administrative Computing Steering Committee, FARM Team Coordinators, Information Systems Directors and others not primarily involved with the direct recovery effort.

Part II. Design of the Plan

Part II describes the philosophy of business continuity planning at MIT generally, and the kind of analysis that produced this Plan. It also provides an overview of the functions of the Business Continuity Management Team in implementing this Plan.

Overview of the Business Continuity Plan

Purpose

MIT increasingly depends on computer-supported information processing and telecommunications. This dependency will continue to grow with the trend toward decentralizing information technology to individual organizations within MIT administration and throughout the campus.

The increasing dependency on computers and telecommunications for operational support poses the risk that a lengthy loss of these capabilities could seriously affect the overall performance of the Institute. A risk analysis identified several systems as belonging to risk Category I, comprising those functions whose loss could cause a major impact to the Institute within hours. It also categorized a majority of Institute functions as Essential, or Category II – requiring processing support within week(s) of an outage. This risk assessment process will be repeated on a regular basis to ensure that changes to our processing and environment are reflected in recovery planning.

MIT administration recognizes the low probability of severe damage to the data processing, telecommunications, or support services capabilities that support the Institute. Nevertheless, because of the potential impact to MIT, a plan for reducing the risk of damage from a disaster, however unlikely, is vital. The Institute’s Business Continuity Plan is designed to reduce the risk to an acceptable level by ensuring the restoration of Critical processing within hours, and all essential production (Category II processing) within week(s) of the outage.

The Plan identifies the critical functions of MIT and the resources required to support them. The Plan provides guidelines for ensuring that needed personnel and resources are available for both disaster preparation and response and that the proper steps will be carried out to permit the timely restoration of services.

This Business Continuity Plan specifies the responsibilities of the Business Continuity Management Team, whose mission is to establish Institute level procedures to ensure the continuity of MIT’s business functions. In the event of a disaster affecting any of the functional areas, the Business Continuity Management Team serves as liaison between the functional area(s) affected and other Institute organizations providing major services. These services include the support provided by Physical Plant, security provided by the Campus Police, and public information dissemination handled by the MIT News Office, among others.

Assumptions

The Plan is predicated on the validity of the following three assumptions:

    • The situation that causes the disaster is localized to the data processing facility of Operations and Systems in ; the building or space housing the functional area; or to the communication systems and networks that support the functional area. It is not a general disaster, such as an earthquake or the “Blizzard of ’78,” affecting a major portion of metropolitan Boston.

It should be noted, however, that the Plan will still be functional and effective even in an area-wide disaster. Even though the basic priorities for restoration of essential services to the community will normally take precedence over the recovery of an individual organization, the Institute’s Business Continuity Plan can still provide for a more expeditious restoration of our resources for supporting key functions.

    • The Plan is based on the availability of the hot sites or the back-up resources, as described in Part IV. The accessibility of these, or equivalent back-up resources, is a critical requirement.
    • The Plan is a document that reflects the changing environment and requirements of MIT. Therefore, the Plan requires the continued allocation of resources to maintain it and to keep it in a constant state of readiness.

Development

MIT’s Information Security Officer, with assistance from key Institute support areas, is responsible for developing the Institute’s Business Continuity Plan. Development and support of individual FARM Team Plans are the responsibility of the functional area planning for recovery.

Maintenance

Ensuring that the Plan reflects ongoing changes to resources is crucial. This task includes updating the Plan and revising this document to reflect updates; testing the updated Plan; and training personnel. The Business Continuity Management Team Coordinators are responsible for this comprehensive maintenance task.

Quarterly, the Business Continuity Management Team Coordinators ensure that the Plan undergoes a more formal review to confirm the incorporation of all changes since the prior quarter. Annually, the Business Continuity Management Team Coordinators initiate a complete review of the Plan, which could result in major revisions to this document. These revisions will be distributed to all authorized personnel, who exchange their old plans for the newly revised plans. At that time the Coordinators will provide an annual status report on continuity planning to the Administrative Computing Steering Committee.

Testing

Testing the Business Continuity Plan is an essential element of preparedness. Partial tests of individual components and recovery plans of specific FARM Teams will be carried out on a regular basis. A comprehensive exercise of our continuity capabilities and support by our designated recovery facilities will be performed on an annual basis.

Organization of Disaster Response and Recovery

The organizational backbone of business continuity planning at MIT is the Business Continuity Management Team. In the event of a disaster affecting an MIT organization or its resources, the Business Continuity Management Team will respond in accordance with this Plan and will initiate specific actions for recovery. The Business Continuity Management Team is called into action under the authority of the Administrative Computing Steering Committee, which has the responsibility for approving actions regarding Business Continuity Planning at MIT.

Administrative Computing Steering Committee

    • Senior Vice President, Chairman of the Committee. Manages and directs the recovery effort. Provides liaison with senior MIT management for reporting the status of the recovery operation.
    • Vice President for Financial Operations. Provides liaison with the Committee for support of critical business functions affected by the disaster.
    • Vice President for Information Systems. Coordinates all data processing and telecommunications systems recovery, including operational restoration of Building O&S and operations at the designated hot site.
    • Vice President for Research. Provides liaison with the Committee for support of critical business functions affected by the disaster.
    • Vice President for Resource Development. Provides liaison with the Committee for support of critical business functions affected by the disaster.
    • Executive Vice President, Alumni Association. Provides liaison with the Committee for support of critical business functions affected by the disaster.
    • Assistant to Provost. Provides liaison with the Committee for support of critical business functions affected by the disaster.

Business Continuity Management Team

For the business continuity of MIT systems, two organizations are primary: the Business Continuity Management Team, with its Institute Support Teams, and the Functional Area Recovery Management (FARM) Team for the area affected. In the event of a disaster, the BCMT provides general support, while the FARM Team is concerned with resources and tasks integral to running the specific functional area.

This section provides general information about the organization of recovery efforts and the role of the Business Continuity Management Team. Part III of this document describes the Business Continuity Management Team and the responsibilities of each Institute Support Team in detail.

Business Continuity Management Team.

The Business Continuity Management Team is composed of upper-level managers in MIT administration. The following is a list of each position on the Business Continuity Management Team, and a brief overview of each member’s responsibilities:

    • Information Security Officer. As Co-Coordinator of the Business Continuity Management Team, with the Coordinator of the O&S FARM Team, provides liaison between the Institute’s operational and management teams and the FARM teams in affected areas. Also responsible for ongoing maintenance, training and testing of the Institute’s Business Continuity Plan. Coordinates the Institute Support Teams under the auspices of the Business Continuity Management Team.
    • Director, Operations and Systems. Coordinates support for data processing resources at the main data center and the designated recovery sites.
    • Director, Telecommunications Systems. Provides alternate voice and data communications capability in the event normal telecommunication lines and equipment are disrupted by the disaster. Evaluates the requirements and selects appropriate means of backing up the MIT telecommunications network.
    • Chief, Campus Police. Provides for physical security and emergency support to affected areas and for notification mechanisms for problems that are or could be disasters. Extends a security perimeter around the functional area affected by the disaster.
    • Director, Physical Plant. Coordinates all services for the restoration of plumbing, electrical, and other support systems as well as structural integrity. Assesses damage and makes a prognosis for occupancy of the structure affected by the disaster.
    • Director of Insurance and Legal Affairs. Provides liaison to insurance carriers and claims adjusters. Coordinates insurance program with continuity planning programs.
    • Director, MIT News Office. Communicates with the news media, public, staff, faculty, and student body who are not involved in the recovery operation.
    • Personnel Department. Provides support for human resources elements of recovery and staff notification through the emergency broadcast service.
    • Director, Distributed Computing & Network Services. Provides network support for Administrative and Academic Computing and other distributed services and networks.
    • Assistant to the Vice President, for Information Systems. Represents the Office of the President. Liaison to FARM Teams in the President’s Office.
    • Associate Comptroller, Comptroller’s Accounting Office. Represents the Vice President for Financial Operations. Liaison to Financial Operations FARM Teams.
    • Manager, Audit Division. Provides audit support during the emergency. Makes recommendations on changes to the normal control procedures necessitated by the recovery process.
    • Safety Office. Coordinates risk reduction and avoidance activities and emergency response with the BCMT.
    • Emergency Response Team. This unit, headed by the Physical Plant Mechanical Engineering Manager, provides the initial response to the majority of campus emergencies.

Institute Support Teams:

Under the overall direction of the Business Continuity Management Team, support is provided to assist a functional area’s recovery by Institute Support Teams. These teams, described below, work in conjunction with the FARM Team of the area affected by the problem condition to restore services and provide assistance at the Institute level. In many cases, the organizations comprising these support teams have as their normal responsibility the provision of these support services. This support is generally documented in a procedures manual for the organization. The Business Continuity Plan is an adjunct to that documentation and highlights, in particular, the interfaces between the campus level service and the individual FARM Team operations requirements. In cases where the documentation in this Plan and the organization’s documents differ, the organization’s documentation has precedence.

    • Damage Assessment/Salvage Team. Headed by the Administrative Officer for Physical Plant and activated during the initial stage of an emergency, the team reports directly to the Business Continuity Management Team, evaluates the initial status of the damaged functional area, and estimates both the time to reoccupy the facility and the salvageability of the remaining equipment. This team draws members from the Physical Plant Office, from Operations and Systems, Telecommunications Systems, Distributed Computing & Network Services and from the FARM team of the affected area as well as appropriate vendors supporting our environment.
    • Following the assessment of damage, the team is responsible for salvaging equipment, data and supplies following a disaster; identifying which resources remain; and determining their future utilization in rebuilding the data center and recovery from the disaster. The members of the Damage Assessment Team become the Salvage Team.
    • Transportation Team. A temporary Institute Support Team headed jointly by the Computer Operations Manager in Operations and Systems and by the Associate Director of Operations for Physical Plant, responsible for transporting resources (personnel, equipment, and materials) to back-up sites as necessary. This team draws members from two organizations: Information Systems personnel who normally operate the shuttle bus between and Physical Plant personnel who normally transport heavy equipment within the Institute.
    • Public Information. The interface with the media, the general public and faculty, staff and students who are not participating in the recovery process is handled by the MIT News Office, working closely with the Personnel Department.
    • Telecommunications Team. Headed by the Director of the Information Systems Telecommunications Department, this team is responsible for establishing voice and data communications between the affected site and the remainder of the campus.

Disaster Response

This section describes six required responses to a disaster, or to a problem that could evolve into a disaster:

  1. Detect and determine a disaster condition
  2. Notify persons responsible for recovery
  3. Initiate the Institute’s Business Continuity Plan
  4. Activate the designated hot site
  5. Disseminate Public Information
  6. Provide support services to aid recovery

Each subsection below identifies the organization(s) and/or position(s) responsible for each of these six responses.

Disaster Detection and Determination

The detection of an event which could result in a disaster affecting information processing systems at MIT is the responsibility of Physical Plant Operations (PPO), Campus Police, Information Systems, or whoever first discovers or receives information about an emergency situation developing in one of the functional areas , Building other building on campus housing major information processing systems or about the communications lines between these buildings.

Disaster Notification

PPO will follow existing procedures and notify the individuals who are acting as the Business Continuity Management Team Duty Persons (DP). The DP on call will monitor the evolving situation and, if appropriate, will then notify the Business Continuity Management Team representative based upon a predefined set of notification parameters (see Page 22).

When a situation occurs that could result in an interruption of processing on major information processing systems or networks on campus, the following people must be notified:

  • Normally, Physical Plant Operations and/or the Campus Police receive the initial notice through their alarm monitoring capabilities. If the problem does not activate a normal alarm system, immediately notify these two areas.
  • Chairman of the Administrative Computing Steering Committee
  • Vice President for Information Systems
  • The Business Continuity Management Team Coordinator (Information Security Officer)
  • The Operations and Systems FARM Team Coordinator
  • The Telecommunications and Distributed Computing & Network Services FARM Team Coordinators (if the situation affects the data or voice transmission lines or facilities)
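The notification rule above can be sketched as a small function. This is an illustrative sketch only, not part of the Plan: the function name `notification_list` and the `affects_transmission` flag are hypothetical, and the role names are copied from the list above.

```python
def notification_list(affects_transmission: bool) -> list[str]:
    """Return the roles to notify, in the order the list above gives them.

    `affects_transmission` is a hypothetical flag: True when the situation
    affects the data or voice transmission lines or facilities.
    """
    roles = [
        "Physical Plant Operations",
        "Campus Police",
        "Chairman of the Administrative Computing Steering Committee",
        "Vice President for Information Systems",
        "Business Continuity Management Team Coordinator (Information Security Officer)",
        "Operations and Systems FARM Team Coordinator",
    ]
    # The last two coordinators are notified only for transmission problems.
    if affects_transmission:
        roles += [
            "Telecommunications FARM Team Coordinator",
            "Distributed Computing & Network Services FARM Team Coordinator",
        ]
    return roles
```

Calling `notification_list(False)` yields the six roles that are always notified; passing `True` appends the two network-related coordinators.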

Initiation of the Institute’s Business Continuity Plan

Initiation of this Plan is the responsibility of the Business Continuity Management Team Coordinator or any member of the Business Continuity Management Team or the Administrative Computing Steering Committee.

Activation of a Designated Hot Site

The responsibility for activating any of the designated hot sites or back-up resources is delegated to the Vice President for Information Systems. In the absence of the Vice President, responsibility reverts to the Director of Information Systems Operations & Systems or the Coordinator of the O&S Functional Area Recovery Management Team. Within hours of the occurrence, the Vice President for Information Systems, or alternate, determines the prognosis for recovery of the damaged functional area through consultation with the Information Security Officer and the Damage Assessment Team, headed by Physical Plant, which also includes representatives from Operations and Systems, Telecommunications Systems and the functional areas affected.

If the estimated occupancy or recovery of the damaged functional area cannot be accomplished within hours, the usual occupants of the designated back-up site are notified of the intention to occupy their facility.

Dissemination of Public Information

The Director of the MIT News Office is responsible for directing all meetings and discussions with the news media and the public, and in conjunction with the Personnel Department, with MIT personnel not actively participating in the recovery operation. In the absence of the MIT News Office representative, the responsibility reverts to the senior official present at the scene.

Recovery Status Information Number (617) has been established as a voice mail information number for posting recovery status and information notices. All reports will be placed by the Continuity Planning Coordinators or the Telecommunication FARM team leader.

Provision of Support Services to Aid Recovery

During and following a disaster, Institute Support Teams, as described on page 14, are responsible for aiding the FARM Teams. They operate under the direction of the Business Continuity Management Team through the Recovery Coordinator (the Information Security Officer).

Disaster Recovery Strategy

The disaster recovery strategy explained below pertains specifically to a disaster disabling the main data center. This functional area provides mainframe computer and major server support to MIT’s administrative applications. Especially at risk are the critical applications, those designated as Category I systems (see below). The O&S FARM Team Plan provides for recovering the capacity to support these critical applications within hours. Summarizing the provisions of the O&S Plan, subsections below explain the context in which the Institute’s Business Continuity Plan operates. The Business Continuity Plan complements the strategies for restoring the data processing capabilities normally provided by Operations & Systems.

This section addresses three phases of disaster recovery:

  • Emergency
  • Backup
  • Recovery

Strategies for accomplishing each of these phases are described below. It should be noted that the subsection describing the emergency phase applies equally to a disaster affecting the Administration Building or other building on campus, the functional area that provides support for the maintenance of the critical system.

Emergency Phase

The emergency phase begins with the initial response to a disaster. During this phase, the existing emergency plans and procedures of Campus Police and Physical Plant direct efforts to protect life and property, the primary goal of initial response. Security over the area is established as local support services such as the Police and Fire Departments are enlisted through existing mechanisms. The BCMT Duty Person is alerted by pager and begins to monitor the situation.

If the emergency situation appears to affect the main data center (or other critical facility or service), either through damage to data processing or support facilities, or if access to the facility is prohibited, the Duty Person will closely monitor the event, notifying BCMT personnel as required to assist in damage assessment. Once access to the facility is permitted, an assessment of the damage is made to determine the estimated length of the outage. If access to the facility is precluded, then the estimate includes the time until the effect of the disaster on the facility can be evaluated.

If the estimated outage is less than hours, recovery will be initiated under normal Information Systems operational recovery procedures. If the outage is estimated to be longer than hours, then the Duty Person activates the BCMT, which in turn notifies the Chairman of the Administrative Computing Steering Committee and Vice President for Information Systems and the Business Continuity Plan is activated. The recovery process then moves into the back-up phase.
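The decision rule in the paragraph above can be sketched as follows. This is a minimal illustration under stated assumptions: the hour threshold is not given in this release of the Plan, so it is a required parameter here; the names `NextStep` and `emergency_phase_decision` are hypothetical; and treating an estimate exactly equal to the threshold as an activation is an assumption, since the Plan only addresses "less than" and "longer than".

```python
from enum import Enum


class NextStep(Enum):
    NORMAL_RECOVERY = "recover under normal Information Systems operational procedures"
    ACTIVATE_BCMT = "Duty Person activates the BCMT; the Business Continuity Plan is activated"


def emergency_phase_decision(estimated_outage_hours: float,
                             threshold_hours: float) -> NextStep:
    """Choose the next step from the estimated length of the outage.

    threshold_hours must be supplied by the caller; this sketch does not
    assume any particular value for it.
    """
    if estimated_outage_hours < 0 or threshold_hours <= 0:
        raise ValueError("outage estimate must be >= 0 and threshold must be > 0")
    if estimated_outage_hours < threshold_hours:
        return NextStep.NORMAL_RECOVERY
    # Assumption: an estimate equal to the threshold is handled conservatively.
    return NextStep.ACTIVATE_BCMT
```

Keeping the threshold as an argument means the same rule can be reused if a later risk assessment changes the figure.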

The Business Continuity Management Team remains active until recovery is complete to ensure that the Institute will be ready in the event the situation changes.

Back-up Phase

The back-up phase begins with the initiation of the appropriate FARM Team Plan(s) for outages enduring longer than hours. In the initial stage of the back-up phase, the goal is to resume processing critical applications. Processing will resume either at the main data center or at the designated hot site, depending on the results of the assessment of damage to equipment and the physical structure of the building.

In the back-up phase, the initial hot site must support critical (Category I) applications for up to weeks and as many Category II applications as resources and time permit. During this period, processing of these systems resumes, possibly in a degraded mode, up to the capacity of the hot site. Within this -week period, the main data center will be returned to full operational status if possible.

However, if the damaged area requires a longer period of reconstruction, then the second stage of back-up commences. During the second stage, a shell facility (a pre-engineered temporary processing facility that we have contracted to use for this purpose) is assembled on the parking lot and equipment installed to provide for processing all applications until a permanent site is ready. See Page 33 for a list of the designated recovery sites.

Recovery Phase

The time required for recovery of the functional area and the eventual restoration of normal processing depends on the damage caused by the disaster. The time frame for recovery can vary from several days to several months. In either case, the recovery process begins immediately after the disaster and takes place in parallel with back-up operations at the designated hot site. The primary goal is to restore normal operations as soon as possible.

Scope of the Business Continuity Plan

The object of this Plan is to restore critical (Category I) systems within hours, and Essential (Category II) systems within week(s) of a disaster that disables any functional area and/or essential equipment supporting the systems or functions in that area.

The initial Risk Assessment of the computer applications that support MIT administration assigned systems to Category I Critical. This risk category identifies applications that have the highest priority and must be restored within hours of a disaster disabling a functional area. Specifically, each function of these systems was evaluated and allocated a place in one of four risk categories, as described below.

  • Category I – Critical Functions
  • Category II – Essential Functions
  • Category III – Necessary Functions
  • Category IV – Desirable Functions

Note: Category IV functions are important to MIT administrative processing, but due to their nature, the frequency they are run and other factors, they can be suspended for the duration of the emergency.
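The four risk categories can be modeled as an ordered enumeration. The sketch below is illustrative only: the names `Category` and `restoration_order` are hypothetical, and ordering restoration strictly by category number is an assumption drawn from the priorities described above. Category IV functions are left out because the note says they can be suspended for the duration of the emergency.

```python
from enum import IntEnum


class Category(IntEnum):
    CRITICAL = 1    # Category I
    ESSENTIAL = 2   # Category II
    NECESSARY = 3   # Category III
    DESIRABLE = 4   # Category IV


def restoration_order(functions: dict[str, Category]) -> list[str]:
    """List function names by category, highest priority first.

    Category IV (Desirable) functions are excluded because they can be
    suspended for the duration of the emergency. Ties within a category
    are broken alphabetically, which is an arbitrary choice.
    """
    active = {name: cat for name, cat in functions.items()
              if cat != Category.DESIRABLE}
    return sorted(active, key=lambda name: (active[name], name))
```

The function takes whatever mapping of function names to categories the risk assessment produces; no specific MIT systems are assumed here.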

The administrative systems in Categories I – IV are those that provide Institute-wide services. There are many departmental and laboratory systems as well as non-information processing systems (such as ) that are also essential either for the Institute or for the local area(s) they support. Recovery for these systems too must be based upon an assessment of the impact of their loss and the cost of their recovery. See the Departmental FARM Team Plan document for further information on assessing risk at the departmental level.

Part III. Team Descriptions

Part III describes the organization and responsibilities of the Business Continuity Management Team. Composed of sub-teams (the Institute Support Teams), the Business Continuity Management Team as a whole plans and implements the responses and recovery actions in the event of a disaster disabling either a functional area, Central Administration or the main data center. Its primary role is to provide Institute level support services to any functional area affected by the problem.

  • Information Security Officer. As Business Continuity Management Team Co-coordinator, provides liaison between the Institute’s operational and management teams and the FARM teams in affected areas. Also responsible for ongoing maintenance, training and testing of the Business Continuity Plan. Coordinates the Institute Support Teams under the auspices of the Business Continuity Management Team. The Co-coordinator of the BCMT is the Coordinator of the O&S FARM Team, who will take responsibility for recovery in the absence of the Information Security Officer.
  • Director, Operations and Systems. Provides support for data processing resources, with primary responsibility for restoration of O&S processing. Recovery plans for the computing facilities are the responsibility of the Coordinator of the O&S FARM Team and are described in the O&S FARM Team plan.
  • Director, Telecommunications Systems. Provides alternate voice and data communications capability in the event normal telecommunication lines and equipment are disrupted by the disaster. Evaluates the requirements and selects appropriate means of backing up the MIT telecommunications network. Recovery plans for the primary 5ESS telephone switching equipment in and satellite facilities in other buildings on campus are described in the Telecommunications FARM Team plan.
  • Chief, Campus Police. Provides for physical security and emergency support to affected areas and for notification mechanisms for problems that are or could be disasters. Extends a security perimeter around the functional area affected by the disaster. Provides coordination with public emergency services (Cambridge Police, etc.) as required.
  • Director, Physical Plant. Coordinates all services for the restoration of plumbing and electrical systems and structural integrity. Assesses damage and makes a prognosis for occupancy of the structure affected by the disaster.
  • Director, Safety Office. Coordinates safety and hazardous materials related issues with other organizations involved in recovery planning and response as well as governmental and other emergency services.
  • Director, Personnel Department. Coordinates all activities of the recovery process with key attention to the personnel aspects of the situation. This includes releasing staff from areas affected, initiating emergency notification systems and working with the MIT News Office on dissemination of information about the recovery effort.
  • Director, Distributed Computing & Network Services. Coordinates all services in support of the restoration of network services and support facilities. This includes support for Athena communications services and external network service support.
  • Director, MIT News Office. Communicates with the news media, public, staff, faculty, and student body who are not involved in the recovery operation.
  • Assistant to the Vice President for Information Systems. Represents the Office of the President.
  • Associate Comptroller, Comptroller’s Accounting Office. Represents the Vice President for Financial Operations.
  • Audit Manager, Audit Division. Provides consultation on compensating controls and suggestions on maintaining the appropriate level of controls during the recovery process.

Institute Support Teams

Business Continuity Management Team

  1. Function

To oversee the development, maintenance and testing of recovery plans addressing all Category I and II business functions. In the event of a “disaster,” to manage the backup and recovery efforts and to facilitate support for key business functions and the restoration of normal activities.

  2. Organization

The BCMT is co-chaired by the MIT Information Security Officer and the Coordinator of the O&S FARM Team, who serves in the absence of the Security Officer. The Team is composed of key management personnel from each of the areas involved in the recovery process.

  3. Interfaces

The team interfaces with and is responsible for all business continuity plans and planning personnel at MIT.

  4. Preparation Requirements

On a quarterly basis, the team will meet to review FARM Team plans that have been completed in the last quarter.

On an annual basis, the Team will review the overall status of the recovery plan, and report on this status through the Information Security Officer, to the Administrative Computing Steering Committee.

Individual Team members will prepare recovery procedures for their assigned areas of responsibility at MIT. They will ensure that changes to their procedures are reflected in any interfacing procedures.

The BCMT will ensure that continuing levels of support are available for the FARM Teams that require it.

The BCMT will also review and approve FARM Team plans as they are submitted, re-evaluate the criticality of MIT operating functions at regular intervals and provide for awareness and training in recovery planning. They will also participate in emergency preparedness drills initiated by the Safety Office or other appropriate campus organizations.

Damage Assessment/Salvage

  1. Function

To report to the Business Continuity Management Team (BCMT), within two to four hours after access to the facility is permitted, on the extent of the damage to the affected site, and to make recommendations to the BCMT regarding possible reactivation and/or relocation of data center or user operations. Existing Physical Plant emergency procedures are documented in a manual known as the “Black Book” maintained by Physical Plant. The Business Continuity Plan procedures supplement, and are subordinate to those in the Black Book, which takes precedence in the case of any difference. Following assessment of the damage, the team is then responsible for salvage operations in the area affected.

  2. Organization

Headed by the Administrative Officer for Physical Plant and activated during the initial stage of an emergency, the team reports directly to the Business Continuity Management Team, evaluates the initial status of the damaged functional area, and estimates the time to reoccupy the facility and the salvageability of the remaining equipment. During an emergency situation, the individual designated in the Black Book will take operational responsibility for implementation of damage assessment. This team draws members from the Physical Plant Office, from Operations and Systems, and from the FARM team of the affected area. Following assessment, the team is responsible for salvaging equipment, data, and supplies following a disaster; identifying which resources remain; and determining their future utilization in rebuilding the data center and recovery from the disaster.

  3. Interface

The Damage Assessment/Salvage Team will interface with other Physical Plant operations groups, the Campus Police and Information Systems operations functions, including vendor and insurance representatives, to keep abreast of new equipment, physical structures, and other factors relating to recovery.

  4. Preparation Requirements

Identification of all equipment to be kept current. A quarterly report will be stored off-site. The listing will show all current information, such as engineering change levels, book value, lessor, etc. Configuration diagrams will also be available. Emergency equipment, including portable lighting, hard hats, boots, portable two-way radios, floor plans and equipment layouts will be maintained by Physical Plant.

A listing of all vendor sales personnel, customer engineers and regional sales and engineering offices is to be kept and reviewed quarterly. Names, addresses and phone numbers (normal, home, and emergency) are also to be kept.

Campus Police

  1. Function

To provide for all facets of a positive security and safety posture, to assure that proper protection and safeguards are afforded all MIT employees and Institute assets at both the damaged and backup sites.

  2. Organization

The team will consist of the Campus Police Department Supervisor and appropriate support staff. The team will report through the Chief who is a member of the Business Continuity Management Team.

  3. Interfaces

The Campus Police Team will interface with the following teams or organizational units, relative to security and safety requirements:

Personnel

Physical Plant

Safety Office

Environmental Medical Services

MIT News Office

Other appropriate departments as required

  4. Preparation Requirements

Provide emergency medical services, if necessary.

Identify the number of Campus Police personnel needed to provide physical security protection of both the damaged and backup sites.

Identify the type of equipment needed by Campus Police personnel in the performance of their assigned duties.

Coordinate and arrange for additional security equipment and manpower, as applicable, if needed.

Identify and provide security protection required for the transport of confidential information to and from both off-site and backup sites. Coordinate with the appropriate MIT Department.

Periodically review the level of security needed at both the damaged and backup sites.

MIT News Office – Public Information

  1. Function

The most difficult time to maintain good public relations is when there is an accident or emergency. Public relations planning is required so that when an emergency arises, inquiries from the news media, friends and relatives of staff, faculty, and students can be handled effectively. While we cannot expect to turn a bad situation into a good one, we can assist in making sure facts presented to the public are accurate and as positive as possible given the situation.

It is in our best interest to cooperate with the media as much as possible, so that they will not be forced to resort to unreliable sources to get information that could be untrue and more damaging to the Institute than the facts.

Therefore, it is the policy of MIT in time of emergency, to:

Have the MIT News Office serve as the authorized spokesperson for the Institute. All public information must be coordinated and disseminated by their staff.

Refrain from releasing information on personnel casualties until families have been notified. Once families have been notified, names of those personnel should be released quickly to alleviate the fears of relatives of others.

Provide factual information to the press and authorities as quickly as facts have been verified, and use every means of communications available to offset rumors and misstatements.

Avoid speculating on anything that is not positively verified, including cause of accident, damage estimates, losses, etc. (Fire Officials normally release their own damage estimates.)

Emphasize positive steps taken by the Institute to handle the emergency and its effects.

Situations calling for implementation of the Emergency Public Information Plan may include, but are not limited to:

Systems malfunctions disrupting the normal course of operations.

Accidents, particularly when personal injury results.

Natural disasters, such as fires, floods, tornadoes and explosions.

Civil disorders, such as riots and sabotage.

Executive death.

Scandal, including embezzlement and misuse of funds.

Major litigation initiated by or against the Institute.

  2. Organization

The Director of the MIT News Office, a member of the Business Continuity Management Team, will act as the Public Information Officer for the Institute. The News Office alternates are listed in Appendix A. In their absence the responsibility will revert to the Senior Manager on the scene.

  3. Interfaces

The MIT News Office will be the interface between MIT and the public or news media. Copies of all status reports to the Business Continuity Management Team or Administrative Computing Steering Committee will be forwarded to the Public Information Officer for potential value in information distribution for good public relations. They will work with the Personnel Department in dissemination of information to staff.

  4. Preparation Requirements

Existing relationships with local media will be utilized to notify the public of emergency and recovery status. The Public Information Officer will maintain up-to-date contact information for the media and other required parties.

A facility will be identified to be used as a press room. Arrangements will be made to provide the necessary equipment and support services for the press. Coordination with the Telecommunications Team for additional voice communication, if required, will also be made.

Insurance

  1. Function

To provide for all facets of insurance coverage before and after a disaster and to ensure that the recovery action is taken in such a way as to assure a prompt and fair recovery from our insurance carriers.

  2. Organization

The team will consist of the Director of Insurance and Legal Affairs and required staff and insurance carrier personnel. The team reports through the Business Continuity Management Team, of which it is a member.

  3. Interfaces

The Insurance Team will interface with the following teams, relative to insurance matters:

MIT News Office

Campus Police

Damage Assessment/Salvage

Information Systems Operations

Appropriate FARM Teams

This team will be activated upon the initial notification of a disaster.

  4. Preparation Requirements

Determine needs for insurance coverage. Identify the coverage required for hardware, media, media recovery, liability and extra expense.

Prepare procedure outlining recommended steps to be followed by Damage Assessment/Salvage Team during initial stage of disaster (Appendix A).

List appropriate contacts in Appendix B.

Arrange for availability of both still and video recording equipment to record the damage.

Ensure that an equipment inventory is available, to include model and serial number of all devices.

Evaluate all new products and services offered by MIT for potential liability in the event of a disaster.

Telecommunications

  1. Function

To provide voice and data communications to support critical functions. Restore damaged lines and equipment.

  2. Organization

The team will consist of appropriate Telecommunications Systems staff. Telecommunications Systems will also coordinate with and supervise outside contractors as necessary. The team will report through the Director of Telecommunications Systems, who is a member of the Business Continuity Management Team.

  3. Interfaces

The Telecommunications Systems team will interface with the following teams or organizational units, relative to telecommunications requirements:

Physical Plant

Campus Police

Distributed Computing & Network Services

Other Information Systems departments as necessary

Other MIT departments requiring emergency telecommunications

Outside contractors and service providers as necessary

  4. Preparation Requirements

Provide critical voice and data communications services in the event that normal telecommunications lines and equipment are disrupted or relocation of personnel is necessary.

Consult with outside contractors and service providers to ensure that replacement equipment and materials are available for timely delivery and installation.

Utilize available resources, such as the MIT Cable Television network and voice mail system, to broadcast information relevant to the disaster.

Part IV. Recovery Procedures

Notification List

This appendix contains the names and telephone numbers of managers and personnel who must be notified in the event of a disaster. The Business Continuity Management Team Coordinator is responsible for keeping this notification list up-to-date.

Administrative Computing Steering Committee Chairman

Members

Business Continuity Management Team

Two individuals are assigned responsibility for the interface with other campus organizations, such as Physical Plant Operations, to monitor emergencies as they occur. These Early Warning Duty people are then responsible for activation of the full Business Continuity Management Team and necessary Functional Area Recovery Management Teams.

The BCMT Duty People are equipped with pagers, which are activated by Physical Plant Operations or can be paged directly.

In addition, each Duty Person is equipped with a cellular phone for emergency use.

To reach the BCMT Duty Person:

By Pager:

Duty Person Number | To leave phone number, call: | To leave an 80 character text message, call: (and give PIN # of pager)

1 | |

2 | |

By Cellular Phone:

1

2

Note: these numbers are to be used only in emergencies or for testing.

The people on duty will monitor the situation and determine if it has the potential to impact our processing ability. [See Duty Person procedure for details]

Coordinators

Members

I/S Operations & Systems

Telecommunications

Campus Police

MIT News Office – Public Information

Insurance

Physical Plant:

Emergency Response Team

Operations Center

Safety Office

President’s Office

Comptrollers Accounting Office

Personnel Office

Distributed Computing & Network Services

BCMT Liaison

Housing:

Nuclear Reactor

Plasma Fusion Lab

Medical Department

FARM Team Coordinators

Bursar’s Office Category

Financial Planning & Management Category

Freshman Admissions Category

Operations & Systems Category

Payroll Category

Physical Plant Category

Property Office Category

Purchasing & Stores Category

Registrar’s Office Category

Resource Development Category

Technology Licensing Office Category

Telecommunications Category

Business Continuity Management Team Coordinator

This appendix contains instructions to the Business Continuity Management Team Coordinators for overseeing disaster response and recovery efforts.

Action Procedures

Player | Action

Coordinator | Ensure entire Business Continuity Management Team (BCMT) has been notified. Then notify Vice President for Information Systems and Chairman of Administrative Computing Steering Committee.

Coordinator | Activate the Emergency Operations Center (See Page 33) and notify staff to meet there.

Coordinator | Meet with Damage Assessment Team to review their findings and present results to BCMT.

Coordinator | Present recommendations to BCMT for next steps in recovery effort.

Coordinator | Begin notification of all recovery teams. Check to ensure all recovery participants have been notified.

Coordinator | Monitor the activities of the recovery teams. Assist them as required in their recovery efforts.

Coordinator | Report to BCMT on a regular basis on the status of recovery activities. Report to Administrative Computing Steering Committee as appropriate on recovery status.

Coordinator | On an hourly basis, or other appropriate interval, update the Recovery Status information message on .

Damage Assessment/Salvage

This appendix contains instructions to the Damage Assessment/Salvage Team for disaster response and recovery efforts.

Action Procedures

Player | Action

Building Services | Notify team members and vendors to report to the site for initial damage assessment and clean-up.

Physical Plant AO | Notify insurance representative.

Operations Center | Issue Work Orders and call appropriate personnel.

Team Leader | Request permission to enter site from Fire Department (if required). Take a service representative from each of the appropriate vendors, the insurance claims representative and appropriate Physical Plant and Information Systems personnel into the site.

Team Members | Review and assess the damage to the facility. List all equipment and the extent of damage. List damage to all support systems (power, A/C, fire suppression, communications, etc.).

Team Leader | Notify the BCMT as to the severity of the damage and what can potentially be salvaged.

Team Leader | Notify the BCMT whether the area can be restored to the required level of operational capability in the required time frame.

Salvage Operations

Player | Action

Team Leader | Initiate the Emergency Notification List and have all members report to the Staging Area.

Salvage Team | Have the Building Services Supervisor determine which equipment and furniture can be salvaged. Photograph all damaged areas as soon as possible for potential insurance claims.

Salvage Team | Important: Prior to performing any salvage operation, contact the Insurance Team to coordinate with possible insurance claims requirements and appraisals. Have the Physical Plant Supervisor and staff start salvaging any furniture and equipment. Based upon advice from the Insurance Team and customer engineering, contact computer hardware refurbishers regarding reconditioning of damaged equipment.

Team Leader | Meet with the Business Continuity Management Team Coordinator to provide status on salvage operations.

Configuration List

A sample of the configuration and full equipment inventory report from the Fixed Asset Control Systems or other automated equipment inventories should be inserted here. The Continuity Plan Masters in off-site storage will contain the full listing.

Blueprints

Complete sets of blueprints of the buildings housing critical processing and the data center are maintained at [ ] and in off-site storage.

Campus Police

This appendix contains instructions to the Campus Police for disaster response and recovery efforts.

Action Procedures

Player | Action

Campus Police Duty Sgt. | An MIT Police Case Report will be completed upon stabilization of the disaster situation. As per standard police procedure, this report will detail the names of all victims, witnesses, injuries, facility damage description, etc., as well as list all notifications.

Campus Police Duty Sgt. | Initiate the notification listing of appropriate Campus Police Department Command Staff and personnel (App. A).

Campus Police Day/Night | Notify the Business Continuity Management Team if the emergency affects Data Processing or Telecommunications operations in any way.

Campus Police Duty Sgt. | Assign Campus Police personnel to both the damaged and backup sites, as required.

Campus Police Duty Sgt. | Ensure that all Campus Police personnel are properly equipped at each affected location and the recovery sites. (Page 33)

Campus Police Duty Sgt. | Coordinate the need for additional manpower and equipment as required.

Campus Police Command Staff | Periodically submit status reports to the Continuity Coordinator at the Emergency Control Center.

Campus Police Command Staff | Ensure that all facets of security protection are afforded, relative to entry/exit controls, transportation of information, etc. at both the damaged and backup sites.

MIT News Office – Public Information

Action Procedures

Player | Action

Campus Police | Notify MIT News Office when an emergency occurs.

Public Information Officer | Assess the public relations scope of the emergency, in consultation with senior management if necessary, and determine the appropriate public relations course of action. In instances where media are notified immediately, due to fire department or police involvement, the Public Information Officer will proceed to the scene at once to gather initial facts. Emphasis must be placed upon getting pertinent information to the news media as quickly as possible.

PIO Staff Assistant | Maintain a log of all incoming calls to ensure a quick response to media and other requests.

Public Information Officer | Maintain a log of all information which has been released to the media.

Public Information Officer | When appropriate, prepare news releases on a periodic basis for distribution to the local media list.

Public Information Officer | If employee injuries or fatalities are involved, notify Personnel to send appropriate management personnel to the homes of the involved families.

Personnel | Notify Public Information Officer as soon as families have been informed. This will permit the release of names and addresses of victims so that families of those not involved can be relieved of anxiety.

Public Information Officer | Contact the public relations director(s) at the hospitals where injured have been taken to coordinate the release of information.

Public Information Officer | In cases where long-term media coverage is anticipated, establish a Press Room in the (location to be selected). Provide for telephone requirements of the press.

Public Information Officer | Schedule periodic press conferences, taking into consideration Management personnel who will be participating.

Public Information Officer | If media want to photograph physical damage, clear the request with Campus Police prior to approving it. Then accompany all photographers.

Public Information Officer | Coordinate follow-up news releases after the immediate emergency has passed to present the Institute in as positive a light as possible. Possible topics could include:

What has been done to prevent recurrence of this type of emergency?

What are plans for reconstruction?

What has been done to express gratitude to the community for its help?

What has been done to help employees, students and faculty?

Insurance Team

This appendix contains instructions to the Insurance Team Coordinator for disaster response, salvage and recovery efforts.

Action Procedures

Player | Action

Insurance Team Leader | Contact appropriate Insurance people upon first advice of disaster.

Insurance Team Leader | Meet with Damage Assessment/Salvage team at site.

Insurance Team Leader | Go through disaster scene with Damage Assessment/Salvage team and advise on matters relating to insurance and claims. Ensure that nothing is done to compromise recovery from insurance carrier. Photograph all applicable areas.

Insurance Team Leader | File all appropriate claims forms with all involved insurance carriers. Report status of claims activity to the Business Continuity Management Team.

Telecommunications

This appendix contains instructions to the Telecommunications Systems team for disaster response and recovery efforts.

Action Procedures

Player | Action

HELP Line Personnel or after-hours Duty Person | Receives report of disaster from Physical Plant or Campus Police and notifies appropriate Telecommunications Systems and other personnel.

Director, Telecommunications Systems | Oversees assessment of damage to telecommunications facilities. Directs contingency and recovery efforts. Provides updates to Business Continuity Management Team and MIT administration.

Operations and Customer Service | Arranges for voice and dial-up data communications services to support critical functions. Procures stock to repair or replace damaged equipment. Restores full services in a timely manner.

Transmission Services | Provides data communications facilities or circuits to support critical functions. Assists with restoration of cable and wire plant, as needed. Assists Information Systems and other departments with relocation and restoration of data facilities.

Appendix A – Recovery Facilities

The following facilities have been identified as designated recovery sites for restoration of processing under the MIT Business Continuity Planning strategy.

Emergency Operations Centers

The Emergency Operations Center is the location to be used by the Business Continuity Management Team and their support staff as a location from which to manage the recovery process. As such, the specific location will be selected by the Coordinator at the time of the occurrence. The following are the locations available:

Emergency Operations Center is located in

Central Administration building out of service – Immediately after evacuation of building, the BCMT will convene in Building to coordinate initial response to the event. If the problem appears to be long term – or affects the local area, the BCMT will activate the primary EOC in .

Hot Sites (Operational data centers providing emergency computing resources)

Facilities provided: (See O&S FARM Team Plan)

Shell Sites (Computer conditioned space available to install equipment)

Facilities provided: (See O&S FARM Team Plan)

Appendix B – Category I, II & III functions

For details about each of these functions see the appropriate FARM Team Plan

Appendix C – Plan Distribution List

PLAN DISTRIBUTION MATRIX

ORGANIZATION / RECIPIENT | LOCATION | MIT PLAN COPIES, FARM TEAM COPIES

Business Continuity Management Team

Coordinators | | 2, 1

Audit Division | | 2, 1

Campus Police | | 2, 1

Comptrollers Accounting Office | | 2, 1

CAO Payroll | | 2

Emergency Response Team | | 2, 1

Insurance | | 2, 1

I/S Operations & Systems | | 2, 1

MIT News Office | | 2, 1

Personnel Office | | 2, 1

Physical Plant | | 2, 1

President’s Office | | 2, 1

Safety Office | | 2, 1

Telecommunications | | 2, 1

Distributed Computing & Network Services | | 2, 1

Administrative Computing Steering Committee

Chairman | | 2, 1

 | | 2, 1

FARM Team Coordinators

Bursars Office | | 1

Comptrollers Accounting Office | | 1

CAO – Payroll | | 1

Freshman Admissions Office | | 1

Lincoln Fiscal Office | | 1

Office of Financial Planning & Management | | 1

Purchasing & Stores | | 1

Office of the Registrar | | 1

Technology Licensing Office | | 1

Academic Computing Services | | 1

Administrative Systems Development | | 1

Computing Support Services | | 1

Documentation & Training Services | | 1, 1

I/S VP Office | | 1, 1

Business Continuity Management Team

EARLY WARNING DUTY PROCEDURES

For information call:

BCMT Duty Person Procedures

This booklet contains instructions for the individuals currently assigned to be the active Business Continuity Management Team contact for emergency situations that may develop. The Duty Person is on call 24 hours a day for the one month assignment. The two people assigned as Duty Persons (DP) will be equipped with a pager and a cellular phone – both to be used for BCMT testing and emergencies only. Each person will pass the equipment to the next person on the Duty Person roster when the one month assignment ends. The equipment information is as follows:

Duty Person Number | To just leave phone number to call back, dial: | To leave an 80 character message, call and give PIN #

1 | |

2 | |

To reach by cellular phone:

1

2

Preparation Procedures

Upon receipt of the equipment, read the directions for the equipment and familiarize yourself with the pager and the phone. Ensure that phone batteries are charged properly (see instructions). Note: the pager takes one AAA battery, which lasts about a month.

Call the other duty person to ensure the phone is operable. Send a page to your own unit to ensure it is also functioning correctly.

At the end of your assignment, pass the equipment and documentation to the next person on the duty roster. Notify the BCMT coordinators, and by e-mail that the duty has been transferred. If an individual cannot serve for a temporary period (e.g., going to a conference), it is their responsibility to provide a trained alternate as their replacement. The BCMT Coordinators and the other person on duty are to be notified in advance about the replacement.

If there is a need to contact all the people on the Duty Roster send e-mail to:

, an Athena mail list maintained by the Information Security Officer for this purpose.

GUIDE TO BCMT ACTIVATION

  1. The first indication of a problem will probably be a page alert from Physical Plant Operations. This will be a short text message outlining the problem. Unless it’s obvious that the problem is long term and severe, wait 30 minutes (for things in the Operations Center to quiet down) and call them at . Tell them you’re calling for the BCMT and get the latest status about the problem reported by the page.
  2. Does the problem prevent normal access, occupation or usage of any part of any of the areas listed under the FARM Team Contact List, or does the disaster disrupt service provided by telephones, the network, or the mainframe computers?

If no, go back to sleep!

If yes, continue.

  3. Will expected recovery of the affected area last into normal business hours?

If no, go back to sleep!

If yes, continue.

  4. Does the FARM Team Coordinator of the affected service indicate that the disaster will affect that service? The FARM Team Contact List below provides the phone numbers of the FARM Team coordinators and the buildings their functions operate in.

If no, go back to sleep!

If yes, continue.

  5. ACTIVATE THE BCMT!

Call the coordinators first:

If they can’t be reached, call the BCMT members directly. The numbers are on the list attached. The BCMT has three possible assembly points:

If the problem is related, meet in the meeting room.

If related, meet in the Conference Room

All other problems, meet in the Emergency Operations Center
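
The activation guide above amounts to three yes/no questions asked in order, where any "no" ends the procedure. A minimal Python sketch of that decision flow follows, assuming the Duty Person has already gathered the answers from Physical Plant Operations and the FARM Team Coordinator. The function and parameter names are illustrative and do not appear in the plan.

```python
def should_activate_bcmt(disrupts_area_or_service: bool,
                         recovery_extends_into_business_hours: bool,
                         farm_coordinator_confirms_impact: bool) -> bool:
    """Return True only if every question in the activation guide is answered yes.

    Each argument corresponds to one yes/no question in the guide, in order;
    a single "no" ends the procedure without activating the BCMT.
    """
    checks = (
        disrupts_area_or_service,              # first question
        recovery_extends_into_business_hours,  # second question
        farm_coordinator_confirms_impact,      # third question
    )
    return all(checks)


# Example: the problem disrupts service but will be fixed before business hours.
print(should_activate_bcmt(True, False, True))  # False
print(should_activate_bcmt(True, True, True))   # True
```

Writing the checks as a single ordered tuple mirrors the guide: the outcome depends only on whether all three answers are yes.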

Business Continuity Management Team Duty Roster

Name MIT Home From To Pager No

Phone ID

1

24

2

10

FARM Team Contact List

# Area(s) FARM Team Contact Ext. Home E-mail Phone

10

Business Continuity Management Team

BCMT Contact Office Ext. Home Phone E-mail # BCMT 04 Coordinator BCMT 05 Coordinator Physical 02 Plant Campus 03 Police Operations 40 Center Supervisor Emergency 41 Response

Team 42 Safety 43 Office Safety 44 Office DCNS 45 DCNS 11 CAO 46 I/S O & S 06

Telecomm 47 14 MIT News 48 Office 49 Insurance 50 Physical 51 Plant

Cellular Phone Memory Assignments

# Contact Phone

00

01

02

 

Slide scripts

CIS527 Week #8_ P1 IT Risk Management Mitigating Risk with a Business Impact Analysis

Slide #

Slide Title

Slide Narration

Slide 1

Introduction

Welcome to IT Risk Management.

In this lesson we will discuss Mitigating Risk with a Business Impact Analysis

Next slide

Slide 2

Topics

The following topics will be covered in this lesson:

What a business impact analysis is

What the scope of a business impact analysis is

What the objectives of a business impact analysis are

What the steps of a business impact analysis are

What mission-critical business functions and processes are

How business functions and processes map to IT systems; and

What best practices for performing a business impact analysis are

Next slide

Slide 3

What is a Business Impact Analysis?

A business impact analysis (BIA) is a study used to identify the impact that can result from disruptions in a business.

A BIA focuses on the failure of one or more critical IT functions. A BIA helps identify the systems critical to the survival of an organization. Survivability is the ability of a company to survive loss due to a risk. Some losses are so severe that they can cause the business to fail if they aren’t managed.

Several terms relevant to BIAs include:

Maximum Acceptable Outage (MAO): The MAO identifies the maximum acceptable downtime for a system.

Critical Business Functions (CBFs): Any functions considered vital to an organization.

Critical Success Factors (CSFs): Any elements necessary to perform the mission of an organization.

So, what is a critical IT function? Any stakeholder can determine that a business function is critical. If the stakeholder determines that the loss of the function will cause an unacceptable loss, it is a critical function.

When a function is designated as critical, the stakeholder needs to dedicate resources to protect it such as money and personnel.

Additionally, a law could dictate that a function be considered critical. An example is the Health Insurance Portability and Accountability Act (HIPAA). HIPAA mandates the protection of health-related information. Access controls and other protection measures could be considered critical to ensure HIPAA compliance.

Next slide

Slide 4

What is a Business Impact Analysis? (continued)

Because the BIA is a data-gathering process, the different methods used to gather the data should be considered. There are multiple methods available.

Interviews can be conducted with key personnel.

The planned interviews should include questions that focus on CBFs and the MAO of supporting resources. Another method is to use questionnaires, forms, or surveys, which should focus on one process at a time. These can be paper-based or computer-based.

There is usually more than one data collection method used. Some people may have a lot of information, and an interview may be appropriate. But many people are bogged down in daily meetings and other organizational activities that require a great deal of their time, so in these cases, questionnaires or surveys, which generally take less time, may be more effective for gathering data.

Next slide

Slide 5

Defining the Scope of Your Business Impact Analysis

As with any project, it is important to define the scope of a BIA early in the process. The scope defines the boundaries of the plan. Defining the scope helps ensure that the BIA is focused and ensures that the correct functions are analyzed. The scope is affected by the size of the organization.

An example is an organization that sells products through a Web site. The scope of the BIA would cover all functions of the Web site, including the functions that support a customer’s visit and purchase, along with the functions that support the shipment of the product.

Next slide

Slide 6

Objectives of a Business Impact Analysis

The overall objective of the BIA is to identify the impact of outages. Specifically, the goal is to identify the critical functions that can affect the organization. After identifying those functions, the critical resources that support these functions need to be identified.

Each resource has an MAO and an impact if it fails. The ultimate goal is to identify the recovery requirements. The steps in the process are to identify the owners and experts, business functions, critical resources, MAO and impact, and recovery requirements.

An indirect objective of the BIA is to justify funding. After the recovery requirements are identified in the BIA, the BCP will identify controls. If the impact is high, it is cost effective to spend money to prevent the outage.

Next slide

Slide 7

Objectives of a Business Impact Analysis (continued)

Unless a person owns a particular process, it is not always apparent what the critical functions are. For example, if a person is a security expert, he or she may not know the critical functions of a Web site. The Web server is the obvious component, but there are others. By interviewing or surveying the experts, insight into all the components that support the Web server can be gained. It is often worthwhile to use this data to identify the specific steps for the process, which might include:

The customer visits the Web site.

The customer browses the product catalog.

The customer selects a product.

The customer checks out.

A message is sent to the order processing application, and

The order is processed

In this example, the critical business functions are:

The customer accessing the Web site;

The Web server accessing the database server; and

The order-processing application receiving and processing the order.

With this information, the critical resources can be identified.

Next slide

Slide 8

Objectives of a Business Impact Analysis (continued)

The critical resources are those that are required to support the CBFs. Once the CBFs are identified, they can be analyzed to determine the critical resources for each.

Following the example of the Web site, identification of the critical resources from the CBFs can be made. One of the CBFs identified earlier was the customer accessing the Web site.

The following IT resources are required to support this function:

Internet access,

Web server,

Web application,

Network connectivity, and

Firewall on the Internet side of the DMZ.

The second CBF is the Web server’s ability to access the database server. The database server hosts product information and customer information. The customer information is used when a customer makes a purchase and to target advertising for the returning customer. The following IT resources are required to support this function:

Web server,

Web application,

Database server,

Network connectivity, and

Firewall on the Internet.

The third critical function is the order processing application, which needs to receive orders from the database server and be able to track each order until delivery.

The following IT resources are required to support this function:

Server hosting the order processing application,

Database server,

Warehouse application,

Network connectivity, and

Internet access.
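
One way to work with the three resource lists above is to record them as a mapping from each CBF to its supporting resources and then look for resources that several CBFs share. The following is a minimal sketch; the names `CBF_RESOURCES`, `cbf_count_per_resource`, and `shared_by_all` are assumptions made for illustration, and the resource labels are copied from the lists above.

```python
from collections import Counter

# CBF -> supporting IT resources, as listed in the narration above.
CBF_RESOURCES = {
    "Customer accesses the Web site": {
        "Internet access", "Web server", "Web application",
        "Network connectivity", "Firewall on the Internet side of the DMZ",
    },
    "Web server accesses the database server": {
        "Web server", "Web application", "Database server",
        "Network connectivity", "Firewall on the Internet",
    },
    "Order processing application receives and processes the order": {
        "Server hosting the order processing application", "Database server",
        "Warehouse application", "Network connectivity", "Internet access",
    },
}


def cbf_count_per_resource(mapping):
    """Count how many CBFs depend on each resource."""
    return Counter(res for resources in mapping.values() for res in resources)


def shared_by_all(mapping):
    """Return the resources that every CBF depends on."""
    return set.intersection(*mapping.values())
```

A resource that appears under every CBF (here, network connectivity) is a natural candidate for close attention, because losing it would affect all three functions at once.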

Next slide

Slide 9

Objectives of a Business Impact Analysis (continued)

Once the critical business functions have been identified along with the IT resources that support the CBFs, attention is turned to the MAO and impact. The maximum acceptable outage (MAO) is sometimes referred to as the maximum tolerable period of disruption (MTPD).

The MAO helps determine which CBFs need to be recovered and restarted as soon as possible after a disaster, and identifies the specific resources needed to restart each CBF.

The impact on the business is monetary, but it doesn’t need to be expressed as money. Instead, the impact is often expressed as a relative value such as High, Medium, and Low or can be expressed as a number such as 1 through 4. Once the impact level is identified, it can be matched with an MAO.

When calculating the MAO for an organization, it is important to consider both direct and indirect costs.

The direct costs are usually easier to calculate because some of these costs are readily apparent.

The following list shows some of the direct costs:

Loss of immediate sales and cash flow. This is the most obvious loss.

Equipment replacement costs. If equipment is damaged, it will need to be repaired or replaced.

Building replacement costs. If a building is lost due to a fire or natural disaster, it will need to be rebuilt or replaced.

Penalty costs for late delivery. Service level agreements (SLAs) specify expected levels of service.

Penalty costs for noncompliance issues. Some laws impose penalty costs for noncompliance.

Costs to re-create or recover data. Data lost during an outage needs to be re-created or restored.

Salaries paid to staff who are idled due to outage. If an outage prevents normal work, workers will still be on the clock. In other words, you’ll be paying workers to perform jobs they can’t perform.
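
To make the direct costs above more concrete, here is a small, hedged sketch that adds up two of them (lost sales and salaries paid to idled staff) plus a lump sum for the remaining items. The function name, its parameters, and any figures passed to it are hypothetical; a real BIA would use the organization's own numbers.

```python
def direct_outage_cost(outage_hours, revenue_per_hour, idle_staff,
                       hourly_wage, other_direct_costs=0.0):
    """Estimate the direct cost of an outage.

    other_direct_costs is a lump sum for items such as equipment
    replacement, SLA penalties, or data re-creation.
    """
    lost_sales = outage_hours * revenue_per_hour
    idle_salaries = outage_hours * idle_staff * hourly_wage
    return lost_sales + idle_salaries + other_direct_costs
```

For example, a 4-hour outage with 1,000 in hourly revenue, 10 idled workers paid 25 per hour, and 500 in other costs comes to 4,000 + 1,000 + 500 = 5,500 under these assumptions.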

Next slide

Slide 10

Objectives of a Business Impact Analysis (continued)

It is a little harder to identify indirect costs. However, their value also affects the impact value. The following list shows some of the indirect costs to consider:

Loss of customers. Customers who can’t purchase from you may purchase from the competitor.

Loss of public goodwill. The outage may cause your organization to look less desirable.

Costs to regain market share. When customers and goodwill are lost, the company loses market share.

Costs to regain positive brand image. If the company’s brand is tarnished, steps need to be taken to repair it. It takes a lot of advertising money to repair a tarnished reputation.

Loss of credit or higher costs for credit. When an outage affects a company’s cash flow, it can also affect the company’s credit rating.

Lost opportunities during recovery. While your organization is dealing with the outage, resources are occupied.

Next slide

Slide 11

Check Your Understanding

Slide 12

Objectives of a Business Impact Analysis (continued)

The recovery requirements show the time frame in which systems must be recoverable and identify the data that must be recovered. An example would be that it may be acceptable for some data that is not critical in nature to be lost while other data loss is not acceptable.

There are two primary terms related to the recovery requirements: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). Although the RTO applies to any systems or functions, the RPO applies to data only. More specifically, the RPO addresses data housed in databases.

The RTO is the time in which the system or function must be recovered and would be equal to or less than the MAO. Another way of thinking of RTO and RPO is as time critical and mission critical. The RTO identifies the time when the system is restored. The RPO identifies data that is mission critical. Some processes must be delivered in a timely manner, requiring a short RTO, while other processes can be delayed, as long as all of the data is recovered.

Also to be considered is that other databases may not change that much and their changes may be manually reproduced. If there aren’t many changes and they can easily be reproduced, more data loss can be accepted. For example, a database that is manually updated about five times a week has updates with a paper trail that displays what needs to be reproduced. Because the updates have a paper trail, the database can be reproduced and then the updates reproduced successfully.
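
The relationships described on this slide can be expressed as two simple checks. This is only a sketch under stated assumptions: the function names are hypothetical, and the RPO is treated here as a time window of acceptable data loss, which is a common convention but is not spelled out in the narration.

```python
from datetime import timedelta


def rto_is_valid(rto: timedelta, mao: timedelta) -> bool:
    """The RTO must be equal to or less than the MAO."""
    return rto <= mao


def backup_schedule_meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss equals the time between backups,
    so the interval must not exceed the RPO (assumed to be a time window)."""
    return backup_interval <= rpo
```

Under this reading, the manually updated database in the example above could tolerate a long backup interval, because its few changes can be reproduced from the paper trail.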

Next slide

Slide 13

The Steps of a Business Impact Analysis Process

The majority of the work of a BIA is gathering the data that surrounds the critical business functions within the scope of the BIA. Once the data is gathered, an analysis is conducted. The end stage is the publication of the BIA report. Some organizations may include recommendations to meet recovery times, although that is not technically part of a BIA.

The overall steps of a BIA include:

Identify the environment,

Identify stakeholders,

Identify critical business functions,

Identify maximum downtime,

Identify critical resources,

Identify recovery priorities, and

Develop BIA report

The most important point is that the goal of the BIA is to identify the critical resources and recovery priorities.

Next slide

Slide 14

The Steps of a Business Impact Analysis Process (continued)

The first step identifies the overall IT environment. A good understanding of the business function, including the number of customers and the number of transactions, is needed. If sales revenues are generated, the sales amounts should also be known because the sales revenue translates to lost sales during an outage.

It is possible to perform a BIA on a critical business function that doesn’t generate sales revenue. For example, email is a critical business function for many organizations. An email system may serve 5,000 employees and could pass tens of thousands of emails daily. Even though it doesn’t generate any direct sales revenue, the email system may be considered critical.

Stakeholders also need to be identified. Stakeholders are those individuals or groups that have a direct stake or interest in the success of a project. For example, a vice president of sales would have a direct stake in the success of sales. A stakeholder can also help ensure that there are adequate resources available.

The critical functions are those that will have a direct impact on the profitability or survivability of an organization. Some BIAs are designed to focus on a critical function from the beginning.

Critical resources are the resources needed to support the critical systems and the critical system processes. These resources could include hardware, such as servers or routers, as well as software, such as the operating system and applications.

When identifying critical resources, it is important to include the supporting infrastructure.

The maximum downtime is the maximum acceptable outage (MAO). Once the critical resources are identified, the MAO can be identified for each of them.

In addition to identifying a MAO, an impact statement should be included. The impact statement identifies the effect of the loss and can be stated as the impact directly by identifying what cannot be done in case of a loss. This impact can also be stated in monetary terms.

This part of the BIA identifies the most important critical systems, and the least important critical systems. The highest priorities are assigned based on the shortest MAOs.
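
Assigning the highest priority to the shortest MAO amounts to sorting the critical systems by their MAO. A minimal sketch follows; the function name and the system names and hours used in the usage note are made up for illustration.

```python
def recovery_priorities(mao_hours):
    """Rank systems so that the shortest MAO gets priority 1.

    mao_hours maps a system name to its MAO in hours.
    Systems with equal MAOs keep their original (insertion) order.
    """
    ordered = sorted(mao_hours, key=mao_hours.get)
    return [(rank, name) for rank, name in enumerate(ordered, start=1)]
```

Calling `recovery_priorities({"Email": 24, "Web server": 2, "Database server": 4})` ranks the Web server first, the database server second, and email third.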

Next slide

Slide 15

The Steps of a Business Impact Analysis Process (continued)

The BIA report compiles all of the data collected. NIST SP 800-34 includes a template that can be used as a guide for the BIA.

The template includes the following sections:

Preliminary system information,

Systems points of contact,

System resources,

Critical roles,

Table linking critical roles to critical resources,

Table identifying resources, outage impact, and acceptable outage time, and

Table identifying recovery priority of key resources.

Next slide

Slide 16

Identifying Mission-Critical Business Functions and Processes

An important step in the BIA is identifying the mission-critical business functions and processes. One of the most important points in this analysis is that the experts have the key information. Different data collection methods will need to be used to get this information.

Mission-critical business functions are any functions that are considered vital to an organization and are derived from critical success factors, or CSFs. CSFs are any elements necessary to perform the mission. CSFs are a limited number of areas where successful results will ensure success for the organization.

Processes are usually the underlying actions that contribute to the CSFs. Certain processes result in successful CSFs, and successful CSFs result in successful CBFs. Consider a company that generates the majority of revenue from online sales. Sales from the Web site are a CBF, but identification of the underlying factors and actions needed to sell products needs to be made.

For a company that sells widgets online, some of the underlying CSFs could be:

Best widgets available,

Motivated employees,

Customer satisfaction, and

Effective advertising.

Different processes support each of these CSFs. For example, some of the processes that support customer satisfaction include:

Satisfying buying experience,

Competitive pricing, and

On-time delivery.

Many companies document these processes with work flows. If work flows exist, they can easily be used to determine the steps in the processes.

Next slide

Slide 17

Mapping Business Functions and Processes to IT Systems

Once critical business functions and processes have been identified, they need to be mapped to the IT systems. After the mapping, the determination can be made as to what the recovery options are.

In the example used for shipment of products, there were three primary systems. First, employees accessed the warehouse application. If it failed, they couldn’t identify the products to ship. Second, the warehouse application accessed a database. If the database server failed, the same problem occurred: the employees couldn’t identify the products to ship. Last, they needed a link between the Web server accepting the orders and the database server. If this failed, new orders could not be shipped.

Identification of the priority of these systems can be made by using the same scale as presented for matrix-type priorities. A priority of one is the highest priority. A priority of five is the lowest priority. If the database server or the warehouse application servers are down, shipments can’t occur. However, some delay in shipments is acceptable. If the connection with the Web server is broken, new orders aren’t passed to the warehouse. However, the warehouse workers can still process existing orders.
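
The mapping in the shipment example can be written down as a table of which business function depends on which system, and then queried when a system fails. The sketch below is a simplification that follows only the dependencies stated in the narration; the dictionary and function names are hypothetical.

```python
# Business function -> IT systems it depends on, per the shipment example.
FUNCTION_DEPENDENCIES = {
    "Identify products to ship": {"Warehouse application", "Database server"},
    "Pass new orders to the warehouse": {"Web server to database server link"},
}


def affected_functions(failed_system, dependencies):
    """Return the business functions that stop when failed_system is down."""
    return sorted(
        function
        for function, systems in dependencies.items()
        if failed_system in systems
    )
```

If the link to the Web server fails, only the flow of new orders is affected, which matches the observation that warehouse workers can still process existing orders.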

Next slide

Slide 18

Best Practices for Performing a BIA for Your Organization

When performing BIAs, several different best practices can be used, which include:

Start with clear objectives,

Don’t lose sight of the objectives,

Use a top-down approach,

Vary data collection methods,

Plan interviews and meetings in advance,

Don’t look for the quick solution,

Consider the BIA as a project, and

Consider the use of tools

Next slide

Slide 19

Check Your Understanding

Slide 20

Summary

We have reached the end of this lesson. Let’s take a look at what we’ve covered.

First we considered what a business impact analysis is. The BIA is a valuable tool that can help identify critical systems and resources.

Next we discussed what the scope of a business impact analysis is, which is essentially determining the clear objectives using a top-down approach.

There are many steps in the business impact analysis process that should be followed and applied in order for the data collection to be adequate and accurate. We outlined these steps next.

We then considered what the mission-critical business functions and processes are, as well as how the business functions and processes map to IT systems.

Lastly, we reviewed several best practices for performing a business impact analysis for an organization.

This completes this lesson.

CIS527 Week #8_ P2 IT Risk Management Mitigating Risk with a Business Continuity Plan

Slide #

Slide Title

Slide Narration

Slide 1

Introduction

Welcome to IT Risk Management.

In this lesson we will discuss Mitigating Risk with a Business Continuity Plan.

Next slide

Slide 2

Topics

The following topics will be covered in this lesson:

Business Continuity Plan (BCP);

Elements of a BCP;

Using BCP to mitigate an organization’s risk; and

Best practices for implementing a BCP.

Next slide

Slide 3

What is a Business Continuity Plan?

A Business Continuity Plan (BCP) is a plan that is designed to aid an organization in continuing to operate during and after a disruption. The disruption can be manmade or a natural disaster.

The goal of the BCP is a continuation of operations. BCPs can address any type of disruption or disaster. For example, organizations that operate by a southern coast plan for hurricanes. Businesses in the heartland’s “tornado alley” plan for tornadoes. Californians plan for earthquakes, while every organization plans for fires.

The scope of the BCP includes a global view of the organization as well as the IT systems, facilities, and personnel.

The BCP examines all of the elements that can impact an organization and identifies the elements that are mission-critical and need to continue to operate. Non-mission-critical elements that do not need to continue are not addressed by the BCP.

Next slide

Slide 4

What is a Business Continuity Plan? (continued)

A Business Impact Analysis (BIA) is included as part of a BCP.

The BIA has several key objectives that directly support the BCP, which include:

Identify critical business functions (CBFs);

Identify critical processes supporting the CBFs;

Identify critical IT services supporting the CBFs, including any dependencies; and

Determine acceptable downtimes for CBFs, processes, and IT services.

The BCP also includes disaster recovery plans that help the organization restore IT services after a disaster. Any organization can create a BCP using procedures that match its needs. The overall steps of a BCP include:

Charter the BCP and create BCP scope statements;

Complete business impact analysis (BIA);

Identify countermeasures and controls;

Develop individual disaster recovery plans (DRPs);

Provide training;

Test and exercise plans; and

Maintain and update plans

Next slide

Slide 5

Elements of a BCP

BCPs are large, comprehensive documents that include many elements and cover many contingencies. There is not a single format that will cover all requirements for all organizations. There are some guides that suggest the inclusion of certain elements which include:

Purpose;

Scope;

Assumptions and planning principles;

System description and architecture;

Responsibilities;

Notification/activation phase;

Recovery phase;

Reconstitution phase;

Planning, training, testing, and exercises; and

Plan maintenance

Next slide

Slide 6

Elements of a BCP (continued)

The purpose of the BCP is to ensure that mission-critical elements of an organization keep operating after a disruption. The BCP is implemented when a disruption occurs or is imminent. The BCP remains in place until the restoration of normal operations.

Only critical business functions are maintained during the disruption. The BIA identifies the CBFs and their priorities. The BCP ensures that all the elements are in place to maintain the CBFs.

The BIA also includes acceptable outage times. Some CBFs may need to be kept operational with minimal outage. Other CBFs may have lower priorities. Depending on the recovery time objectives identified in the BIA, the lower priority CBFs may be down for hours or even days.

The scope statement includes several key items which can include elements such as the location, systems, employees, and vendors. Only the critical systems identified in the BIA should be included. Although a BCP will take a global view of the organization, it does not have to cover the entire organization.

Next slide

Slide 7

Elements of a BCP (continued)

Every BCP needs to include basic assumptions and planning principles. These are very helpful in the initial development of the BCP and are used in the implementation phases.

Categories of the basic assumptions and planning principles include the incidents that the plan is meant to address, as well as strategy, priorities, and required support.

A key planning principle is the length of time expected to continue operations under the BCP before returning to normal operations. For example, in considering a hurricane, the company could plan on continuing operations under the BCP for seven days following a hurricane.

Many BCPs identify specific incidents that are included and excluded, meaning that the BCP may be designed to address specific disruptions due to hurricanes or earthquakes. The BCP may also be designed to address generic incidents, such as power loss from any cause.

The strategy of the BCP identifies some of the key elements of the plan that may include location, notification, transportation, and more. If the organization is in a single location, the strategy is to address this single location, but if the organization is in several locations, strategy for each location will need to be identified.

Next slide

Slide 8

Elements of a BCP (continued)

The BIA identifies critical business functions, critical resources, and their priorities. The BCP ensures that efforts focus on restoring the top-priority systems first and that the most resources are dedicated to them.

The BCP requires support during every stage. To begin with, the BCP requires management support. If management does not support it, neither the required input from personnel nor the required funding can be obtained. Without support from top-level management, the BCP stands a high risk of failing.

The BCP identifies critical business functions that need to remain operational during the disruption. Each CBF has individual systems that support it. Documentation of current system descriptions needs to be detailed enough to identify the critical system and the supporting architecture. If the documentation isn’t available, or is out of date, maintaining and recovering the CBFs becomes much more difficult.

While documenting the CBF systems for the BCP, it is also essential to note the elements that need to be addressed in the recovery plan. For example, documentation may show that a system must maintain connectivity via a Wide Area Network (WAN) link to stay operational. If the plan doesn’t include an alternative, this WAN link becomes a single point of failure.
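
A simple way to spot single points of failure while documenting a system is to list each required component together with its documented alternatives and flag any component that has none. The following is a minimal sketch; the function name and the component names in the usage note are hypothetical.

```python
def single_points_of_failure(components):
    """Return the required components that have no documented alternative.

    components maps a component name to a list of its alternatives.
    """
    return sorted(
        name for name, alternatives in components.items() if not alternatives
    )
```

For instance, given `{"WAN link": [], "Database server": ["Failover node"]}`, only the WAN link is flagged, which mirrors the example above.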

Next slide

Slide 9

Elements of a BCP (continued)

This next section provides an overview description of a CBF from the big-picture view. The description of a critical database hosted at the headquarters of an organization is considered for analysis:

The headquarters hosts the Sales database on a database server. This database is critical to these different business functions:

Management at headquarters uses this database to identify and track sales throughout the company.

Ordering and production personnel use this database to order and track products shipped to stores.

Employees at any store query the database to determine if an item is in stock locally or at another store. The database hosts inventories within each store.

Sales at each store are recorded on the store’s local database. Each store database synchronizes with the headquarters database server once an hour.

With this description, each location has a database. This database is synchronized with the database at headquarters. It does not provide details, but does provide enough information for anyone to understand the big picture.

Next slide

Slide 10

Elements of a BCP (continued)

The functional description provides more details of the systems and builds upon the overview.

Many systems interact with other critical systems, so it is valuable to include figures whenever possible. Diagrams are often included in the BCP to help clarify the descriptions provided.

The description provided gives more details of the architecture such as the store names and locations and details on the WAN links. If there were redundant WAN links, those would be described also.

Details on the headquarters’ server are also important. This description includes the server name, operating system, and database application used. For example, the server could be running Windows Server 2008 with SQL Server 2008.

If the server includes any fault-tolerance capabilities, they would be included here. For example, a two-node failover cluster allows one server to fail without affecting the services provided by the database. Servers may also include redundant array of inexpensive disks (RAID) configurations. With RAID, drives can fail but the system will continue to operate.

Next slide

Slide 11

Check Your Understanding

Slide 12

Elements of a BCP (continued)

The BCP should list all the critical components for the system. This data needs to be included because it makes it clear which components are needed for the CBF and it provides a list that can be used to restore the system from scratch.

This list includes any equipment, such as servers, switches, and routers. Because the servers may need to be rebuilt from scratch, the BCP should list the operating system and any applications needed to support the system. If an image is used to rebuild servers, the image will list the version number.

Data can include a database hosted on the system along with any files such as documents or spreadsheets. The list can also include any needed supplies, such as printer paper and toner. For some systems, the list can include technical supplies, such as special oils for machinery or tools needed for maintenance. Whenever possible, the location of these items should be included as well.

Next slide

Slide 13

Elements of a BCP (continued)

Required connectivity with other systems is an important element to document in the BCP. Connectivity can be from the internal network and via the Internet as well as via dedicated WAN lines. Communication can also be via simple phone lines.

External connections often use lines from telecommunications companies. Internet Service Providers (ISPs) often provide more than just access to the Internet. They may lease lines used for WANs and virtual private networks (VPNs).

Any required communication links should be documented. For example, if a database receives updates from other databases using VPN lines, those should be included.

Responsibilities also need to be laid out within a BCP. Assigning responsibilities makes it clear who is accountable for each task. When a task is incomplete or behind schedule, it is easier to get it back on track when the responsibilities have been clearly defined and documented.

Employees in the organization will fill specific roles in a BCP, such as the program manager, BCP coordinator, BCP team leads, and BCP team members.

Next slide

Slide 14

Elements of a BCP (continued)

A BCP program manager (PM) usually manages multiple BCP projects within a large organization. For example, a large organization could have multiple locations and BCPs for each location. A BCP coordinator manages a BCP. The BCP program manager ensures that each BCP is progressing as expected.

The BCP coordinator is in charge of a specific BCP. This individual can have two roles, depending on the stage of the BCP:

Before the BCP is completed and activated, this person is responsible for developing and completing it.

When the BCP is completed and activated, the BCP coordinator is responsible for declaring the emergency and activating the BCP.

A BCP cannot be planned, implemented, and executed by a single person. Instead, teams are put together to help the process. There are several possible teams that can be put in place depending on the type of situation, which include the:

Emergency Management Team (EMT);

Damage Assessment Team; and

Technical Recovery Team

The BCP may identify additional personnel who have other responsibilities. These personnel would vary from one organization to another and may include:

Critical vendors;

Critical contractors; and

Telecommuters

Next slide

Slide 15

Elements of a BCP (continued)

The order of succession, which determines who is contacted and in what order when key personnel are unavailable, also comes into play. In some disasters, the key personnel may not be available. For example, the chief executive officer (CEO) may want to be informed by the BCP coordinator prior to activating the BCP, but the CEO may be on vacation in another country. The coordinator has to know whom to contact in this scenario.

The BCP would include an order of succession to address these types of situations. The chain of command could have the order of succession as follows:

CEO;

Chief information officer (CIO);

Vice presidents (VPs) in the following order: service delivery, sales, marketing; and

Department directors in the following order: service delivery, sales, marketing

If the CEO or the CIO were on site, he or she would be contacted first. If the CEO or CIO was not there, the VP of Service Delivery would be contacted, and so on. The delegation of authority also needs to be factored in and identified.
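The succession rule described above can be sketched as a simple ordered lookup. This is a minimal illustration, not a prescribed procedure: the role titles are one reading of the example chain of command, and the function name is hypothetical.

```python
# Order of succession from the example above, highest authority first.
# The exact role titles are assumptions made for illustration.
ORDER_OF_SUCCESSION = [
    "CEO",
    "CIO",
    "VP of Service Delivery",
    "VP of Sales",
    "VP of Marketing",
    "Director of Service Delivery",
    "Director of Sales",
    "Director of Marketing",
]


def first_available(available, order=ORDER_OF_SUCCESSION):
    """Return the highest-ranking role that can currently be reached, or None."""
    for role in order:
        if role in available:
            return role
    return None
```

For example, if only the two vice presidents of sales and service delivery can be reached, `first_available({"VP of Sales", "VP of Service Delivery"})` returns the VP of Service Delivery, because that role appears earlier in the list.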

Next slide

Slide 16

Elements of a BCP (continued)

The BCP coordinator declares the notification/activation phase, which is the point when the disruption has occurred or is imminent. Comparing hurricanes and earthquakes shows how this phase can differ depending on the disruption.

Weather forecasters are able to give warnings several days in advance for many hurricanes. Although the forecasts aren’t 100 percent accurate, they do provide advance warning for an organization to prepare in case it does hit. The BCP can be written so that different steps are taken at different stages.

An earthquake, by contrast, strikes with little or no warning, so the BCP for an earthquake will have a much different notification/activation phase than that for a hurricane. The BCP coordinator will still activate the BCP to ensure that everyone is notified.

Notification procedures can vary from one organization to another, but the most important step is to ensure that the BCP coordinator is notified of any disruption or disaster covered by the BCP.

The Damage Assessment Team (DAT) is responsible for assessing the damage and reporting the damage to the BCP coordinator. The team’s primary goal is to identify the extent of the damage as quickly as possible.

Although the BCP coordinator is responsible for activating the BCP, the decision must meet defined criteria. The BCP coordinator doesn’t just make the decision based on a hunch.

The following items are valid reasons to activate the BCP:

Safety of personnel;

Damage to the building affecting critical business functions;

Loss of operations of one or more critical business functions; and

Specific criteria identified in the BCP, such as a hurricane warning or an earthquake.
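The activation criteria listed above could be checked against a damage report along these lines. This is only a sketch: the `DamageReport` fields and function names are hypothetical, and a real plan would define its own criteria and thresholds.

```python
from dataclasses import dataclass


@dataclass
class DamageReport:
    """Hypothetical summary the DAT might hand to the BCP coordinator."""
    personnel_at_risk: bool = False
    building_damage_affects_cbf: bool = False
    cbfs_down: int = 0                   # critical business functions not operating
    plan_specific_trigger: bool = False  # e.g. a hurricane warning or an earthquake


def activation_reasons(report):
    """List which of the four activation criteria the report satisfies."""
    reasons = []
    if report.personnel_at_risk:
        reasons.append("safety of personnel")
    if report.building_damage_affects_cbf:
        reasons.append("building damage affecting critical business functions")
    if report.cbfs_down >= 1:
        reasons.append("loss of one or more critical business functions")
    if report.plan_specific_trigger:
        reasons.append("specific criteria identified in the BCP")
    return reasons


def should_activate(report):
    """Activation is justified when at least one criterion is met."""
    return bool(activation_reasons(report))
```

Returning the list of reasons, rather than only a yes or no, gives the coordinator a record of why the BCP was activated.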

Alternate assessment procedures may be called for because in some instances, the DAT may not be able to assess the damage directly. If necessary, the team can do an indirect assessment based on the available information.

Many organizations use a notification roster. This form identifies the name and contact information of appropriate personnel, and can be used in many different ways. The primary purpose is to contact personnel when necessary.
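A notification roster can be as simple as a list of names, teams, and contact details. The sketch below shows one possible layout. The people, phone numbers, and e-mail addresses are invented placeholders, and the field names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RosterEntry:
    """One row of a notification roster."""
    name: str
    team: str   # "EMT", "DAT", or "TRT"
    phone: str
    email: str


# Hypothetical people and contact details, for illustration only.
ROSTER = [
    RosterEntry("A. Rivera", "EMT", "555-0101", "a.rivera@example.com"),
    RosterEntry("B. Chen", "DAT", "555-0102", "b.chen@example.com"),
    RosterEntry("C. Okafor", "TRT", "555-0103", "c.okafor@example.com"),
    RosterEntry("D. Patel", "TRT", "555-0104", "d.patel@example.com"),
]


def contacts_for(team, roster=ROSTER):
    """Return (name, phone) pairs for everyone on the given team, in roster order."""
    return [(entry.name, entry.phone) for entry in roster if entry.team == team]
```

Filtering by team is one of the "many different ways" such a roster can be used, for instance to call out only the Technical Recovery Team.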

Next slide

Slide 17

Elements of a BCP (continued)

The recovery phase is the next step after the activation. In this phase, the Technical Recovery Team (TRT) members go to work with several goals, which include:

Restore temporary operations to critical systems;

Repair damage done to original systems; and

Recover damage to original systems

Once the TRT has completed its job, the critical operations will be functioning. The TRT focuses only on the critical business functions (CBFs) identified in the business impact analysis (BIA).

The recovery goal is dependent on several factors. The goal could be to recover a portion of the functionality of a CBF or the recovery goal could be much more complete.

The success of the recovery phase is based on the recovery planning done beforehand. As someone once said, “It wasn’t raining when Noah built the Ark.” In other words, it’s too late to plan when the disaster strikes. The plans must be made earlier. Recovery planning often takes the format of a disaster recovery plan (DRP).

The TRT will perform the work to achieve the recovery goals. The DRP guides the work, but it is possible that the work will be in phases, depending on the depth of the recovery. This is especially true when operations have to be relocated to a different location.

The TRT lead will oversee the work done by the TRT. This lead will need to be very familiar with existing DRPs and may even have written them. The TRT performs the recovery work. The extent of its work will depend on the extent of the damage and will also depend on whether or not operations are moved.

Next slide

Slide 18

Elements of a BCP (continued)

Although creating the BCP is a huge step, it is not enough on its own. Personnel must be trained on the plan, and the plan must be tested and exercised.

The overall goals of these steps include:

Training: Teach people details about the BCP.

Testing: Show that the BCP will work as planned.

Exercises: Show how the BCP will work.

The primary people to train for the BCP are the team members, who should have a good understanding of their actual responsibilities when the BCP is activated. The BCP coordinator is responsible for ensuring all personnel are trained. Training sessions should include:

Training for all teams;

EMT training;

DAT training; and

TRT training

Training should be conducted at least annually, and if the BCP or systems change, training will need to be done more often.

Next slide

Slide 19

Elements of a BCP (continued)

BCP testing should be completed at least annually. The goal of the testing is to show that the steps within the BCP are achievable.

Testing may include the following steps:

Test individual steps within each phase of the BCP;

Test all disaster recovery plans; and

Locate and test alternate resources

Testing should reveal any problems or deficiencies with the plan. This includes any problems with the steps, resources, or personnel.

A tabletop exercise brings all the team members together to talk through the process. The members sit around a conference room table, and the BCP coordinator presents a scenario to them. Team members then identify what they would do to respond to the scenario.

A functional exercise evaluates specific functions within the BCP. Consider a situation where the BCP identifies an alternate location for some critical functions. A functional exercise can be performed to restore and recover all the critical resources at the alternate location.

A full-scale exercise is more realistic than either tabletop or functional exercises. This exercise simulates an actual disruption of critical business functions. Team members do not sit around a table discussing what they would do, but instead they take action.

Full-scale exercises require many resources to complete. The primary resource is personnel. However, full-scale exercises provide the most realistic view of how team members will respond to an actual emergency.

Next slide.

Slide 20

The BCP coordinator is responsible for the BCP. This responsibility includes reviewing and updating the BCP. There are several specific reasons to update the BCP, which include:

Changes to the IT infrastructure;

Regular updating, such as annually; and

After testing or exercises

Revisions to the BCP need to be documented so that people can easily tell whether the document has been modified and whether they have the most up-to-date version.

Many organizations use a simple version control page.
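A version control page can be modeled as a short revision history. The sketch below is one possible layout, with hypothetical entries and function names. It also checks a copy against the latest version and flags a plan whose last revision is more than a year old, in line with the annual update cadence listed above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Revision:
    """One row of a version control page."""
    version: str
    revised_on: date
    author: str
    summary: str


# Hypothetical revision history, for illustration only.
HISTORY = [
    Revision("1.0", date(2023, 1, 15), "BCP coordinator", "Initial release"),
    Revision("1.1", date(2023, 9, 4), "BCP coordinator", "Updated after tabletop exercise"),
    Revision("2.0", date(2024, 1, 20), "BCP coordinator", "Annual update; IT infrastructure changes"),
]


def current_version(history=HISTORY):
    """Return the most recently dated revision."""
    return max(history, key=lambda revision: revision.revised_on)


def is_up_to_date(copy_version, history=HISTORY):
    """Tell whether a printed or saved copy matches the latest version."""
    return copy_version == current_version(history).version


def review_overdue(today, history=HISTORY, max_age_days=365):
    """Flag the plan when its last revision is older than the allowed age."""
    return (today - current_version(history).revised_on).days > max_age_days
```

The 365-day limit is an assumption standing in for "at least annually"; an organization could choose a shorter interval.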

The BCP coordinator is responsible for reviewing the BCP at least annually, even if there are no known changes. This review ensures the BCP still addresses and meets all of the organization’s requirements. The review includes the BIA to ensure that critical business functions have not been modified and are still considered critical. It also includes operational and security requirements, along with a review of any of the individual processes, such as recalls, and more technical procedures, such as DRPs.

The review of the BCP should also include information from training, testing, and exercises.

Next slide

Slide 21

How Does a BCP Mitigate an Organization’s Risk?

BCPs mitigate an organization’s risk by ensuring that the organization is better prepared for disasters. If a disaster occurs, the organization meets the disaster with the benefit of forethought and planning. On the other hand, if an organization doesn’t have a BCP, managers must make spur-of-the-moment decisions.

Pilots are often praised for their ability to react coolly in the face of disaster. Pilots train for disasters so many times that they know what needs to be done. Even amidst a crisis, they calmly identify the best steps to take to reduce the impact of disasters. By contrast, a pilot who had never trained for a disaster and suddenly had two jet engines go dead would be tempted to try anything to get things going again.

Similarly, the BCP helps an organization plan and train for disasters. No one wants to see a disaster hit, but if a disaster does arrive, the organization is much better prepared to address it directly if a BCP is in place.

Next slide

Slide 22

Best Practices for Implementing a BCP for Your Organization

When implementing a BCP, you can use several different best practices. The following list shows many of these:

Complete the BIA early;

Exercise caution when returning functionality from alternate locations;

Review and update the BCP regularly;

Test all the individual pieces of the plan; and

Exercise the plan

Next slide

Slide 23

Check Your Understanding

Slide 24

Summary

We have reached the end of this lesson. Let’s take a look at what we’ve covered.

First we considered what a business impact analysis is. The BIA is a valuable tool that can help identify critical systems and resources.

Next, we discussed the scope of a business impact analysis, which essentially means determining clear objectives using a top-down approach.

We then took a more in-depth look at the many elements of the business impact analysis. There are many steps in the process of determining the business impact analysis that should be followed and applied in order for the data collection to be adequate and accurate. We considered what the mission-critical business functions and processes are and how the business functions and processes map to IT systems.

From there, we moved to a brief look into using a BCP to mitigate an organization’s risk. To summarize, a BCP helps an organization train for disasters, ensuring that it is prepared for a multitude of risks.

Lastly, we reviewed the best practices for performing a business impact analysis for an organization.

This completes this lesson.