Across today’s data-driven landscape, a wide range of downtime threats can wreak havoc on data center operations, potentially causing serious business impacts. From natural disasters to cyberattacks to basic human error, the toll can be hefty for organizations that are not properly prepared. Developing effective disaster recovery and business continuity strategies is crucial to weathering the unexpected. Yet despite the prevalence of outages (80% of data center managers and operators have experienced at least one in the past three years, according to the Uptime Institute’s 2022 Data Center Resiliency Survey), many companies make the mistake of not prioritizing a data center disaster recovery plan until it’s too late.
Importance of Disaster Recovery Planning
Because unpredictable events can result in devastating downtime at any moment, organizations need both a data center disaster recovery plan and a business continuity plan. Although people often use the two terms interchangeably, they are distinct, yet equally important, processes. While the goal of disaster recovery planning is to preserve IT infrastructure, uptime and data, a business continuity plan is a broader strategy focused on maintaining regular business functions during a disaster. Disaster recovery planning helps companies maintain technical operations and quickly restore the ability to operate; business continuity concentrates on preparing for these situations cross-departmentally to prevent the loss of data and critical information. To navigate unforeseen disruptions, a successful data center recovery plan will incorporate both approaches.
Undeniably, the cost of data center downtime can be monumental; experts at Gartner assess the price tag at an average of $5,600 per minute. Yet the effects of downtime and the loss of data availability can extend far beyond monetary losses: service disruptions often damage an organization’s brand reputation, erode customer loyalty and leave data center operators vulnerable to expensive Service Level Agreement (SLA) payouts. However, with the proper business continuity and disaster recovery plans in place, organizations are far better able to mitigate the impacts of severe weather events and other types of disasters.
How to Prepare for Potential Disasters
An effective disaster recovery plan focuses on implementing procedures and steps that will maximize uptime and minimize risk when the unexpected occurs. The following measures are key when developing a data center disaster recovery plan:
1. Install and Maintain a Reliable Uninterruptible Power Supply
By supplying an adequate window to safely shut down sensitive equipment, uninterruptible power systems (UPSs) safeguard against downtime, data loss and hardware damage. But protecting equipment during a complete blackout isn’t the only reason you need a UPS. Depending on the topology, these systems also shield connected devices from common power problems and unsafe output voltage fluctuations that can damage electronics, reduce lifespan and negatively impact performance.
Many data center UPSs offer the ability to add extended battery modules (EBMs) that enable critical systems to remain operational for up to several hours, depending on the load and UPS parameters. However, they cannot power a facility indefinitely in the event of a large-scale outage. Organizations that rely on local data to remain up and running should seriously consider adding a dedicated generator onsite.
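To illustrate how load and battery capacity shape runtime, here is a rough back-of-the-envelope sketch. It assumes runtime scales linearly with usable battery energy, which real batteries do not strictly do (discharge rate, age and temperature all matter), so manufacturer runtime charts should be used for actual sizing. The function name, the 90% inverter efficiency and the example figures are illustrative assumptions, not values from any specific UPS.

```python
def estimate_runtime_minutes(battery_wh: float, load_watts: float,
                             inverter_efficiency: float = 0.9) -> float:
    """Rough UPS runtime estimate in minutes.

    Assumption: usable energy = rated battery energy * inverter efficiency,
    and runtime = usable energy / load. This ignores battery aging,
    temperature and the nonlinear effect of high discharge rates.
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    if not 0 < inverter_efficiency <= 1:
        raise ValueError("inverter_efficiency must be in (0, 1]")
    usable_wh = battery_wh * inverter_efficiency
    return usable_wh / load_watts * 60


# Hypothetical example: a 2,000 Wh internal battery feeding a 1,000 W load,
# then the same load with one additional 2,000 Wh extended battery module.
base = estimate_runtime_minutes(battery_wh=2000, load_watts=1000)
with_ebm = estimate_runtime_minutes(battery_wh=4000, load_watts=1000)
print(f"Internal battery only: about {base:.0f} minutes")
print(f"With one EBM added:    about {with_ebm:.0f} minutes")
```

Even under these generous assumptions the runtime is measured in minutes or hours, which is why an extended outage calls for a generator rather than more batteries.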
2. Store Backup Data in a Disaster Recovery Data Center
While some larger organizations operate their own redundant site as a backup data center, the cost to purchase and maintain these types of facilities is very high. Because of this, many companies opt to engage with a colocation provider, which supplies the complex and expensive infrastructure needed for disaster recovery safely and cost-effectively. Also referred to as disaster recovery data centers, these facilities are built with redundancy and resiliency to protect against downtime in the event of any disaster. By delivering the same connectivity options and cloud capabilities that companies are already using, disaster recovery data centers prevent the loss of accessibility and functionality when a disaster redirects IT loads.
3. Take System Inventory and Determine Downtime Tolerance
Another essential task when developing a disaster recovery plan is for an organization to take a system inventory and assess its individual downtime tolerance. This step is imperative not only to understanding potential threats, but also to determining the optimal protection and recovery solutions. In addition, companies should perform a comprehensive risk assessment and analysis, identifying potential threats and vulnerabilities that could impact operations, such as natural disasters, cyberattacks, hardware failures and other unforeseen events.
High availability and redundancy measures are vital components of disaster recovery and business continuity strategies to minimize downtime and ensure continuous service availability. Several online downtime calculators can help an organization home in on its level of downtime risk, including one from Datto.
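The arithmetic behind a downtime calculator is simple enough to sketch. The example below is a minimal illustration, not a reproduction of any vendor's tool: it uses the $5,600-per-minute average cited earlier as a default, and the function names, outage counts and availability target are hypothetical inputs that each organization would replace with its own figures.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_cost(minutes: float, cost_per_minute: float = 5600.0) -> float:
    """Cost of a single outage, given its length and a per-minute cost."""
    return minutes * cost_per_minute


def annual_exposure(outages_per_year: float, avg_minutes: float,
                    cost_per_minute: float = 5600.0) -> float:
    """Expected yearly downtime cost from outage frequency and duration."""
    return outages_per_year * downtime_cost(avg_minutes, cost_per_minute)


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Hypothetical inputs for illustration only.
print(f"One 60-minute outage: ${downtime_cost(60):,.0f}")
print(f"Two 45-minute outages a year: ${annual_exposure(2, 45):,.0f}")
print(f"Downtime budget at 99.9%: {allowed_downtime_minutes(99.9):.1f} min/year")
```

At the cited average, a single hour of downtime works out to $336,000, which gives a sense of how quickly a modest tolerance for outages translates into real exposure.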
4. Identify System Weaknesses and Recovery Objectives
Pinpointing potential areas of weakness within an organization is important when formulating a disaster recovery plan, as it helps determine the most appropriate solutions. Equally vital is understanding your recovery objectives. To complete this assessment, many companies rely on industry-standard measures such as the recovery time objective (RTO) and recovery point objective (RPO).
The recovery time objective is the targeted duration of time between a failure and the point where operations resume, while the recovery point objective represents the maximum amount of data loss an organization can tolerate, expressed as a span of time reaching back from the failure. In a data center environment, for example, the RPO might set the maximum amount of time between data backups or business financial transactions.
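As a minimal sketch of how these objectives can be checked against a real setup, the example below treats the RPO as the longest acceptable gap between backups (as in the data center example above) and the RTO as the longest acceptable recovery time. The function names and the sample durations are hypothetical; actual values come from an organization's own assessment and recovery drills.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """True if the worst-case data loss stays within the RPO.

    Assumption: in the worst case a failure strikes just before the next
    backup, so the data lost equals one full backup interval.
    """
    return backup_interval <= rpo


def meets_rto(measured_recovery: timedelta, rto: timedelta) -> bool:
    """True if the recovery time observed in a drill stays within the RTO."""
    return measured_recovery <= rto


# Hypothetical targets and measurements for illustration only.
rpo = timedelta(hours=4)
rto = timedelta(hours=2)

print("Nightly backups meet RPO:", meets_rpo(timedelta(hours=24), rpo))
print("Hourly backups meet RPO: ", meets_rpo(timedelta(hours=1), rpo))
print("90-minute restore meets RTO:", meets_rto(timedelta(minutes=90), rto))
```

A failed check points to the fix: a missed RPO usually means backing up or replicating more often, while a missed RTO means faster restore procedures or standby infrastructure.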
5. Train Employees
When the unexpected happens, everyone must understand exactly what to do, and what not to do. Organizations should share their disaster recovery and business continuity plans with all employees, and train them on how to prepare for risks and respond when a power outage or natural disaster occurs. It is important to assign roles and ensure that everyone understands their specific responsibilities, as well as the established companywide procedures and steps to follow. Formulating a communication plan and performing practice tests are also important to properly prepare for disasters.
Choosing the Right UPS Equipment
Selecting the optimal UPS is a critical aspect of a data center disaster recovery plan. While there are numerous types, sizes, and configurations of UPS solutions, it is essential to deploy a model suitable for protecting mission-critical applications. The primary UPS topologies include:
- Online UPS – An online or double-conversion UPS is designed to deliver continuous protection against all nine of the most common power problems, supplying consistent clean power regardless of any incoming instabilities. This type of UPS is the optimal choice for mission-critical applications or those involving highly sensitive equipment, such as data centers, communications hubs and other installations where continuous, clean power is a business-critical requirement.
- Line-Interactive UPS – A UPS with line-interactive topology is designed to shield connected devices from power failures, sags, surges, voltage spikes and voltage drops. Typically used to safeguard enterprise networks and IT applications, line-interactive models monitor the quality of incoming power and react to fluctuations.
- Offline UPS – Also referred to as a standby or passive UPS, offline technology offers the most basic type of protection and is not typically deployed in data center applications. A standby UPS will switch to the battery to safeguard connected equipment when power fails, as well as adjust for routine sags and surges, though the switch is not instantaneous. Because of these limitations, offline UPSs are best suited for non-critical and less demanding home networks and office environments that are not subjected to frequent disruptions.
Keep in mind that installing a data center power protection solution isn’t enough; it must also be properly maintained. Regularly scheduled preventive maintenance (PM) is key to lowering total cost of ownership, optimizing efficiency, averting downtime, and actively minimizing the chance of equipment failure and costly repairs. What’s more, studies have shown that more than two-thirds of downtime events stem from preventable causes, including insufficient maintenance and worn-out components, and that up to 20% of UPS failures can be attributed to bad batteries.
Protect Your Data With Reliable UPS Equipment From Unified Power
Perhaps the most important aspect of disaster recovery planning is to initiate the process before you need it; don’t wait for a data center disaster to begin considering and formulating your strategy. The right partner can help ensure that you have the best solutions in place and that your critical power infrastructure is properly maintained.
Backed by decades of practical expertise, Unified Power’s service team is dedicated to the ongoing, optimal performance of your data center UPS systems and generators. Our PM strategy establishes procedures for scheduled maintenance, safeguarding against downtime risks and ensuring that inspections are not postponed or forgotten. Preventive maintenance simplifies processes, cuts costs, and enhances overall data center availability. Most data center sites require at least one or two PMs per year; however, additional visits may be warranted if the environment is susceptible to high heat, dust, contaminants, or vibration. Contact us today to learn more, request a quote, or schedule a service.