AWS for Industries

Leveraging AWS in Life Sciences and Healthcare Disaster Recovery Planning

Advances in life sciences and healthcare IT have rapidly improved patient care and increased efficiency across the industry. As a result, life sciences and healthcare disaster recovery planning has become an even greater concern for medical providers, insurers, pharmaceutical companies, and other life sciences and healthcare-related organizations. They understand the importance of preventing data loss and downtime of IT systems such as electronic health records, predictive analytics, and remote monitoring. Moreover, the Health Insurance Portability and Accountability Act (HIPAA) requires healthcare organizations to develop and implement contingency plans, which include an effective IT disaster recovery strategy.

Many businesses today are concerned about downtime, but in life sciences and healthcare the stakes are very high. Outages caused by human error, network failures, ransomware, natural disasters, or other unexpected events can have a direct harmful impact on patient care. For example, the inability to access updated medical records could cause medical staff to prescribe inaccurate medications or a predictive analytics system might not get the data it needs to alert medical staff to the increased likelihood of cardiac arrest in a patient. The financial and regulatory (e.g., HIPAA Security Rule) implications of an outage can have detrimental, long-term consequences on the organization. That is why a robust, well-planned, and frequently tested disaster recovery strategy is crucial for life science and healthcare organizations.

Challenges of Traditional, On-Premises Disaster Recovery

Traditional, on-premises disaster recovery strategies are very complex and expensive undertakings. To set up and maintain both primary and disaster recovery on-premises data centers is prohibitively expensive for many life science and healthcare organizations. Not only does it require duplicate hardware, compute, and storage infrastructure, but it often requires duplicate licenses for third-party applications, including costly medical software.

In addition to budgetary limitations, life science and healthcare IT professionals may encounter technical constraints as medical technology vendors often insist that customers use only their application-specific disaster recovery solutions. Yet many IT professionals quickly discover that these application-specific solutions do not necessarily provide the technical features they are looking for (e.g., automation) or the results they need (e.g., recovery within minutes). In addition, having to use different disaster recovery solutions for different applications can make the setup and maintenance of an organization’s overall disaster recovery strategy cumbersome, resource intensive, and error prone.

Advantages of Cloud Disaster Recovery

Life sciences and healthcare IT professionals can leverage AWS to achieve robust disaster recovery for their applications at a much lower cost than traditional, on-premises disaster recovery solutions. In order to take full advantage of the cloud’s benefits, IT departments should consider using a disaster recovery solution that is fully integrated with AWS – whether that is an AWS service or a solution offered by an AWS Partner Network (APN) Partner.

Customers should consider the following critical capabilities when evaluating cloud disaster recovery technologies:

Replication into a minimal footprint target site on AWS, where applications are continuously updated and ready to be launched – but do not require duplicate compute, storage, or software licenses until a disaster recovery test or an actual recovery event. This will reduce disaster recovery total cost of ownership (TCO). Learn more about how to leverage AWS to reduce disaster recovery TCO in this white paper: Achieve enterprise-grade disaster recovery for a fraction of the cost.

 Figure 1: Sample disaster recovery architecture with continuous replication into low-cost, minimal-footprint staging area on AWS.

Universal application and infrastructure support, which is essential for heterogeneous, complex life science and healthcare IT environments composed of a wide range of applications (including legacy applications and write-intensive databases), Windows and Linux-based operating systems, and physical, virtual, and cloud infrastructure. Ideally, the solution should offer the same implementation process for all supported applications and databases. This will simplify and accelerate disaster recovery set up, implementation, and testing.

Recovery Point Objectives (RPOs) of seconds and Recovery Time Objectives (RTOs) of minutes in order to meet strict life science and healthcare industry regulations for data protection and recovery. Look for solutions that use advanced automation and continuous data protection to achieve these stringent recovery objectives.

Point-in-time recovery, which enables failover to earlier states of replicated applications, including their data and entire system state. This is particularly important in cases of ransomware attacks but can also be used to quickly recover from database corruptions or accidental system changes.

 

Figure 2: Point-in-time recovery dialog box from the CloudEndure Disaster Recovery Console.

Non-disruptive testing is critical for life science and healthcare IT environments that cannot afford to stop operations or risk losing data (such as electronic health records) as a result of periodic disaster recovery tests. Regular testing of recovery plans is a required component of industry regulations, and a best practice for maintaining recovery readiness. Solutions that use the same target servers for both replication and testing/recovery must stop replication during a test, thereby increasing the RPO until the test is complete and the source servers are back in sync. In contrast, solutions that continuously replicate data into a replication staging area that is independent from the tested or recovered servers can provide non-disruptive testing without impacting recovery objectives.

Automated failback to primary infrastructure enables organizations to quickly return to regular operations after the cause of the disaster is remedied. A disaster recovery site usually is not designed to run daily operations, and a lot of effort may be required to move data and business services back to the primary environment once the disaster is over. Organizations may need to plan for downtime or a partial disruption during the failback process. Fortunately, there are disaster recovery solutions that simplify failback to the primary location after a disaster is over and the primary environment is operational.

CloudEndure Disaster Recovery

AWS offers CloudEndure Disaster Recovery, a robust, HIPAA-eligible solution for healthcare organizations interested in leveraging AWS for enterprise-grade, cost-effective disaster recovery. This service continuously replicates applications from physical, virtual, or cloud infrastructure to a low-cost staging area that is automatically provisioned in the customer’s preferred target AWS Region. During failover or testing, an up-to-date copy of applications can be spun up on demand and be fully functioning in minutes. When the disaster is over, CloudEndure Disaster Recovery provides automated failback to the source infrastructure, and spins down resources back to the minimal footprint staging area.

On-Premises to AWS or Cross-Region Disaster Recovery

Life science and healthcare organizations can use AWS and CloudEndure Disaster Recovery to achieve IT resilience whether or not they are running primary applications on AWS. Organizations that run primary applications on premises can use the AWS elastic infrastructure as their disaster recovery site. By replicating applications into a lightweight AWS staging area, they can significantly reduce disaster recovery TCO because during ongoing operation they pay for low-cost staging resources to store the replicated data and minimal compute to enable the replication. Payment for fully provisioned applications is only required during disaster recovery drills or recovery mode.

Organizations that have built applications directly on AWS or have migrated their applications to AWS can use CloudEndure Disaster Recovery for cross-region disaster recovery – running applications in one AWS Region and replicating them into a second AWS Region that serves as a disaster recovery site. It is considered best practice to select AWS Regions that are geographically distant in order to reduce the blast radius in case of regional issues, such as extreme weather events.

Customer Use Case: Pharmaceutical Company Meets Strict Recovery Objectives

One of the world’s largest pharmaceutical companies is using cross-region disaster recovery on AWS to protect its “crown jewel” applications, including chromatography data system (CDS) and regulatory software. Given this customer’s global reach, it runs applications on primary AWS Regions in the United States, Europe, and Asia. For each of these primary Regions, the company uses a geographically distant disaster recovery Region. For example, their primary site that runs in the Asia Pacific (Singapore) Region replicates into the EU Central 1 (Frankfurt) Region.

As part of a highly regulated industry, this customer has to meet strict RPOs and RTOs for thousands of servers. For their mission-critical applications, they aimed to achieve an RPO of seconds and an RTO of four hours – with the RTO defined as the amount of time it takes until all users have their “hands on keyboard” (i.e., they are able to fully use the applications). At first, this customer used a snapshot-based strategy – creating, replicating, and restoring snapshots on AWS Elastic Compute Cloud (EC2) across two Regions. However, they were not able to meet their recovery objectives. Moreover, implementing this method required a lot of manual work that was not scalable across hundreds or thousands of servers, especially given their slim IT team.

They then decided to use CloudEndure Disaster Recovery, which spun up servers in the defined disaster recovery Region within two to ten minutes. The customer could then quickly perform other necessary operations (e.g., DNS changes, networking changes, remote access) and get the applications back up and running well within the defined RTO of four hours. The customer was able to achieve an RPO of seconds thanks to the asynchronous, continuous replication provided by CloudEndure Disaster Recovery.

For this customer, the ability to replicate and launch a wide range of applications in an identical manner using the same tool was critical. For example, an application used for clinical data was composed of a file share component and an Oracle Database component. Because CloudEndure Disaster Recovery is application agnostic, it continually replicated both components, keeping the entire application in sync. This sync was critical for the application to work correctly and maintain clinical data. The customer reported that they were unable to do this with a snapshot-based approach.

Learn More

By leveraging the cloud in healthcare disaster recovery planning, you can meet the industry’s strict contingency plan guidelines while simplifying implementation and reducing TCO. Learn more about using AWS for disaster recovery on the CloudEndure Disaster Recovery product page.

If you are looking to learn more about AWS for Healthcare & Life Sciences, join us on the web at www.aws.amazon.com/health.