Disaster recovery compliance in the cloud, part 2: A structured approach
August 21, 2023: This post has been updated in recognition of the announcement of the second Canadian Region to be opened in Calgary in late 2023 / early 2024.
Compliance in the cloud is fraught with myths and misconceptions. This is particularly true when it comes to something as broad as disaster recovery (DR) compliance where the requirements are rarely prescriptive and often based on legacy risk-mitigation techniques that don’t account for the exceptional resilience of modern cloud-based architectures. For regulated entities subject to principles-based supervision such as many financial institutions (FIs), the responsibility lies with the FI to determine what’s necessary to adequately recover from a disaster event. Without clear instructions, FIs are susceptible to making incorrect assumptions regarding their compliance requirements for DR.
In Part 1 of this two-part series, I provided some examples of common misconceptions FIs have about compliance requirements for disaster recovery in the cloud. In Part 2, I outline five steps you can take to avoid these misconceptions when architecting DR-compliant workloads for deployment on Amazon Web Services (AWS).
1. Identify workloads planned for deployment
It’s common for FIs to have a portfolio of workloads they are considering deploying to the cloud and often want to know that they can be compliant across the board. But compliance isn’t a one-size-fits-all domain—it’s based on the characteristics of each workload. For example, does the workload contain personally identifiable information (PII)? Will it be used to store, process, or transmit credit card information? Compliance is dependent on the answers to questions such as these and must be assessed on a case-by-case basis. Therefore, the first step in architecting for compliance is to identify the specific workloads you plan to deploy to the cloud. This way, you can assess the requirements of these specific workloads and not be distracted by aspects of compliance that might not be relevant.
2. Define the workload’s resiliency requirements
Resiliency is the ability of a workload to recover from infrastructure or service disruptions. DR is an important part of your resiliency strategy and concerns how your workload responds to a disaster event. DR strategies on AWS range from simple, low cost options such as backup and restore, to more complex options such as multi-site active-active, as shown in Figure 1.
For more information, I encourage you to read Seth Eliot’s blog series on DR Architecture on AWS as well as the AWS whitepaper Disaster Recovery of Workloads on AWS: Recovery in the Cloud.
The DR strategy you choose for a particular workload is dependent on your organization’s requirements for avoiding loss of data—known as the recovery point objective (RPO)—and reducing downtime where the workload isn’t available —known as the recovery time objective (RTO). RPO and RTO are key factors for determining the minimum architectural specifications necessary to meet the workload’s resiliency requirements. For example, can the workload’s RPO and RTO be achieved using a multi-AZ architecture in a single AWS Region, or do the resiliency requirements necessitate deploying the workload across multiple AWS Regions? Even if your workload is not subject to explicit compliance requirements for resiliency, understanding these requirements is necessary for assessing other aspects of DR compliance, including data residency and geodiversity.
3. Confirm the workload’s data residency requirements
As I mentioned in Part 1, data residency requirements might restrict which AWS Region or Regions you can deploy your workload to. Therefore, you need to confirm whether the workload is subject to any data residency requirements within applicable laws and regulations, corporate policies, or contractual obligations.
In order to properly assess these requirements, you must review the explicit language of the requirements so as to understand the specific constraints they impose. You should also consult legal, privacy, and compliance subject-matter specialists to help you interpret these requirements based on the characteristics of the workload. For example, do the requirements specifically state that the data cannot leave the country, or can the requirement be met so long as the data can be accessed from that country? Does the requirement restrict you from storing a copy of the data in another country—for example, for backup and recovery purposes? What if the data is encrypted and can only be read using decryption keys kept within the home country? Consulting subject-matter specialists to help interpret these requirements can help you avoid making overly restrictive assumptions and imposing unnecessary constraints on the workload’s architecture.
4. Confirm the workload’s geodiversity requirements
A single Region, multiple-AZ architecture is often sufficient to meet a workload’s resiliency requirements. However, if the workload is subject to geodiversity requirements, the distance between the AZs in an AWS Region might not conform to the minimum distance between individual data centers specified by the requirements. Therefore, it’s critical to confirm whether any geodiversity requirements apply to the workload.
Like data residency, it’s important to assess the explicit language of geodiversity requirements. Are they written down in a regulation or corporate policy, or are they just a recommended practice? Can the requirements be met if the workload is deployed across three or more AZs even if the minimum distance between those AZs is less than the specified minimum distance between the primary and backup data centers? If it’s a corporate policy, does it allow for exceptions if an alternative method provides equal or greater resiliency than asynchronous replication between two geographically distant data centers? Or perhaps the corporate policy is outdated and should be revised to reflect modern risk mitigation techniques. Understanding these parameters can help you avoid unnecessary constraints as you assess architectural options for your workloads.
5. Assess architectural options to meet the workload’s requirements
Now that you understand the workload’s requirements for resiliency, data residency, and geodiversity, you can assess the architectural options that meet these requirements in the cloud.
As per AWS Well-Architected best practices, you should strive for the simplest architecture necessary to meet your requirements. This includes assessing whether the workload can be accommodated within a single AWS Region. If the workload is constrained by explicit geographic diversity requirements or has resiliency requirements that cannot be accommodated by a single AWS Region, then you might need to architect the workload for deployment across multiple AWS Regions. If the workload is also constrained by explicit data residency requirements, then it might not be possible to deploy to multiple AWS Regions. In cases such as these, you can work with our AWS Solution Architects to assess hybrid options that might meet your compliance requirements, such as using AWS Outposts, Amazon Elastic Container Service (Amazon ECS) Anywhere, or Amazon Elastic Kubernetes Service (Amazon EKS) Anywhere. Another option may be to consider a DR solution in which your on-premises infrastructure is used as a backup for a workload running on AWS. In some cases, this might be a long-term solution. In others, it might be an interim solution until certain constraints can be removed—for example, a change to corporate policy or the introduction of additional AWS Regions in a particular country such as the second Canadian Region to be opened in Calgary.
Let’s recap by summarizing some guiding principles for architecting compliant DR workloads as outlined in this two-part series:
- Avoid assumptions; confirm the facts. If it’s not written down, it’s unlikely to be considered a mandatory compliance requirement.
- Consult the experts. Legal, privacy, and compliance, as well as AWS Solution Architects, AWS security and compliance specialists, and other subject-matter specialists.
- Avoid generalities; focus on the specifics. There is no one-size-fits-all approach.
- Strive for simplicity, not zero risk. Don’t use multiple AWS Regions when one will suffice.
- Don’t get distracted by exceptions. Focus on your current requirements, not workloads you’re not yet prepared to deploy to the cloud.
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.