Attain recovery point objectives with rapid failover and failback for VMware workloads
This Guidance illustrates a disaster recovery approach that uses VMware Cloud Disaster Recovery (VCDR) on AWS. A virtual appliance protects virtual machines using VMware snapshot technology, an orchestrator manages the disaster recovery plan within AWS, and a VMware Scale-Out File System stores replicas of the virtual machines.
In a disaster event, VCDR recovers the protected virtual machines, and recovery is faster if a Pilot Light environment is already established. The Pilot Light mode maintains a scaled-down warm standby environment, enabling faster recovery of critical applications. When the primary site is restored, VCDR orchestrates failback and replicates only the changed data, which minimizes replication time.
Note: [Disclaimer]
Architecture Diagram

[Architecture diagram description]
Step 1
The disaster recovery as a service (DRaaS) Connector is a virtual appliance that helps you to protect VMware virtual machines through VMware APIs for Data Protection (VADP) snapshots. VADP uses the snapshot capabilities of VMware vSphere to enable backup without requiring downtime for virtual machines.
Step 2
The software as a service (SaaS) Orchestrator provides a disaster recovery orchestration service that operates within the AWS environment. This SaaS Orchestrator periodically evaluates the health and compliance of your disaster recovery plan so that your plan will function as intended when required.
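To make the idea of a periodic health and compliance evaluation concrete, the following is a minimal sketch in Python. It does not use the VCDR API. The `ProtectedVm` type, the `check_plan` function, and the two checks (snapshot age against a recovery point objective, and presence of a recovery-site mapping) are assumptions chosen for illustration, not the checks the SaaS Orchestrator actually performs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ProtectedVm:
    # Hypothetical record for one protected virtual machine.
    name: str
    last_snapshot: datetime          # time of the most recent replicated snapshot
    mapped_to_recovery_site: bool    # whether the plan maps this VM to a recovery resource


def check_plan(vms, rpo, now):
    """Return a list of findings; an empty list means no issue was found."""
    findings = []
    for vm in vms:
        # A snapshot older than the RPO means more data could be lost than the plan allows.
        if now - vm.last_snapshot > rpo:
            findings.append(f"{vm.name}: last snapshot is older than the RPO")
        # A VM without a mapping has nowhere to be recovered to.
        if not vm.mapped_to_recovery_site:
            findings.append(f"{vm.name}: no recovery-site mapping")
    return findings


now = datetime(2024, 1, 1, 12, 0)
vms = [
    ProtectedVm("app-01", now - timedelta(hours=1), True),
    ProtectedVm("db-01", now - timedelta(hours=6), False),
]
findings = check_plan(vms, rpo=timedelta(hours=4), now=now)
for finding in findings:
    print(finding)
```

Running a check like this on a schedule surfaces drift, such as a stale snapshot or a missing mapping, before a disaster event rather than during one.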
Step 3
The VMware virtual machines are replicated using hypervisor snapshots, and the replicas are stored in the VCDR Scale-Out File System.
Step 4
In the event of a disaster, the protected VMware virtual machines are recovered using the out-of-the-box orchestration capabilities of VCDR. If a VMware Cloud (VMC) on AWS Pilot Light environment is already established, the anticipated Recovery Time Objective (RTO) is on the order of minutes.
If no Pilot Light environment exists, the following tasks are required before recovery can start:
a. Deploy a VMC Software-Defined Data Center (SDDC) to be used as the recovery site.
b. Configure the minimal necessary user access and networking permissions.
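The two recovery paths described in this step can be sketched as a short Python function. The function name `recovery_steps` and the step strings are hypothetical and only mirror the text above. They are not VCDR commands.

```python
def recovery_steps(pilot_light_ready):
    """Return the ordered tasks for recovery, depending on whether a
    Pilot Light environment already exists (illustrative only)."""
    steps = []
    if not pilot_light_ready:
        # Without a Pilot Light environment, the recovery site must be built first.
        steps.append("Deploy a VMC SDDC to use as the recovery site")
        steps.append("Configure minimal user access and networking permissions")
    # In both cases, the final task is the orchestrated recovery itself.
    steps.append("Run the VCDR recovery plan to recover protected virtual machines")
    return steps


for label, ready in (("Pilot Light established", True), ("No Pilot Light", False)):
    print(label)
    for number, step in enumerate(recovery_steps(ready), start=1):
        print(f"  {number}. {step}")
```

The extra tasks on the second path are what lengthen the RTO when no Pilot Light environment has been prepared ahead of time.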
Step 5
The Pilot Light mode lets you deploy a small subset of SDDC hosts (a minimum of 2 nodes) ahead of time, so that critical applications that require a lower RTO can be recovered quickly.
The Pilot Light option reduces the total cost of the cloud infrastructure by maintaining a scaled-down version of a fully functional environment. The environment stays in a warm-standby state so that core applications remain readily available when a disaster event is triggered.
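The cost trade-off can be illustrated with simple arithmetic. The sketch below is in Python and rests on stated assumptions: the hourly host rate, the 730-hour month, and the 6-host full-size cluster are hypothetical placeholders, not published prices or sizing guidance. Only the 2-node minimum comes from the text above.

```python
MIN_PILOT_LIGHT_HOSTS = 2    # minimum stated in this Guidance
HOURS_PER_MONTH = 730        # assumption: average hours in a month


def monthly_host_cost(hosts, hourly_rate_per_host):
    """Monthly cost of keeping `hosts` hosts running continuously."""
    return hosts * hourly_rate_per_host * HOURS_PER_MONTH


def pilot_light_saving(full_hosts, pilot_hosts, hourly_rate_per_host):
    """Monthly saving from running a Pilot Light cluster instead of a full-size one."""
    if pilot_hosts < MIN_PILOT_LIGHT_HOSTS:
        raise ValueError("Pilot Light requires at least 2 hosts")
    full = monthly_host_cost(full_hosts, hourly_rate_per_host)
    pilot = monthly_host_cost(pilot_hosts, hourly_rate_per_host)
    return full - pilot


# Hypothetical inputs: a 6-host full-size cluster and a rate of 10.0 cost units per host-hour.
saving = pilot_light_saving(full_hosts=6, pilot_hosts=2, hourly_rate_per_host=10.0)
print(saving)  # 29200.0
```

With these placeholder numbers, the standby environment costs one third of the full-size cluster while idle. Substitute your own host counts and rates to estimate the saving for your workload.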
Step 6
When the original protected site is restored to an operational state, VCDR orchestrates the failback process and replicates only the delta changes, which reduces the failback time. Afterward, the VMC on AWS cluster can be scaled down to a 2-node Pilot Light configuration or removed entirely, depending on the RTO requirements.
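The benefit of replicating only delta changes can be shown with a generic block-comparison sketch in Python. This is not how VCDR is implemented, which this Guidance does not describe. The fixed block size, the SHA-256 digests, and the function names are assumptions used only to illustrate the idea of transferring changed blocks instead of a full copy.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and return one digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indexes of blocks in `new` that differ from, or do not exist in, `old`.

    Blocks removed from the end of `old` are not reported in this sketch.
    """
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    changed = []
    for index, digest in enumerate(new_digests):
        if index >= len(old_digests) or old_digests[index] != digest:
            changed.append(index)
    return changed


# State of a disk at the protected site before the disaster (hypothetical data).
protected_site_copy = b"AAAABBBBCCCCDDDD"
# State of the same disk at the recovery site after running there for a while.
recovery_site_copy = b"AAAAXXXXCCCCDDDDEEEE"

delta = changed_blocks(protected_site_copy, recovery_site_copy)
print(delta)  # [1, 4]
```

Here only 2 of 5 blocks need to be sent back to the protected site. The less the data changed while running at the recovery site, the shorter the failback replication.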
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
This Guidance combines compute, network, and storage capabilities to reduce the operational overhead associated with maintaining on-premises VMware clusters. It replaces the undifferentiated heavy lifting of manually configuring, managing, and maintaining clusters with a packaged approach that includes support and maintenance from both VMware and AWS. VMware delivers scheduled Software-Defined Data Center (SDDC) updates and emergency software patches with notifications, as well as auto-remediation of hardware failures.
Security
This Guidance lets you deploy a recovery SDDC in VMware Cloud (or add an existing SDDC) to use for recovery, for testing your recovery plans, and for ransomware recovery. You can add hosts, clusters, and networks; request public IP addresses; configure NAT rules; and delete the recovery SDDC. In the event of a disaster or ransomware attack, you can recover virtual machines from your protected site to your recovery SDDC. After a disaster, you can fail back once your production site is ready. After a ransomware attack, you can repair the infected virtual machines in the isolated recovery SDDC environment and restore clean, validated virtual machines to a production site.
Reliability
This Guidance outlines the use of a dedicated, single-tenant cloud infrastructure supporting multiple VMware SDDCs. In this configuration, computing resources are exclusively allocated to a single customer or tenant, with up to 16 hosts for each cluster. These SDDCs are delivered using the latest high-performance computing and storage resources, optimized for high I/O workloads and featuring low-latency Non-Volatile Memory Express (NVMe)-based Solid-State Drives (SSDs).
Performance Efficiency
This Guidance makes advanced technologies easier to consume because management of the SDDC, including patch management and secure operation of the software stack, is handled for you. This helps you and your team focus on your application layer rather than on the underlying software and infrastructure.
Cost Optimization
This Guidance includes flexible storage options, such as a VMware Virtual SAN (vSAN) storage approach built on Non-Volatile Memory Express (NVMe) instance storage, allowing you to manage storage costs for your application. It also provides advanced data services like quality of service, snapshots, and third-party data protection, optimizing storage utilization and costs. Additionally, it supports custom-sized virtual machines and VMware-compatible operating systems. Lastly, the single-tenant bare metal AWS infrastructure can provide cost advantages over shared virtual resources by delivering dedicated computing and storage.
Sustainability
The AWS data centers that host the services in this Guidance are designed to offer a lower carbon footprint than traditional, on-premises data centers. In addition, this Guidance allows you to deploy a fully configured VMware SDDC cluster in a few hours. Lastly, you can scale host capacity up and down in minutes, which reduces the environmental impact of your workloads by matching capacity to demand.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.