AWS Storage Blog

Storage options and designs for VMware Cloud on AWS

VMware Cloud on AWS is a jointly engineered solution by VMware and AWS that brings VMware’s Software-Defined Data Center (SDDC) technologies to the global AWS infrastructure.

If you have workloads with varying storage requirements, it’s important to understand the storage options available and how each can work best in different scenarios.

The service offers VMware vSphere workloads the choice and flexibility to integrate with multiple storage services. However, each service is optimized for a specific scenario, and no single approach is ideal for all workloads. To choose the right service, you must first understand the storage requirements and performance profiles of your VMware vSphere workloads. With that in mind, you can plan and implement storage that meets the cost, availability, and performance requirements of your workloads.

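VMware Cloud on AWS offers four categories of storage capabilities: direct-attached storage using VMware vSAN, AWS Storage services, Managed Services Provider (MSP) storage, and AWS Partner Network (APN) storage partners.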

In this post, I discuss the considerations, design, and benefits of each offering so you can find the solution that works best for your business.

Option 1: Direct-attached storage using VMware vSAN

If you are looking to use VMware Cloud on AWS for a seamless cloud migration or as an extension to your data center, you can leverage vSAN to build a consistent and highly performant storage architecture. The use of vSphere datastores reduces complexity and time-to-value by supporting the lift and shift of a Virtual Machine (VM) without re-work of the data layer or re-architecture of storage design.

There is also the simplicity of operating a storage solution that is natively built into vSphere, delivered as part of the managed service, and readily available without additional configuration.

vSAN delivers the core storage platform for VMware Cloud on AWS using storage-optimized Amazon EC2 bare-metal instances. Underpinned by locally attached Non-Volatile Memory Express (NVMe) flash storage, vSAN pools storage from each EC2 instance joined to the cluster into a single distributed vSphere datastore. The vSAN datastore is then logically divided into two separate entities: vsanDatastore and WorkloadDatastore. This segregation is a feature developed specifically for this service to restrict permissions between the vsanDatastore, which hosts the management components, and the WorkloadDatastore, which stores your VM disks (VMDKs), configuration files, and swap files.

Figure 1: vSAN on VMware Cloud on AWS

Provisioning your compute workloads in the WorkloadDatastore is straightforward. Using the vSphere web client and the typical VM deployment methods, you simply assign a storage policy to define the storage requirement and level of protection. Storage Policy Based Management (SPBM) is about ease and agility. This approach is particularly beneficial in hybrid architectures, where customers want a common way to manage and protect data regardless of the backing storage.

What are the vSAN storage options?

The type of EC2 instance selected when creating a cluster ultimately decides your choice of vSAN storage. There are two instance types to choose from: i3.metal and i3en.metal.

The i3.metal instance is the ideal choice for general-purpose workloads with balanced storage and compute resources. It offers a total of 10.7 TiB of raw storage capacity with encryption at rest, deduplication, and compression enabled by default.

The second instance type, i3en.metal, is the ideal choice for data-intensive environments with heavy capacity and high random I/O transactions. It offers a significant uplift in storage with 45.85 TiB of raw capacity. Again, encryption is enabled by default, but storage efficiencies are now applied through compression and vSAN checksum optimization.

Calculating the cluster’s usable storage capacity depends on several factors: the type of instance, the number of hosts, the storage efficiency, and the failure tolerance setting. Use the sizer and TCO tool for an accurate calculation.
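To make the interaction between those factors concrete, here is a rough back-of-the-envelope sketch in Python. It is not a substitute for the sizer. The raw capacities are the figures quoted in this post and the overhead multipliers are the commonly cited vSAN values for mirroring and erasure coding. The 30% slack reserve, the efficiency ratio, and the function name are assumptions made for illustration.

```python
# Raw capacity per host in TiB, as quoted in this post.
RAW_TIB_PER_HOST = {"i3.metal": 10.7, "i3en.metal": 45.85}

# Raw capacity consumed per unit of data stored, keyed by
# (protection scheme, failures to tolerate).
PROTECTION_OVERHEAD = {
    ("RAID-1", 1): 2.0,    # two full copies
    ("RAID-1", 2): 3.0,    # three full copies
    ("RAID-5", 1): 4 / 3,  # 3 data + 1 parity
    ("RAID-6", 2): 1.5,    # 4 data + 2 parity
}


def estimate_usable_tib(instance_type, hosts, scheme, ftt,
                        efficiency_ratio=1.0, slack_fraction=0.30):
    """Estimate usable capacity in TiB for a cluster.

    efficiency_ratio: assumed gain from deduplication/compression
                      (1.0 means no gain; this value is workload-specific).
    slack_fraction:   assumed share of raw capacity kept free for
                      rebuilds and maintenance (hypothetical default).
    """
    raw = RAW_TIB_PER_HOST[instance_type] * hosts
    after_slack = raw * (1 - slack_fraction)
    overhead = PROTECTION_OVERHEAD[(scheme, ftt)]
    return after_slack / overhead * efficiency_ratio


# Example: a four-host i3.metal cluster with RAID-1 mirroring, FTT=1.
print(f"{estimate_usable_tib('i3.metal', 4, 'RAID-1', 1):.2f} TiB")
```

Changing only the protection scheme or the host count in the call shows how quickly usable capacity diverges from raw capacity.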

Traditionally with vSAN, storage is tightly coupled with the server’s hardware; eventually, due to growing capacity requirements, the only way to scale storage would be to add hosts. This could increase costs and prove inefficient in managing storage-heavy workloads.

Let’s walk through alternative options to integrate external storage and solve the linear scaling problem that causes higher TCO.

Option 2: AWS Storage services

For storing large volumes of unstructured data (for instance, home directories, repositories, backups, and archives) in the most agile, scalable, and cost-efficient way, AWS Storage is often the ideal solution.

For many customers, operating a storage-heavy environment on VMware Cloud on AWS is likely to impact the compute footprint, increasing the hosts and overall TCO.

In this context, AWS services should be considered to offload the storage component from vSAN. Cost-optimization could then be realized through a combination of host reduction and utilization of low-cost cloud storage.
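The host-reduction argument can be sketched with simple arithmetic. In the Python example below, the host count is driven by whichever is larger: the hosts needed for compute or the hosts needed to hold the data on vSAN. All figures, the function name, and the minimum cluster size are hypothetical values chosen for illustration.

```python
import math


def hosts_required(compute_hosts, storage_tib, usable_tib_per_host, min_hosts=2):
    """Return the number of hosts a cluster needs.

    compute_hosts:       hosts needed for CPU and memory alone
    storage_tib:         data that must live on vSAN, in TiB
    usable_tib_per_host: usable vSAN capacity each host contributes
    min_hosts:           assumed minimum cluster size (hypothetical)
    """
    storage_hosts = math.ceil(storage_tib / usable_tib_per_host)
    return max(min_hosts, compute_hosts, storage_hosts)


# Hypothetical workload: compute fits on 4 hosts, but 100 TiB of data
# must be stored and each host contributes 4 TiB of usable capacity.
all_on_vsan = hosts_required(compute_hosts=4, storage_tib=100, usable_tib_per_host=4)

# Same workload after moving 80 TiB of file and backup data to AWS Storage.
after_offload = hosts_required(compute_hosts=4, storage_tib=20, usable_tib_per_host=4)

print(all_on_vsan, after_offload)  # 25 5
```

In this made-up case, storage rather than compute dictates the cluster size until most of the data is moved off vSAN.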

AWS offers customers three types of storage options to store, access, and analyze data in the cloud. Object, file, and block storage services are available to build highly available, durable, and cost-optimized solutions. In this post, I focus on file and object storage.

Figure 2: AWS Storage services

Connectivity to AWS Storage

Your first option is to leverage the Elastic Network Interface (ENI), which is automatically deployed onto each ESXi host of the SDDC.

This is a high-bandwidth, low-latency network connection between the SDDC and the Amazon Virtual Private Cloud (Amazon VPC) managed by the customer.

This connectivity is the most cost-efficient path to AWS Storage, particularly when the storage resides in the same Availability Zone as the SDDC. In this scenario, your storage traffic is exempt from network charges. In contrast, all traffic destined for AWS resources outside of the Availability Zone hosting the SDDC incurs cross-Availability Zone charges, in line with the normal billing policies of AWS.

Figure 3: AWS ENI connectivity
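The billing rule for traffic over the ENI can be expressed as a small check. This is a minimal Python sketch: the function name, the AZ IDs, and the per-GB rate are hypothetical placeholders, so substitute the current rate from AWS pricing. It compares AZ IDs rather than AZ names because the same AZ name can map to different physical zones in different AWS accounts.

```python
def monthly_cross_az_charge(gb_per_month, sddc_az_id, storage_az_id, rate_per_gb):
    """Estimate the monthly cross-AZ data transfer charge.

    Traffic that stays in the Availability Zone hosting the SDDC is
    free of network charges; anything else is billed per GB at the
    rate the caller supplies.
    """
    if sddc_az_id == storage_az_id:
        return 0.0
    return gb_per_month * rate_per_gb


PLACEHOLDER_RATE = 0.01  # hypothetical rate in USD per GB, not a quoted price

# Storage in the same AZ as the SDDC: no transfer charge.
same_az = monthly_cross_az_charge(5000, "usw2-az1", "usw2-az1", PLACEHOLDER_RATE)

# Storage in a different AZ: charged per GB transferred.
cross_az = monthly_cross_az_charge(5000, "usw2-az1", "usw2-az2", PLACEHOLDER_RATE)

print(same_az, cross_az)
```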

Your second option is available if you wish to scale beyond the connected VPC and require connectivity from the SDDC to multiple VPCs hosting AWS Storage. In such a case, you should consider VMware Transit Connect. Powered by AWS Transit Gateway, this service offers a simplified aggregation layer with performance, scalability, and resiliency in mind.

Figure 4: VMware Transit Connect

AWS file storage

If you are looking to decouple Network Attached Storage (NAS) from vSAN to allow for independent scaling, then AWS can provide your applications with external and scalable file storage accessible using industry-standard protocols.

AWS offers fully managed services that provide file system storage for Windows or Linux workloads. It offers tight integration to address the diverse needs of file-based workloads like home directories, storage-heavy repositories, media content, and development environments.

Within minutes, you can create, configure, and consume file systems without the burden of setting up file servers, provisioning storage, installing and applying software updates or performing backups. Instead, you acquire these services on-demand, paying for what you consume, and with reduced costs. Additionally, migrations are straightforward – with support for AWS DataSync, you can rapidly replicate data from on-premises file systems to AWS storage services.

AWS offers three file system services. In this post, I focus on two of them, Amazon FSx for Windows File Server and Amazon EFS, as storage options while using VMware Cloud on AWS.

Amazon FSx for Windows File Server is a fully managed and scalable file storage that is accessible over the industry-standard Server Message Block (SMB) protocol. It’s built on Microsoft Windows Server, delivering familiarity and a wide range of features and compatibility to your Windows-based applications, including Active Directory (AD) integration, quotas, file restores, and automated backups.

It offers multiple deployment options with support for Single-AZ and Multi-AZ architectures. You can optimize cost and performance with SSD and HDD storage options, and you can scale storage and change throughput capacity on a live file system to meet your workload needs. Refer to the following blog for information on accessing Amazon FSx from VMware Cloud on AWS.

Figure 5: Amazon FSx for Windows File Server

For Linux-based workloads, Amazon EFS provides a simple, serverless, set-and-forget, shared file system that lets you share file data without provisioning or managing storage. It can be used with AWS Cloud services and on-premises resources, and is built to scale on demand to petabytes without disrupting applications. With Amazon EFS, you can grow and shrink your file systems automatically as you add and remove files, eliminating the need to provision and manage capacity to accommodate growth.

Amazon EFS is a Regional service storing data within and across multiple Availability Zones for high availability and durability. Application access to the file system is seamless. As illustrated in Figure 6, the file system can be extended using NFSv4 and mounted within each VM.

Figure 6: Amazon EFS
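As an illustration of mounting the file system inside a Linux VM, the Python sketch below assembles an NFSv4.1 mount command using the mount options that AWS recommends for Amazon EFS. It assumes the VM reaches the file system through the IP address of an EFS mount target in the connected VPC. The IP address, the mount point, and the function name are hypothetical.

```python
import ipaddress
import shlex

# NFS mount options recommended by AWS for Amazon EFS.
EFS_NFS_OPTIONS = (
    "nfsvers=4.1,rsize=1048576,wsize=1048576,"
    "hard,timeo=600,retrans=2,noresvport"
)


def efs_mount_command(mount_target_ip, mount_point="/mnt/efs"):
    """Build the mount command for an EFS mount target.

    Raises ValueError if mount_target_ip is not a valid IP address.
    The command is returned as a string; it is not executed here.
    """
    ip = ipaddress.ip_address(mount_target_ip)
    argv = [
        "sudo", "mount", "-t", "nfs4",
        "-o", EFS_NFS_OPTIONS,
        f"{ip}:/", mount_point,
    ]
    return shlex.join(argv)


# Hypothetical mount target address in the connected VPC.
print(efs_mount_command("10.0.1.25"))
```

Building the command as a string keeps the sketch free of side effects. In practice you would run the resulting command in the guest OS or add an equivalent entry to /etc/fstab.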

Amazon Simple Storage Service (Amazon S3) object storage

Amazon S3 object storage offers scalability, low cost, and 99.999999999% (11 9’s) durability for storing any type of data in its native format. The service offers high availability, with data replicated across a minimum of three Availability Zones within a Region, and supports cross-Region replication.

Customers can use Amazon S3 in many ways. You can store and protect any amount of data for a range of workloads, like backups, archive, data lakes, enterprise applications, and big data analytics. With multiple S3 storage classes, you benefit from flexibility to reduce your overall TCO by leveraging different data access levels at corresponding costs, including the lowest cost cloud storage.
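One common way to put storage classes to work is a lifecycle rule that moves aging data to cheaper tiers. The Python sketch below builds a rule in the shape accepted by the S3 PutBucketLifecycleConfiguration API (for example, through the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`). The prefix, rule ID, day thresholds, and function name are hypothetical, and the sketch only constructs the configuration without calling AWS.

```python
import json


def tiering_lifecycle(prefix, days_to_ia, days_to_archive, rule_id="tier-aging-data"):
    """Build a lifecycle configuration that tiers objects over time.

    Objects under `prefix` move to S3 Standard-IA after `days_to_ia`
    days and to S3 Glacier after `days_to_archive` days.
    """
    # S3 requires objects to be at least 30 days old before they
    # can transition to Standard-IA.
    if days_to_ia < 30:
        raise ValueError("days_to_ia must be at least 30")
    if days_to_archive <= days_to_ia:
        raise ValueError("days_to_archive must be greater than days_to_ia")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_archive, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical policy for VM backups written under the "backups/" prefix.
config = tiering_lifecycle("backups/", days_to_ia=30, days_to_archive=90)
print(json.dumps(config, indent=2))
```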

Workloads on the SDDC can connect to Amazon S3 via the internet gateway or a VPC endpoint. The preferred route, a gateway VPC endpoint, is a distinct capability and a testament to the joint engineering between AWS and VMware. It offers a direct connection from your SDDC to Amazon S3 by routing traffic locally through a secure and reliable connection without traversing the internet.

Figure 7: Amazon S3

Option 3: Managed Services Provider (MSP) storage

If you are looking to scale your storage independently of compute while still benefiting from the same operational experience as on-premises, you may want to explore MSP storage offerings.

This is external storage, which is purchased together with your VMware Cloud on AWS service. The storage is natively integrated with the SDDC, and the MSP delivers the entire solution as a single, fully managed service. Mounted as vSphere datastores, the external storage offers familiarity to the VMware administrator who can now create VMs and consume storage using the typical deployment methods as on-premises.

This architecture is designed to provide complete flexibility to grow or shrink capacity monthly, per TB, without impacting the compute footprint. Notably, in storage-heavy environments, this could optimize your overall TCO through host reduction and the use of lower-cost storage tiers.

One distinct feature of the MSP model is in its support framework. As a single point of contact, the MSP provides first line support for the entire solution in addition to coordination and overall management between the partners. This includes escalations, troubleshooting, and arrangement of maintenance activities.

Additionally, you have the choice of value-add services to contract on top of your VMware Cloud on AWS subscription. Examples of these include disaster recovery (DR), guest OS, VM monitoring, and professional services.

Figure 8: MSP storage support

MSP Partners

Currently, the two AWS Partners that offer MSP storage are Rackspace and Faction.

Although both partners have distinct capabilities, the design principle is the same. The partners provision storage from co-location data centers geo-adjacent to AWS and extend it to the NSX-T implementation of the SDDC over a high-throughput, low-latency network. Consisting of multiple 10 Gbps AWS Direct Connect links, the connection uses private virtual interfaces, VLANs, and Border Gateway Protocol (BGP) peering to ensure end-to-end traffic isolation per tenant.

Once connectivity is established, the managed storage is mounted as three NFS datastores visible to all hosts within the SDDC. The MSP deploys the network extension and datastore mounts as part of the onboarding process.

The MSP storage does not replace vSAN. For example, the management appliances of the SDDC (vCenter and NSX) are still deployed on the vSAN tier (vsanDatastore). In contrast, customers can provision workloads on vSAN (WorkloadDatastore) or distribute them across one or more tiers of NFS datastores.

Figure 9: MSP storage connectivity

Option 4: AWS Partner Network (APN) storage partners

There are several reasons why you may choose to integrate AWS Storage Partner solutions with VMware Cloud on AWS. This option is commonly deployed to bring familiarity with existing storage management solutions, either as part of a new cloud implementation or an extension to on-premises storage.

By emulating on-premises storage in the AWS Cloud, customers benefit from a consistent experience to manage storage, using familiar administrative techniques and software features. Established procedures around provisioning, monitoring, and protection can remain consistent. At the same time, customers continue to benefit from existing enterprise functionality like read/write snapshots, cloning, replication, and tiering features that users have grown accustomed to.

Aside from this, enterprise APN Storage Partner solutions also offer the capability to present the same storage protocols used with traditional applications, including iSCSI, SMB, and NFS. The storage layout and protocol presentation can remain similar to on-premises.

Products and storage solutions provided by the global community of AWS Partner Network (APN) underpin this option. You can find products from APN Storage Partners on the AWS Marketplace, an online catalogue that helps you find, buy, and deploy pre-configured third-party storage solutions.

Once selected, partners often deploy the majority of their storage products as an Amazon Machine Image (AMI) in your AWS account. In this context, an AMI is a template of EC2 compute with all the necessary software code and configuration to function as a virtual storage array. Partners build these storage products on highly durable and scalable AWS Storage services like Amazon EBS and Amazon S3.

As Figure 10 illustrates, once deployed within your Amazon VPC, the provisioned storage is presented and mounted within the VM’s guest OS using file-level (SMB, NFS) or block-level (iSCSI) protocols.

Figure 10: APN Storage

Summary

VMware Cloud on AWS offers flexibility and support for several storage options using built-in SDDC capabilities and external storage from AWS, MSP, and APN services. You can use and combine these solutions to bring performance, simplicity, and cost optimization to your storage.

It is vital for customers to understand the considerations and design of each option to build a solution that avoids excessive costs and turns storage from an expense into a strategic asset.

Please connect with us at AWS for support in implementing any of these architectures within your environments.