AWS Storage Blog

How Teradata improved system availability using Amazon EBS Multi-Attach NVMe reservations

Teradata, an AWS Partner, is a leading cloud data analytics provider with two high-performance data processing and analytics solutions: VantageCloud Enterprise and VantageCloud Lake. These solutions are backed by a Massively Parallel Processing (MPP) database and a shared-nothing architecture in which each node operates independently with dedicated resources such as CPU, memory, and storage.

Teradata uses Amazon Elastic Block Store (Amazon EBS), an easy-to-use and scalable block-storage service designed for Amazon Elastic Compute Cloud (Amazon EC2), for its low latency, cost-effective high I/O performance, and small long-tail latencies, all critical capabilities for parallel databases such as Teradata VantageCloud. Additionally, Amazon EBS keeps the storage lifecycle independent of the compute lifecycle, so data persists through instance disruptions and through instance stops and restarts driven by business demand. In a typical VantageCloud configuration, a node (an Amazon EC2 instance) has several Amazon EBS volumes attached to persist data. This allows nodes to be shut down when idle to reduce compute costs and lower the cost of ownership. If a node is disrupted, Teradata can preserve the data by attaching the volumes to a newly created instance, which improves data resiliency.

A key aspect of Teradata’s cloud platform architecture is designing for resiliency. Node disruptions may be related to AWS instance hardware, Teradata’s software, or the guest operating system. A node that is offline triggers Node Failure Recovery (NFR). Node recovery is performed by detaching the Amazon EBS volumes from the offline node and attaching the volumes to a new node. For large enterprise customers with stringent availability requirements, the standard recovery process may take too long and impact the SLAs for these customers.

In this blog, we discuss Teradata’s development of a new deployment mode called Multi-Node Clique (MNC) that reduces node recovery downtime by up to 95%. We walk through how Teradata addresses high availability requirements by leveraging NVMe reservations supported by the Amazon EBS Multi-Attach feature. You will learn about a real-world enterprise use case for leveraging Multi-Attach to improve availability and reliability for clustered workloads.

Overview of Amazon EBS Multi-Attach and NVMe reservations

You can enable Multi-Attach on an EBS Provisioned IOPS volume to attach it concurrently to up to sixteen Nitro-based EC2 instances within the same Availability Zone. Multi-Attach makes it easier to achieve higher availability for applications that manage storage consistency across multiple writers. Each attached EC2 instance has full read and write permission to the shared volume. You can also use this feature to support failover, meaning that a healthy instance can take over operations when another instance is disrupted.
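As a rough illustration of the setup described above, the following Python sketch builds the request parameters for a Multi-Attach io2 volume and its attachments. The helper names, the device name, and the example values are assumptions made for illustration. The dictionary keys are the parameter names of the boto3 EC2 `create_volume` and `attach_volume` calls, but the sketch only builds the dictionaries and does not call AWS.

```python
# Hypothetical helpers (not from the original post). They only build
# request parameters; nothing here contacts AWS.

MAX_MULTI_ATTACH_INSTANCES = 16  # limit stated in the text above


def multi_attach_volume_params(availability_zone, size_gib, iops):
    """Return create_volume keyword arguments for a Multi-Attach io2 volume."""
    return {
        "AvailabilityZone": availability_zone,
        "VolumeType": "io2",
        "Size": size_gib,
        "Iops": iops,
        "MultiAttachEnabled": True,
    }


def attachment_params(volume_id, instance_ids, device="/dev/sdf"):
    """Return one attach_volume argument set per instance.

    All instances are assumed to be Nitro-based and in the volume's
    Availability Zone; this sketch does not verify that.
    """
    if len(instance_ids) > MAX_MULTI_ATTACH_INSTANCES:
        raise ValueError(
            f"at most {MAX_MULTI_ATTACH_INSTANCES} instances can share a volume"
        )
    return [
        {"VolumeId": volume_id, "InstanceId": instance_id, "Device": device}
        for instance_id in instance_ids
    ]


# With boto3 installed and credentials configured, the parameters could be
# passed on like this (left as comments so the sketch runs offline):
#   ec2 = boto3.client("ec2")
#   ec2.create_volume(**multi_attach_volume_params("us-east-1a", 100, 3000))
#   for params in attachment_params(volume_id, instance_ids):
#       ec2.attach_volume(**params)
```

Keeping the parameter construction separate from the API call makes the sixteen-instance limit easy to check before any request is sent.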

Multi-Attach on a Provisioned IOPS io2 Block Express volume supports NVMe reservations, a set of industry-standard storage fencing protocols. Reservations regulate access to storage for a compute or database cluster, and they prevent data inconsistency by permitting only one host in the cluster to write to the volume at any given time.

Teradata’s objective is to improve availability and reliability

With the previous deployment mode, a set of EBS volumes is attached to only one node. During the failover process, all the EBS volumes must be detached from the offline node and attached to the standby node. The processes for managing and processing data are then migrated to the standby node. This entire recovery process takes longer than desired and is not suitable for customers running mission-critical workloads with high availability needs.


Figure 1: VantageCloud deployment without using the Amazon EBS Multi-Attach feature

Attaching an EBS volume to multiple instances with the EBS Multi-Attach feature eliminates the delay caused by detaching and attaching EBS volumes. However, a node may stop responding to health checks and become partitioned. Such a partitioned node may still be capable of performing storage I/O. When this happens, the failover process is triggered, which configures a standby node to take over the partitioned node’s processes. Meanwhile, the partitioned node cannot be terminated because it doesn’t respond on the network. This can result in two nodes writing to the same disk sector on the volume, causing data inconsistency. Resolving this requires manual intervention that could take an extended period of time depending on the size of the impacted data, reducing service availability and reliability.

Recognizing these technical and operational challenges, Teradata developed an enhanced deployment mode, Multi-Node Clique (MNC), that safely coordinates access to storage I/O while keeping all disks attached to all nodes. When a compute instance needs to be replaced, the replacement is faster than detaching and re-attaching storage or copying data.

Reducing node recovery time with Multi-Node Clique

With MNC leveraging the Multi-Attach feature, a set of Amazon EBS io2 Block Express volumes is attached to multiple nodes. As shown in the following diagram, all volumes attached to the active nodes are also attached to the standby node.


Figure 2: VantageCloud 2-node clique (Multi-Node Clique) configuration

With MNC, failing over from an offline node no longer requires detaching and attaching volumes. At the time of node disruption, the failover process migrates the processes for managing and processing data to the standby node. The existing io2 Block Express volume NVMe reservation keys, which associate an instance with a volume, are revoked and replaced with new keys. The standby node then registers the new reservation keys for the volumes. The partitioned node, which is not reachable on the network, is prevented from registering with the new keys. Any I/O issued by that node errors out, which avoids data inconsistency.
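The failover sequence above can be sketched with a small in-memory model. This is a toy simulation under stated assumptions, not Teradata's implementation and not real NVMe commands: the `SharedVolume` class, the `fail_over` function, the node names, and the key values are all hypothetical, and the model covers only a two-node clique.

```python
class ReservationConflict(Exception):
    """Raised when a host without a valid registration issues a write."""


class SharedVolume:
    """Toy model of a Multi-Attach volume that fences writers by key."""

    def __init__(self):
        self.registrations = {}  # host name -> reservation key
        self.sectors = {}        # sector number -> data

    def register(self, host, key):
        self.registrations[host] = key

    def revoke(self, key):
        """Drop every registration that uses the given key."""
        self.registrations = {
            host: k for host, k in self.registrations.items() if k != key
        }

    def write(self, host, sector, data):
        if host not in self.registrations:
            raise ReservationConflict(f"{host} is not registered")
        self.sectors[sector] = data


def fail_over(volume, standby, old_key, new_key):
    """Rotate keys so that only the standby can write after failover."""
    volume.revoke(old_key)              # existing key is revoked
    volume.register(standby, new_key)   # standby registers the new key


volume = SharedVolume()
volume.register("active-node", 0x1)
volume.write("active-node", 7, "v1")

# The active node stops answering health checks but can still issue I/O.
fail_over(volume, "standby-node", old_key=0x1, new_key=0x2)
volume.write("standby-node", 7, "v2")

try:
    volume.write("active-node", 7, "stale")  # partitioned node tries to write
    fenced = False
except ReservationConflict:
    fenced = True
```

In this model the write from the partitioned node fails because its registration was dropped when the old key was revoked, so the sector keeps the value written by the standby node.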

Amazon EBS Multi-Attach and NVMe reservations are key to addressing the challenges of faster node recovery and data integrity. Teradata gained the ability to register a volume with a node using an arbitrary registration key. This gives Teradata control over the choice of registration key and lets the nodes in the cluster register with that key. In addition, any node can invalidate the registration of another node. This provides a reliable way for any node to revoke the registration of partitioned nodes and prevent them from causing data inconsistencies.
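Because the registration key is arbitrary, a cluster can encode its own meaning in it. The sketch below shows one hypothetical scheme, not something the post attributes to Teradata: a clique identifier and a generation counter are packed into a single 64-bit value, which is the size of an NVMe reservation key. The function names and the 32/32 bit split are assumptions made for illustration.

```python
FIELD_BITS = 32                     # assumed split: 32-bit id, 32-bit counter
FIELD_MASK = (1 << FIELD_BITS) - 1


def make_key(clique_id, generation):
    """Pack a clique id and a generation counter into one 64-bit key."""
    if not (0 <= clique_id <= FIELD_MASK and 0 <= generation <= FIELD_MASK):
        raise ValueError("clique_id and generation must each fit in 32 bits")
    return (clique_id << FIELD_BITS) | generation


def split_key(key):
    """Return the (clique_id, generation) pair encoded in a key."""
    return key >> FIELD_BITS, key & FIELD_MASK


def next_key(key):
    """Return the key for the same clique with the generation advanced by one."""
    clique_id, generation = split_key(key)
    return make_key(clique_id, generation + 1)


current = make_key(clique_id=7, generation=1)
rotated = next_key(current)  # key the surviving nodes would register after failover
```

With a scheme like this, every healthy node can compute the next key independently, while a partitioned node that missed the failover still holds the previous generation.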

Conclusion

Teradata’s Multi-Node Clique architecture represents a significant advancement in achieving high database availability on AWS. By harnessing the capabilities of Amazon EBS Multi-Attach and NVMe reservations, MNC gives Teradata a reliable and highly available platform for customers with the most critical data workloads. Teradata was able to reduce downtime during failovers by up to 95% and eliminate data inconsistency caused by partitioned nodes. As organizations continue to prioritize data-driven decision making, solutions like Teradata’s MNC will play an increasingly vital role in ensuring the availability and integrity of data analytics systems.

Thank you for reading this post. If you have any comments or questions, we encourage you to share them in the comments section.

Megha Kumsi


Megha Kumsi is a Sr. Technical Account Manager at AWS. She provides strategic technical guidance to help independent software vendors (ISVs) plan and build solutions using AWS best practices. In her free time, she likes gardening and traveling with her family.

Ben Lin


Ben Lin is a Sr. Solutions Architect at AWS supporting generative AI independent software vendor (ISV) customers. He is passionate about leveraging disruptive technologies to drive transformative business outcomes.