Networking & Content Delivery

Using ENA Express to improve workload performance on AWS

In this blog post, we highlight how Elastic Network Adapter (ENA) Express can improve workload performance in conventional network applications, such as databases, file systems, and media encoding. We begin by demonstrating how ENA Express can significantly improve tail latency when used with in-memory databases. From there, we explore the advantages it offers to file systems, with an emphasis on single flow operations. Finally, we show how ENA Express consistently delivers superior network performance for media encoding.

Introduced by Peter DeSantis at re:Invent in 2022, ENA Express provides a mainstream implementation of the high-performance Scalable Reliable Datagram (SRD) network protocol to conventional network applications, with no need to install additional software. SRD is purpose-built for the data center, unlike TCP, which was designed for the Internet. TCP overreacts to packet loss and struggles to recover from variable network performance. ENA Express, however, takes advantage of the SRD packet spraying mechanism, which distributes packets from each flow across different network paths and dynamically adjusts distribution when signs of congestion are detected. By simply enabling ENA Express with an API call or a toggle in the AWS Management Console, you can experience up to a 93% reduction in P99.9 traffic flow latency and up to a 400% increase in single flow throughput. ENA Express seamlessly supports both TCP and UDP transport protocols, adding a layer of encapsulation to traffic packets before distributing them across multiple paths in the AWS network.
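As a concrete illustration, here is a minimal sketch of enabling ENA Express programmatically with boto3. The network interface ID and Region are placeholders, and the EnaSrdSpecification parameter shape is assumed from recent EC2 API versions; note that both ends of a flow need ENA Express enabled for SRD to take effect.

```python
# Hedged sketch: enable ENA Express (and ENA Express for UDP) on an existing
# network interface. The interface ID below is a placeholder.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.modify_network_interface_attribute(
    NetworkInterfaceId="eni-0123456789abcdef0",   # placeholder ENI
    EnaSrdSpecification={
        "EnaSrdEnabled": True,                                  # TCP over SRD
        "EnaSrdUdpSpecification": {"EnaSrdUdpEnabled": True},   # UDP over SRD
    },
)
```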

Since the launch of ENA Express a year ago, we have expanded support to new platforms and more instance sizes. Additionally, we have made it more accessible: you can now enable ENA Express when launching a new instance or configure it as part of a launch template, which allows seamless Auto Scaling of backend systems that require high-performance networking.
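For the Auto Scaling case, a hedged sketch of baking ENA Express into a launch template might look like the following; the template name, AMI ID, and instance type are placeholders, and the EnaSrdSpecification field on the network interface is assumed to be available in your boto3 and EC2 API versions.

```python
# Hedged sketch: a launch template whose instances come up with ENA Express
# already enabled, so Auto Scaling groups inherit the setting.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.create_launch_template(
    LaunchTemplateName="ena-express-backend",        # placeholder name
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",          # placeholder AMI
        "InstanceType": "m7g.16xlarge",
        "NetworkInterfaces": [
            {
                "DeviceIndex": 0,
                "EnaSrdSpecification": {
                    "EnaSrdEnabled": True,
                    "EnaSrdUdpSpecification": {"EnaSrdUdpEnabled": True},
                },
            }
        ],
    },
)
```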

In a previous blog post, ENA Express: Improved Network Latency and Per-Flow Performance on EC2, Jeff Barr provided an overview of ENA Express configuration and discussed the single flow throughput and latency performance benefits he observed through simulated networking tests. In this blog post, however, our focus shifts to real-world applications, where we will illustrate the tangible benefits you can expect.

Use Cases and Testing Results for ENA Express

Use Case 1: In-Memory Database SET and GET Optimization

ENA Express now supports Redis and Memcached in-memory database workloads. Redis and Memcached are high-performance, RAM-backed key-value stores. In a typical database cluster configuration, ENA Express is enabled on the connection between clients and the head node of the server cluster they read from and write to.

Benchmarking

To emulate this behavior at scale, we benchmarked a single client with 600 connections to a single server, performing GET operations in parallel with SET operations. The GET to SET ratio was 4:1, the key size was 64 bytes, and the value size was 512 bytes. The client ran on an ENA Express-supported r6i.8xlarge instance and the server head node was an m7g.16xlarge, as shown in the following diagram (Figure 1):

Figure 1: ENA Express In-Memory Database Latency Testing Environment
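For readers who want to try something similar, the sketch below shows one way to drive a 4:1 GET:SET mix with 64-byte keys and 512-byte values using the redis-py client. It is illustrative only, not the harness we used: the real test ran 600 parallel connections at roughly 1 million QPS, which requires multiple processes or a purpose-built load generator, and the server hostname is a placeholder.

```python
# Illustrative single-connection worker: 4:1 GET:SET mix, 64-byte keys,
# 512-byte values, recording per-operation latency in microseconds.
import random
import time

import redis  # pip install redis

client = redis.Redis(host="cache-head-node.internal", port=6379)  # placeholder host
VALUE = b"x" * 512                                   # 512-byte value

def make_key(i: int) -> bytes:
    return f"key:{i}".encode().ljust(64, b"0")       # pad key to 64 bytes

latencies_us = []
for _ in range(100_000):
    key = make_key(random.randrange(100_000))
    start = time.perf_counter()
    if random.random() < 0.8:                        # 80% GETs -> 4:1 GET:SET
        client.get(key)
    else:
        client.set(key, VALUE)
    latencies_us.append((time.perf_counter() - start) * 1e6)
```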

We ran a split test with ENA Express enabled and disabled. Each run sustained a rate of 1 million queries per second (QPS) for eight hours while we captured latency profiles. In the SET operation results in Figure 2, the X-axis represents the latency percentile and the Y-axis shows the measured latency on a logarithmic scale.

Figure 2: In-Memory Database SET Latency

The results show that latency performance diverges at P99.999 and higher tail latencies. In networking, tail latency refers to the occasional or rare instances where data packets experience significantly higher latency than the average or typical latency. It is a measure of the worst-case or extreme latency that can occur in a network, and it is particularly important in real-time or latency-sensitive applications, such as financial trading systems, gaming and virtual reality applications, ad-tech platforms, and e-commerce checkout systems. These applications require consistent, low-latency responses even in high-demand or peak usage scenarios. For demanding workloads like these, ENA Express keeps latency consistent under high load.
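To make the percentile terminology concrete, the short sketch below computes the same tail percentiles plotted in Figures 2 and 3 from a list of per-operation latencies; the synthetic sample is only there so the snippet runs standalone, and long-running tests typically use histogram-based tooling instead.

```python
# Turn raw per-operation latencies (microseconds) into tail percentiles.
import numpy as np

# Synthetic stand-in for real measurements, so the snippet runs on its own.
latencies_us = np.random.lognormal(mean=5.0, sigma=0.6, size=1_000_000)

for p in (50, 99, 99.9, 99.99, 99.999, 100):
    print(f"P{p}: {np.percentile(latencies_us, p):,.0f} us")
```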

With standard TCP, we see latency spike to 200 milliseconds at five 9's (P99.999) and almost 500 milliseconds at P100. With ENA Express, the application maintains a consistent latency under 10 milliseconds all the way up to P100, which is a 60x performance improvement.

In the following chart (Figure 3), we examine latency percentile versus actual latency for GET operations. In this test, GET represents 80% of the operations, so it carries a larger share of the load. The results show divergence above the P99.9 range: not only does the divergence begin at a lower percentile (fewer 9's), the total delta also grows as the percentile increases. At P100, ENA Express tail latency shows more than a 100x improvement over instances without ENA Express enabled.

Figure 3: In-Memory Database GET Latency

At one million QPS over eight hours, ENA Express's differentiated tail-latency performance becomes consistently apparent. ENA Express transparently manages traffic pacing and recovers lost packets for faster retransmission, preventing TCP from becoming overloaded, backing off, and spiking tail latencies as you see in the non-ENA Express test.

These SET and GET benchmarks were relatively short, lasting only eight hours. ENA Express delivers even greater benefit over long periods of time, when network performance can be more variable. This is particularly useful for long-lived databases like SAP HANA, which recently launched support for ENA Express.

Use Case 2: File System Access Optimization with ENA Express

ENA Express extends beyond its tail latency benefits and also improves single flow bandwidth for workloads like file systems. OpenZFS is an open-source, high-performance file system. One or many clients connect to a head node file server, which maintains an in-memory cache of files. The head node can retrieve files using disk I/O, although the most frequently accessed files are served from the in-memory cache.

Benchmarking

To mimic the client to file server head node connectivity, we used a single client-server pair for all tests and performed reads against the in-memory cache, as shown in the diagram that follows (Figure 4). We established benchmarks for both Windows and Linux clients, with and without ENA Express.

Figure 4: ENA Express File System Single Flow Bandwidth Testing Environment
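As a rough illustration of the read side of this test, the sketch below streams a large, cache-warm file from the NFS mount and reports the achieved throughput. The mount point and file name are placeholders, and a simple single-threaded Python loop will not saturate a high-bandwidth flow on its own; the real benchmark used dedicated I/O tooling.

```python
# Illustrative throughput probe: sequentially read a cached file over NFS
# and report Gbps for the single flow.
import time

PATH = "/mnt/openzfs/cached-testfile"   # placeholder NFS mount + file
CHUNK = 1 << 20                         # 1 MiB reads

total_bytes = 0
start = time.perf_counter()
with open(PATH, "rb", buffering=0) as f:
    while chunk := f.read(CHUNK):
        total_bytes += len(chunk)
elapsed = time.perf_counter() - start

print(f"{total_bytes * 8 / elapsed / 1e9:.2f} Gbps over a single flow")
```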

With OpenZFS, you are limited to a single TCP connection per NFS mount, so Windows clients are constrained by their single flow bandwidth. In networking, "single flow bandwidth" refers to the maximum data transfer rate that can be achieved by a single data flow or communication session between two network endpoints; it is the amount of data that can be transmitted over a network link in a single, continuous stream. When the I/O rate is high enough, OpenZFS for Windows becomes network-bottlenecked by the throughput limit of this single TCP connection. With ENA Express, we measured up to a 5x increase in throughput versus a standard TCP connection, as shown in the following chart (Figure 5).

Figure 5: File System Throughput (Gbps)

With OpenZFS for Linux-based clients, older kernels support only a single TCP connection per NFS mount, so we see results similar to the Windows use case. Starting with Linux kernel 5.3, NFS supports nconnect, which enables up to 16 concurrent TCP connections for a single NFS mount. With 16 concurrent TCP connections, a Linux client can theoretically achieve 80 Gbps (16 x 5 Gbps) out of the 100 Gbps of network bandwidth that an EC2 C6gn instance can deliver. While the instance can use 100 Gbps of bandwidth, nconnect over standard TCP is still limited to 80 Gbps. With ENA Express, you can achieve up to 100 Gbps of throughput, a 25% improvement, because of the higher per-flow capabilities. We have also shown 5x the throughput on a single connection and up to 1.7x the throughput with nconnect: with nconnect, we were able to reach 95.5 Gbps with ENA Express and 64 Gbps with standard ENA.
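If you want to experiment with nconnect yourself, the sketch below shows one way to perform the mount with 16 TCP connections from Python; the server address, export path, mount point, and read/write sizes are placeholders, and the command requires root privileges and a 5.3+ kernel.

```python
# Illustrative only: mount an NFS export with nconnect=16 so the client
# spreads traffic across 16 TCP connections (Linux kernel 5.3+).
import subprocess

subprocess.run(
    [
        "mount", "-t", "nfs",
        "-o", "nconnect=16,rsize=1048576,wsize=1048576",   # placeholder options
        "fileserver.internal:/tank/share",                 # placeholder export
        "/mnt/openzfs",                                    # placeholder mount point
    ],
    check=True,
)
```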

Use Case 3: Live Video Encoding Optimization

Until now, the UDP protocol's lack of reliability and TCP's retransmit behavior have prevented you from using ENA for encoding High-Definition (HD) 3 Gbps flows. Current alternatives, namely the AWS Cloud Digital Interface (CDI) plugin with Elemental built on EFA with SRD, or protocols for compressed transport stream flows such as RIST, RTP, and SRT, bring their own limitations: both approaches require some amount of reconfiguration if the application was not originally designed for these protocols. With ENA Express, we can now see consistent performance for live video encoding in unmodified applications during data transfer between two instances.

Benchmarking

To evaluate live video encoding performance, we studied uncompressed encoding over ENA and ENA Express for HD and 4K flows between two instances, as shown in Figure 6. As a baseline, we streamed video at 60 frames per second (FPS), which means one frame must be delivered every 16.7 milliseconds (1/60 of a second). If a frame arrives more than 16.7 ms after the previous frame, the application buffers the video.

Figure 6: ENA Express Media Encoding Testing Environment
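The receiver-side check behind Figures 7 and 8 can be sketched as follows: timestamp each arriving frame and flag any inter-frame gap larger than the 16.7 ms budget of 60 FPS video. The port is a placeholder and one datagram stands in for one frame; a real uncompressed-video receiver reassembles many UDP packets per frame.

```python
# Illustrative frame-gap monitor for a 60 FPS UDP video stream.
import socket
import time

FRAME_BUDGET_S = 1 / 60                 # 16.7 ms between frames at 60 FPS

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 5000))            # placeholder port

last_arrival = None
late_frames = 0
for _ in range(10_000):
    sock.recv(65535)                    # treat each datagram as one "frame"
    now = time.perf_counter()
    if last_arrival is not None and now - last_arrival > FRAME_BUDGET_S:
        late_frames += 1                # a gap this large would cause buffering
    last_arrival = now

print(f"late frames: {late_frames}")
```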

To compare ENA Express against traditional UDP, we first needed to baseline UDP performance. When testing HD flows between two instances using ENA with UDP, the P50 and P100 latencies in Figure 7 show a significant number of delayed packets (> 16.7 ms), demonstrating the inability of the UDP protocol to meet standard video encoding requirements.

Figure 7: ENA with UDP Video Encoding Performance

When we run the same live video encoding workload on ENA Express with UDP, we can immediately serve 4K flows, which are 4x larger than HD flows. As shown in Figure 8, all packets were received in less than 16.7 ms, avoiding any risk of buffering. This demonstrates ENA Express's ability to meet the performance requirements of live video over extended periods of time.

Figure 8: ENA Express with UDP Video Encoding Performance

Conclusion

We have explored how ENA Express can elevate the performance of traditional network applications like databases, file systems, and media encoding. ENA Express continues to become more widely available, now supporting Windows and Linux across multiple instance types and sizes. It delivers the dual benefits of performance and flexibility, allowing you to unlock new potential in your real-world use cases. ENA Express excels in incast scenarios where tail latency is key, it unlocks higher single flow throughput for bottlenecked file systems, and it works intelligently within the AWS network to manage around congestion.

Now that you have gained insights into its capabilities, we encourage you to take the next step and put your knowledge to practical use. By following the steps outlined in the Getting Started section, you can experience firsthand the tangible benefits ENA Express offers. Whether it is reducing tail latency, enhancing single flow performance, or achieving superior network results, the Getting Started section will guide you through the process and help you harness the full potential of ENA Express in your network applications.

Getting Started

ENA Express can be enabled when launching an instance or while configuring a network interface. It works transparently, with no additional software to install on your instances, and integrates with your TCP and UDP transport protocols without any modifications. While it works seamlessly out of the box, the encapsulated TCP and UDP protocols can be fine-tuned to increase performance. To simplify this tuning, we have developed a GitHub guide to check and configure the optimal transport settings for ENA Express.

Running the guide's script at instance boot will configure the right TCP or UDP settings for ENA Express out of the box, and these configurations can be embedded into your AMIs so they apply at scale. The script sets and validates transport parameters, such as the MTU and byte queue limits (BQL), to make sure they are tuned properly. With this, you can seamlessly enable ENA Express across your fleet.
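As a hedged sketch of the kind of check the guide's script performs, the snippet below reads the interface MTU and per-queue BQL limits from Linux sysfs; the interface name is a placeholder, and the 8900-byte MTU ceiling is our assumption based on ENA Express encapsulation overhead, so defer to the GitHub guide for authoritative values.

```python
# Hedged sketch: inspect MTU and per-queue BQL limits for an interface.
from pathlib import Path

IFACE = "eth0"                        # placeholder interface name
MAX_ENA_EXPRESS_MTU = 8900            # assumed ceiling due to encapsulation

mtu = int(Path(f"/sys/class/net/{IFACE}/mtu").read_text())
if mtu > MAX_ENA_EXPRESS_MTU:
    print(f"MTU {mtu} exceeds {MAX_ENA_EXPRESS_MTU}; lower it for ENA Express")

for q in sorted(Path(f"/sys/class/net/{IFACE}/queues").glob("tx-*")):
    limit = (q / "byte_queue_limits" / "limit_max").read_text().strip()
    print(f"{q.name}: BQL limit_max={limit}")
```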

A correction was made on December 13, 2023. An earlier version of this post included a diagram showing Amazon FSx. The diagram has been updated to clarify that the use case applies to OpenZFS.

About the authors

John Pangle

John Pangle is a Senior Product Manager in the EC2 core team at Amazon Web Services. John focuses on instance networking, solving problems and building solutions to improve the instance to instance experience. In his free time, he enjoys catching a sports game, working on his golf game, trying new restaurants, exploring the outdoors, and spending time with his friends and family.

Ori Golan

Ori is a Software Development Manager in the Annapurna Labs team at Amazon Web Services. Ori focuses on building next-gen networking technologies like ENA Express and providing a better performing core network for customers. Ori started his career in chip design in the HPC space, later moving up the stack to embedded software and firmware development. In his free time, Ori likes to spend time with family and friends, read, play the guitar and watch movies.

Kyle T. Blocksom

Kyle is a Sr. Solutions Architect with AWS based in Southern California. Kyle’s passion is to bring people together and leverage technology to deliver solutions that customers love. Outside of work, he enjoys surfing, eating, wrestling with his dog, and spoiling his niece and nephew.