Elastic Fabric Adapter

Run HPC and ML applications at scale

Elastic Fabric Adapter (EFA)

Elastic Fabric Adapter (EFA) is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS. Its custom-built operating system (OS) bypass hardware interface enhances the performance of inter-instance communications, which is critical to scaling these applications. With EFA, High Performance Computing (HPC) applications using the Message Passing Interface (MPI) and Machine Learning (ML) applications using NVIDIA Collective Communications Library (NCCL) can scale to thousands of CPUs or GPUs. As a result, you get the application performance of on-premises HPC clusters with the on-demand elasticity and flexibility of the AWS cloud.

EFA is available as an optional EC2 networking feature that you can enable on any supported EC2 instance at no additional cost. It also works with the most commonly used interfaces, APIs, and libraries for inter-node communications, so you can migrate your HPC applications to AWS with little or no modification.

Benefits

Faster results

EFA’s unique OS bypass networking mechanism provides a low-latency, low-jitter channel for inter-instance communications. This enables your tightly coupled HPC or distributed machine learning applications to scale to thousands of cores, so they run faster.

Flexible configuration

You can enable EFA support on a growing list of EC2 instances and choose the right compute configuration for your workload. Simply change your cluster configuration as your needs change and enable EFA support on your new compute instances. No prior reservation or upfront planning is needed.

Seamless migration

EFA uses the libfabric interface and APIs for communications. Because almost all HPC programming models support this interface, you can migrate your existing HPC applications to the cloud with little to no modification.

EFA Performance

EFA provides a 4X improvement in scaling over Elastic Network Adapter (ENA) for a standard CFD simulation, as shown in the accompanying chart.

The solver for this benchmark was provided by Metacomp Technologies.

AWS Customer CFD Direct

AWS customer CFD Direct maintains the popular OpenFOAM platform for Computational Fluid Dynamics and also produces CFD Direct From the Cloud (CFDDFC), an AWS Marketplace offering that makes it easy for you to run OpenFOAM on AWS. They have been testing and benchmarking EFA and recently shared their measurements in a blog post titled OpenFOAM HPC with AWS EFA. In the post, they report on a simulation of the external aerodynamics around a car. This simulation scales superlinearly to over 200 cores, gradually declining to linear scaling at 1,000 cores (about 100K simulation cells per core).
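
To make the scaling terms concrete: speedup is measured against a baseline run, and parallel efficiency is that speedup divided by the ideal speedup for the added cores. Efficiency above 1 is superlinear scaling and efficiency of exactly 1 is linear scaling. The sketch below computes both. The helper name `scaling_report` is hypothetical, and the timings are invented placeholders chosen only to show the shape described in the text; they are not CFD Direct's measurements.

```python
def scaling_report(timings):
    """Return (cores, speedup, efficiency) rows from {cores: wall_seconds}.

    Speedup and efficiency are relative to the run with the fewest cores.
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]   # how much faster than baseline
        ideal = cores / base_cores             # speedup if scaling were linear
        rows.append((cores, speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    # Placeholder timings in seconds (assumed for illustration only).
    example = {100: 1000.0, 200: 450.0, 1000: 100.0}
    for cores, speedup, efficiency in scaling_report(example):
        if efficiency > 1.0:
            label = "superlinear"
        elif efficiency == 1.0:
            label = "linear"
        else:
            label = "sublinear"
        print(f"{cores:>5} cores  speedup {speedup:6.2f}  "
              f"efficiency {efficiency:4.2f}  ({label})")
```

With real measurements, an efficiency that drifts from above 1 toward 1 as cores are added would match the behavior the post describes.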

Use cases

Computational Fluid Dynamics

Advances in Computational Fluid Dynamics (CFD) algorithms enable engineers to simulate increasingly complex flow phenomena, and HPC helps reduce turnaround times. With EFA, design engineers can now scale out their simulation jobs to experiment with more tunable parameters, leading to faster, more accurate results.

Weather modeling

Complex weather models require high memory bandwidth, fast interconnects, and robust parallel file systems to deliver accurate results. The finer the grid spacing of the model, the more accurate the results, and the more computational resources the model requires. EFA offers a fast interconnect that allows weather modeling applications to take advantage of the virtually unlimited scaling capabilities of the AWS cloud and get more accurate predictions in less time.

Machine Learning

The training of deep learning models can be significantly accelerated with distributed computing on GPUs. Leading deep learning frameworks such as Caffe, Caffe2, Chainer, MXNet, TensorFlow, and PyTorch have already integrated NCCL to take advantage of its multi-GPU collectives for cross-node communication. EFA is optimized for NCCL on AWS, improving the throughput and scalability of training these models, which leads to faster results.
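
As an illustration of what a collective does, the sketch below simulates an all-reduce (every rank ends up with the element-wise sum of all ranks' buffers) using a ring schedule in plain Python. It is a teaching model only: it does not call NCCL or EFA, the function name `ring_all_reduce` is hypothetical, and a ring is just one common schedule, not necessarily the one NCCL selects for a given cluster.

```python
def ring_all_reduce(buffers):
    """Simulate a sum all-reduce over len(buffers) ranks arranged in a ring.

    Each buffer is one rank's list of numbers. All buffers must have the
    same length, and that length must be divisible by the number of ranks.
    Returns the per-rank buffers after the collective completes.
    """
    n = len(buffers)
    if n == 0:
        raise ValueError("need at least one rank")
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all buffers must have the same length")
    if length % n != 0:
        raise ValueError("buffer length must be divisible by rank count")

    size = length // n
    data = [list(b) for b in buffers]  # work on copies

    def chunk(index):
        return slice(index * size, (index + 1) * size)

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) mod n
    # to its right neighbour, which adds it into its own copy of that chunk.
    for step in range(n - 1):
        sends = [(r, (r - step) % n) for r in range(n)]
        payloads = [data[r][chunk(c)] for r, c in sends]  # snapshot first
        for (r, c), payload in zip(sends, payloads):
            dst = (r + 1) % n
            data[dst][chunk(c)] = [
                x + y for x, y in zip(data[dst][chunk(c)], payload)
            ]

    # Phase 2, all-gather: rank r now owns the fully reduced chunk
    # (r + 1) mod n and passes finished chunks around the ring.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n) for r in range(n)]
        payloads = [data[r][chunk(c)] for r, c in sends]
        for (r, c), payload in zip(sends, payloads):
            data[(r + 1) % n][chunk(c)] = payload

    return data


if __name__ == "__main__":
    grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(ring_all_reduce(grads))
```

Each rank sends and receives on every step, which is why collectives of this kind are sensitive to the latency and jitter of the interconnect between nodes.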

Resources

Now Available – Elastic Fabric Adapter (EFA) for Tightly-Coupled HPC Workloads

April 29th, 2019

AWS re:Invent 2018: Scaling HPC Applications on EC2 w/ Elastic Fabric Adapter

In this re:Invent 2018 talk, we introduce Elastic Fabric Adapter and discuss how EFA enhances inter-instance networking within Amazon EC2.

Deep Dive on OpenMPI and Elastic Fabric Adapter (EFA)

In this tech talk, we'll do a deep dive into OpenMPI and its specific support for Amazon EC2's EFA, and show you how to get the most out of your code, and architect your solution for performance.

Getting started with Elastic Fabric Adapter (EFA)

In this tutorial, you create an EFA-enabled AMI and an EFA-enabled security group, and then launch EFA-enabled instances into a cluster placement group using that AMI and security group.
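
The launch step of that tutorial can be sketched with boto3 as follows. This is a minimal outline under stated assumptions, not the tutorial's own code: the helper names are hypothetical, every ID is a placeholder you supply, and the "EFA-enabled security group" is taken to mean a group that allows all traffic to and from itself, which is how the AWS documentation describes it as far as I know. The dictionaries use the request shape of the EC2 `RunInstances` API, where the network interface sets `InterfaceType` to `efa`.

```python
def self_referencing_rule(security_group_id):
    """Permission allowing all protocols from members of the same group.

    Assumed usage: pass it in IpPermissions to both
    authorize_security_group_ingress and authorize_security_group_egress.
    """
    return {
        "IpProtocol": "-1",  # -1 means all protocols
        "UserIdGroupPairs": [{"GroupId": security_group_id}],
    }


def efa_launch_params(ami_id, instance_type, subnet_id,
                      security_group_id, placement_group, count):
    """Build keyword arguments for ec2.run_instances with one EFA interface."""
    return {
        "ImageId": ami_id,                  # the EFA-enabled AMI
        "InstanceType": instance_type,      # must be an EFA-supported type
        "MinCount": count,
        "MaxCount": count,
        "Placement": {"GroupName": placement_group},  # cluster placement group
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "InterfaceType": "efa",
            "SubnetId": subnet_id,
            "Groups": [security_group_id],
        }],
    }


def launch(params, placement_group, region):
    """Create the cluster placement group, then launch (needs AWS credentials)."""
    import boto3  # imported lazily so the builders above work without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.create_placement_group(GroupName=placement_group, Strategy="cluster")
    return ec2.run_instances(**params)
```

Keeping the parameter builders separate from the call that talks to AWS lets you inspect or log the request before anything is launched.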
