AWS Machine Learning Blog

AWS and NVIDIA to bring Arm-based Graviton2 instances with GPUs to the cloud

AWS continues to innovate on behalf of our customers. We’re working with NVIDIA to bring an Arm processor-based, NVIDIA GPU-accelerated Amazon Elastic Compute Cloud (Amazon EC2) instance to the cloud in the second half of 2021. This instance will feature the Arm-based AWS Graviton2 processor, which was built from the ground up by AWS and optimized for how customers run their workloads in the cloud, eliminating many of the unneeded components that might otherwise go into a general-purpose processor.

AWS innovation with Graviton2 processors

AWS has continued to pioneer cloud computing for our customers. In 2018, AWS was the first major cloud provider to offer Arm-based instances in the cloud with EC2 A1 instances powered by AWS Graviton processors. These instances are built around Arm cores and make extensive use of AWS custom-built silicon. They’re a great fit for scale-out workloads in which you can share the load across a group of smaller instances.

In 2020, AWS released AWS-designed, Arm-based Graviton2 processors, delivering a major leap in performance and capabilities over first-generation AWS Graviton processors. These processors power EC2 general purpose (M6g, M6gd, T4g), compute-optimized (C6g, C6gd, C6gn), and memory-optimized (R6g, R6gd, X2gd) instances, and provide up to 40% better price performance than comparable current generation x86-based instances for a wide variety of workloads. Compared with first-generation AWS Graviton processors, AWS Graviton2 processors deliver seven times more performance, four times more compute cores, five times faster memory, and caches twice as large.

Customers including Domo, Formula One, Intuit, LexisNexis Risk Solutions, Nielsen, NextRoll, Redbox, SmugMug, Snap, and Twitter have seen significant performance gains and reduced costs from running AWS Graviton2-based instances in production. AWS Graviton2 processors, based on the 64-bit Arm architecture, are supported by popular Linux operating systems, including Amazon Linux 2, Red Hat, SUSE, and Ubuntu. Many popular applications and services from AWS and ISVs also support AWS Graviton2-based instances. Arm developers can use these instances to build applications natively in the cloud, thereby eliminating the need for emulation and cross-compilation, which are error-prone and time-consuming. Adding NVIDIA GPUs accelerates Graviton2-based instances for diverse cloud workloads, including gaming and other Arm-based workloads like machine learning (ML) inference.
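
As a minimal sketch of what building natively can involve in practice, a build or deployment script might first confirm that it is running on a 64-bit Arm host before selecting Arm-specific artifacts. The example below uses Python's standard `platform.machine()` call. The helper name `is_arm64` and the set of accepted machine strings are illustrative assumptions, not part of any AWS tooling.

```python
import platform

# Machine strings commonly reported for 64-bit Arm hosts:
# Linux reports "aarch64", while macOS reports "arm64".
# This set is an assumption for illustration and may need extending.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine: str) -> bool:
    """Return True if the given machine string names a 64-bit Arm CPU."""
    return machine.strip().lower() in ARM64_NAMES


if __name__ == "__main__":
    # platform.machine() returns the hardware name of the current host.
    machine = platform.machine()
    if is_arm64(machine):
        print(f"Running natively on 64-bit Arm ({machine})")
    else:
        print(f"Not a 64-bit Arm host ({machine})")
```

A check like this only reports the architecture the interpreter sees. It does not confirm that every dependency has an Arm build available.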

Easily move Android games to the cloud

According to research from App Annie, mobile gaming is now the most popular form of gaming and has overtaken console, PC, and Mac. Additional research from App Annie has shown that up to 10% of all time spent on mobile devices is with games, and game developers need to support and optimize their games for the diverse set of mobile devices being used today and in the future. By leveraging the cloud, game developers can provide a uniform experience across the spectrum of mobile devices and extend battery life due to lower compute and power demands on the mobile device. The AWS Graviton2 instance with NVIDIA GPU acceleration enables game developers to run Android games natively, encode the rendered graphics, and stream the game over networks to a mobile device, all without needing to run emulation software on x86 CPU-based infrastructure.

Cost-effective, GPU-based machine learning inference

In addition to mobile gaming, customers running machine learning models in production are continuously looking for ways to lower costs, because ML inference can represent up to 90% of the overall infrastructure spend for running these applications at scale. With this new offering, customers will be able to take advantage of the price/performance benefits of Graviton2 to deploy GPU-accelerated deep learning models at a significantly lower cost than on x86-based instances with GPU acceleration.

AWS and NVIDIA: A long history of collaboration

AWS and NVIDIA have collaborated for over 10 years to continually deliver powerful, cost-effective, and flexible GPU-based solutions to customers, including the latest EC2 G4 instances with NVIDIA T4 GPUs launched in 2019 and EC2 P4d instances with NVIDIA A100 GPUs launched in 2020. EC2 P4d instances are deployed in hyperscale clusters called EC2 UltraClusters, which are composed of the highest performance compute, networking, and storage in the cloud. EC2 UltraClusters support 400 Gbps instance networking, Elastic Fabric Adapter (EFA), and NVIDIA GPUDirect RDMA technology to help rapidly train ML models using scale-out and distributed techniques.

In addition to being first in the cloud to offer GPU-accelerated instances and first in the cloud to offer NVIDIA V100 GPUs, we’re now working together with NVIDIA to offer new EC2 instances that combine an Arm-based processor with a GPU accelerator in the second half of 2021. To learn more about how AWS and NVIDIA work together to bring innovative technology to customers, visit AWS at NVIDIA GTC 21.

About the Author

Geoff Murase is a Senior Product Marketing Manager for AWS EC2 accelerated computing instances, helping customers meet their compute needs by providing access to hardware-based compute accelerators such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs). In his spare time, he enjoys playing basketball and biking with his family.