2024
Arm Scales Performance for Chip Design Using Amazon FSx for NetApp ONTAP

Arm Limited streamlined its chip-design workloads by adopting Amazon FSx for NetApp ONTAP, boosting scalability while maintaining consistent performance.

50% reduction in processing time

10 million+ daily job submissions

470k+ CPU cores (all on Spot Instances)

Benefiting from scale-out capabilities

Overview

Chip design is a complex process, involving high-performance computing workloads to run simulations at scale. Semiconductor and software design company Arm Limited (Arm) is no exception, heavily relying on its production-level electronic-design-automation (EDA) workloads to design chips. Although its custom in-house EDA facilitates the design and simulation of semiconductors, Arm faced challenges in scalability and efficiency while running EDA on premises. So it turned to Amazon Web Services (AWS) to modernize its infrastructure.

To scale compute and storage capabilities for its high-demand EDA workloads, Arm adopted Amazon FSx for NetApp ONTAP, a service that provides fully managed shared storage on AWS with the popular data access and management capabilities of NetApp ONTAP. This strategic move let Arm scale its storage and compute together, which accelerated processing across its design processes.

Opportunity | Using Amazon FSx for NetApp ONTAP to Scale Compute and Storage Resources for Arm

Founded in 1990, Arm is a semiconductor and software design company that designs energy-efficient CPU and GPU processors and system-on-a-chip infrastructure and software. EDA is a crucial component of Arm’s chip-design process; this application encompasses over 50 specialized software tools for the entire lifecycle of semiconductor design.

EDA workloads involve many files and require intensive metadata operations for efficient processing. Because of this complexity, Arm required scalable compute and storage to manage and accelerate the design process.

“The major challenge with high-performance computing workloads, including EDA, is that they are spiky,” says David Miller, senior director and fellow in IT architecture at Arm. “We have times where compute is unused and times where we do not have enough compute to satisfy demand.”

Arm’s original setup included an on-premises data center with NetApp ONTAP storage systems. However, the company needed a dynamic, scalable solution that could handle the variability of its compute needs while providing the agility to scale up or down as required.

Over the years, Arm has developed a strong history on AWS, and, after engaging the AWS team, the company identified Amazon FSx for NetApp ONTAP as an ideal solution. With Arm’s on-premises environment already using NetApp ONTAP, this service aligned with its needs.

“Using Amazon FSx for NetApp ONTAP, we’re successfully benefiting from the cloud to unlock a new set of scale-out storage capabilities.”

David Miller
Senior Director and Fellow, IT Architecture, Arm Limited

Solution | Supporting Tens of Petabytes and Scaling to Reduce Completion Times by 50%

Using Amazon FSx for NetApp ONTAP, Arm now runs EDA using a hybrid cloud setup. During periods of peak demand, it bursts workloads to the cloud. This fully managed cloud service replicates the data management capabilities of Arm’s on-premises environment while benefiting from the scalability and flexibility of the cloud. Amazon FSx for NetApp ONTAP offers an ideal pathway to migrate, back up, or burst applications to AWS without necessitating changes to application code or data management practices.
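The bursting pattern described above can be illustrated with a small calculation. The following is a minimal sketch, not Arm's scheduler: the function name, the hourly demand figures, and the on-premises capacity are all hypothetical values chosen for illustration. It shows how a spiky demand curve splits into work served on premises, work sent to the cloud, and idle on-premises capacity.

```python
from dataclasses import dataclass


@dataclass
class BurstPlan:
    on_prem_cores: int  # cores served by the on-premises cluster
    cloud_cores: int    # cores that would be requested from the cloud
    idle_cores: int     # on-premises cores left unused


def plan_burst(demand_cores: int, on_prem_capacity: int) -> BurstPlan:
    """Split a demand for cores between on-premises capacity and the cloud."""
    if demand_cores < 0 or on_prem_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    on_prem = min(demand_cores, on_prem_capacity)
    return BurstPlan(
        on_prem_cores=on_prem,
        cloud_cores=demand_cores - on_prem,
        idle_cores=on_prem_capacity - on_prem,
    )


# Hypothetical hourly demand (in cores) against a hypothetical fixed capacity.
hourly_demand = [20_000, 35_000, 120_000, 60_000]
plans = [plan_burst(d, on_prem_capacity=50_000) for d in hourly_demand]

for demand, plan in zip(hourly_demand, plans):
    print(demand, plan)
```

In this toy model, hours below capacity leave on-premises cores idle, and hours above capacity produce a cloud request for only the excess, which is the behavior a burst setup is meant to provide.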

In the hybrid setup, Amazon FSx for NetApp ONTAP is configured as an in-cloud cache for Arm’s on-premises NetApp file system using NetApp FlexCache. With NetApp FlexCache volumes, Arm can burst to the cloud using Amazon Elastic Compute Cloud (Amazon EC2), which offers broad and deep compute infrastructure, while retaining low-latency access to tens of petabytes of data and tens of billions of files. Because the cache sits on AWS, cloud compute workloads read data from the cache rather than each reaching back to the on-premises file system, which improves performance.
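The benefit of placing a cache between cloud compute and on-premises storage can be seen in a toy read-through cache. This is a conceptual sketch only, not how NetApp FlexCache is implemented: it ignores eviction, invalidation, metadata caching, and consistency, and the class name and file path are hypothetical.

```python
from typing import Dict


class ReadThroughCache:
    """Toy read-through cache: hits are served locally, misses go to the origin."""

    def __init__(self, origin: Dict[str, bytes]) -> None:
        self._origin = origin                 # stands in for the on-premises file system
        self._cached: Dict[str, bytes] = {}   # stands in for the in-cloud cache
        self.origin_reads = 0                 # reads that had to reach the origin
        self.cache_hits = 0                   # reads served from the cache

    def read(self, path: str) -> bytes:
        if path in self._cached:
            self.cache_hits += 1
            return self._cached[path]
        data = self._origin[path]             # raises KeyError for unknown paths
        self.origin_reads += 1
        self._cached[path] = data
        return data


# Hypothetical file read by many cloud jobs.
origin = {"/proj/rtl/core.v": b"module core; endmodule"}
cache = ReadThroughCache(origin)
results = [cache.read("/proj/rtl/core.v") for _ in range(1000)]

print("origin reads:", cache.origin_reads, "cache hits:", cache.cache_hits)
```

Only the first read crosses to the origin in this model. The remaining reads stay within the cache, which is the effect the hybrid setup relies on when many cloud jobs read the same design files.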

To maintain a consistent working experience, Arm also established a cloud foundation platform on AWS, where each workload is separated into its own dedicated environment. This environment supports Lightweight Directory Access Protocol (LDAP), which is used for directory services authentication, and NFSv3, which facilitates file sharing. This setup mirrors the on-premises experience, minimizing disruptions to users’ EDA workloads and effectively replicating Arm’s existing operations in the cloud.

The transition to AWS provided multiple benefits to Arm. Since implementing a modern, cloud-based architecture with Amazon FSx for NetApp ONTAP, Arm has processed 10 million jobs. The company has also enhanced its EDA workflows by using AWS for burst-compute capacity. With this hybrid approach, Arm can complement its on-premises compute with highly scalable cloud resources. These capabilities—combined with the adoption of services such as AWS Batch, a fully managed batch computing service—have accelerated the chip-design process and, thus, its speed to market.
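The story does not describe how Arm submits work to AWS Batch. As one possible illustration, a large batch of simulations could be expressed as a single AWS Batch array job. The sketch below only builds the request parameters. The job name, queue name, and job definition are hypothetical, and the size check reflects the documented array job range of 2 to 10,000 child jobs.

```python
from typing import Any, Dict


def build_array_job_request(
    name: str, queue: str, job_definition: str, task_count: int
) -> Dict[str, Any]:
    """Build keyword arguments for an AWS Batch SubmitJob call (array job)."""
    # AWS Batch array jobs accept between 2 and 10,000 child jobs.
    if not 2 <= task_count <= 10_000:
        raise ValueError("task_count must be between 2 and 10,000")
    return {
        "jobName": name,
        "jobQueue": queue,
        "jobDefinition": job_definition,
        "arrayProperties": {"size": task_count},
    }


# Hypothetical names, for illustration only.
request = build_array_job_request(
    name="regression-nightly",
    queue="eda-burst-queue",
    job_definition="eda-sim-job:1",
    task_count=500,
)
print(request)
```

With boto3 installed and AWS credentials configured, a request like this could be sent with `boto3.client("batch").submit_job(**request)`. Each child job can then read the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable to select its own slice of the work.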

The fully managed nature of the solution significantly reduces operational overhead for Arm’s development and engineering teams. Before Amazon FSx for NetApp ONTAP introduced scale-out features, Arm relied on a single pair of high-availability file servers and horizontal scaling to manage its workload demands. This approach was sufficient for smaller projects, which required one cluster, but scaling larger and more compute-intensive ONTAP workloads was complex and time-consuming.

Using scale-out Amazon FSx for NetApp ONTAP, Arm has surpassed the performance of its previous deployments. This feature removes the limits on the number of compute nodes that can be used, so Arm no longer needs to search for compute capacity to support its storage needs. “Some of our workload processing times improved by more than 50 percent,” says Miller. “Amazon FSx for NetApp ONTAP is a game changer for us, and we intend to keep pushing its boundaries.”

Architecture Diagram

Read the Optimizing EDA and Semiconductor Workloads best practices white paper.

Outcome | Optimizing Future EDA Workloads in the Cloud

Using Amazon FSx for NetApp ONTAP, Arm can support spiky compute demands by using the cloud for additional resources when needed while maintaining core systems on premises for consistent performance and availability. To further enhance the efficiency and scalability of its chip-design process, Arm will continue to explore new capabilities in cloud computing.

“Keeping pace with the rapid rate of cloud innovation requires agile, high-performing, and secure solutions,” says Miller. “Using Amazon FSx for NetApp ONTAP, we’re successfully benefiting from the cloud to unlock a new set of scale-out storage capabilities and higher performance for our internal EDA workloads.”

About Arm Ltd.

Founded in 1990, Arm Limited is a semiconductor and software design company that is based in the United Kingdom. It designs energy-efficient CPU and GPU processors and system-on-a-chip infrastructure and software.

AWS Services Used

Amazon FSx for NetApp ONTAP

Amazon FSx for NetApp ONTAP provides fully managed shared storage in the AWS Cloud with the popular data access and management capabilities of ONTAP. 

Amazon EC2

Amazon Elastic Compute Cloud (Amazon EC2) offers the broadest and deepest compute platform, with over 750 instance types and a choice of the latest processor, storage, networking, operating system, and purchase model to help you best match the needs of your workload.

AWS Batch

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. AWS Batch dynamically provisions the optimal quantity and type of compute resources (e.g., CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs submitted.
