AWS Partner Network (APN) Blog

Supercharge Cadence Voltus Power Integrity with AWS ParallelCluster

By Jinal Apte, Product Engineer – Cadence Design Systems
Rohit Somwanshi, Product Engineer – Cadence Design Systems
Pedro Gil, Sr. Solutions Architect – AWS

Cadence Voltus IC Power Integrity Solution is a state-of-the-art power integrity tool designed to address the complex challenges of modern electronic design. By providing comprehensive power integrity (PI) analysis and optimization capabilities, the Voltus solution helps engineers design reliable power delivery networks that minimize power supply noise, enabling the design of high-performance and low-power electronic systems. PI analysis is compute-intensive because power networks are large and coupled. It is common to see power networks with tens of billions of nodes, with some exceeding hundreds of billions. Given this size and tight chip design timelines, PI analysis has very demanding performance and memory requirements.

Amazon Web Services (AWS) provides many options when it comes to High Performance Computing (HPC) workloads. One of them is AWS ParallelCluster, a cluster management tool that automatically sets up the required compute resources, scheduler, and shared filesystem. With AWS ParallelCluster, you can quickly build and deploy production HPC compute environments.

In this blog, we discuss details of an internal benchmarking exercise using the Voltus Power Integrity solution on AWS ParallelCluster. This will provide readers with a reference setup well suited for running Voltus workloads and demonstrate the cost/performance benefit of running it on AWS.

Voltus Solution on AWS

Voltus architecture provides a highly distributed framework that is not only scalable but also fault-tolerant, making it an ideal choice for deployment in cloud environments. Its design emphasizes decentralization, enabling seamless scalability to accommodate varying workloads while maintaining robust fault tolerance mechanisms that ensure uninterrupted operations. This combination of features makes the Voltus solution exceptionally well-suited for leveraging AWS ParallelCluster infrastructure, providing the flexibility and resilience needed to meet the computing needs of modern chip design.

Figure 1: Voltus High Level Architecture

AWS ParallelCluster is an open-source cluster management tool that makes deploying and managing HPC clusters on AWS easy. It supports multiple instance types, job submission queues, and job schedulers like AWS Batch and Slurm. AWS ParallelCluster offers cloud advantages such as elasticity and fast setup, which help deliver optimal performance for massive Electronic Design Automation (EDA) workloads.

Using AWS ParallelCluster allows you to launch and terminate clusters as needed. You only pay for the compute resources you use while the cluster is running, which can lead to significant cost savings compared to maintaining on-premises clusters that are always running.
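
The pay-for-what-you-use argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing tool: the hourly rate, instance count, and run schedule are hypothetical placeholders, and the "always-on" case simply models the same cluster left running all month rather than a true on-premises cost.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def elastic_cost(instances: int, hours_per_run: float, runs: int, hourly_rate: float) -> float:
    """Cost when the cluster exists only while analysis jobs are running."""
    return instances * hours_per_run * runs * hourly_rate


def always_on_cost(instances: int, hourly_rate: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost when the same number of instances is kept running for the whole period."""
    return instances * hours * hourly_rate


if __name__ == "__main__":
    rate = 10.0  # hypothetical price per instance-hour, not an AWS quote
    elastic = elastic_cost(instances=8, hours_per_run=6.0, runs=20, hourly_rate=rate)
    fixed = always_on_cost(instances=8, hourly_rate=rate)
    print(f"elastic: {elastic:.0f}, always-on: {fixed:.0f}, ratio: {fixed / elastic:.1f}x")
```

Substituting your own run lengths and current instance pricing shows how much of the saving depends on cluster utilization.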

Shared storage is provided by Amazon FSx for OpenZFS, a fully managed file storage service from AWS that offers highly reliable, scalable, and performant file storage built on the open-source OpenZFS file system. Key features of Amazon FSx for OpenZFS:

  • It offers the familiar features and capabilities of OpenZFS file systems with the agility, scalability, and simplicity of a fully managed AWS service.
  • It provides NFS access (v3, v4.0, v4.1, v4.2) to Linux, Windows, and macOS compute instances and containers.
  • Powered by AWS Graviton processors and the latest AWS disk and networking technologies, it delivers up to 1 million IOPS with latencies of hundreds of microseconds and up to 12.5 GB/s throughput.

Amazon CloudWatch is a service that monitors applications, responds to performance changes, optimizes resource use, and provides insights into operational health. Amazon CloudWatch collects and visualizes real-time logs, metrics, and event data in automated dashboards to streamline your infrastructure and application maintenance.

AWS Cost Explorer has an easy-to-use interface that lets you visualize, understand, and manage your AWS costs and usage over time. Get started quickly by creating custom reports that analyze cost and usage data. Analyze your data at a high level (for example, total costs and usage across all accounts), or dive deeper into your cost and usage data to identify trends, pinpoint cost drivers, and detect anomalies.

Figure 2: Voltus AWS ParallelCluster Reference Architecture

Get the most out of Voltus on AWS ParallelCluster with these recommended AWS instance types: x2idn.16xlarge, r7iz.32xlarge, r6id.32xlarge, and r5dn.24xlarge, which meet the CPU and memory performance required for an optimal and efficient workload run.
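
As a rough illustration of how these pieces could fit together, the sketch below builds a cluster definition as a Python dictionary and prints it as JSON, which can then be reviewed and converted to the YAML file that AWS ParallelCluster expects. It is an assumption-laden sketch, not the configuration used in the benchmark: the key names follow our understanding of the ParallelCluster 3 configuration schema and should be checked against the official documentation, the choice of Slurm is assumed, and the region, operating system, head node type, queue names, mount directory, subnet, key pair, and volume ID are all hypothetical placeholders.

```python
import json

# Instance types recommended in this post.
RECOMMENDED_INSTANCE_TYPES = (
    "x2idn.16xlarge",
    "r7iz.32xlarge",
    "r6id.32xlarge",
    "r5dn.24xlarge",
)


def build_cluster_config(instance_type, max_nodes, subnet_id, key_name, openzfs_volume_id):
    """Return a ParallelCluster-style cluster definition as a plain dict."""
    if instance_type not in RECOMMENDED_INSTANCE_TYPES:
        raise ValueError(f"{instance_type} is not one of the recommended types")
    return {
        "Region": "us-east-1",               # placeholder region
        "Image": {"Os": "alinux2"},          # placeholder OS, use one your tools support
        "HeadNode": {
            "InstanceType": "m5.xlarge",     # placeholder head node size
            "Networking": {"SubnetId": subnet_id},
            "Ssh": {"KeyName": key_name},
        },
        "Scheduling": {
            "Scheduler": "slurm",            # assumed scheduler
            "SlurmQueues": [
                {
                    "Name": "voltus",        # hypothetical queue name
                    "ComputeResources": [
                        {
                            "Name": "voltus-compute",
                            "InstanceType": instance_type,
                            "MinCount": 0,   # scale to zero when idle
                            "MaxCount": max_nodes,
                        }
                    ],
                    "Networking": {"SubnetIds": [subnet_id]},
                }
            ],
        },
        "SharedStorage": [
            {
                "Name": "voltus-shared",     # hypothetical storage name
                "MountDir": "/shared",       # hypothetical mount point
                "StorageType": "FsxOpenZfs",
                "FsxOpenZfsSettings": {"VolumeId": openzfs_volume_id},
            }
        ],
    }


if __name__ == "__main__":
    config = build_cluster_config(
        instance_type="x2idn.16xlarge",
        max_nodes=8,                                    # hypothetical cluster size
        subnet_id="subnet-0123456789abcdef0",           # placeholder ID
        key_name="my-key-pair",                         # placeholder key pair
        openzfs_volume_id="fsvol-0123456789abcdef0",    # placeholder ID
    )
    print(json.dumps(config, indent=2))
```

Keeping MinCount at 0 reflects the elasticity described earlier: compute nodes are launched when jobs are queued and released when the queue is empty.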

Performance Scalability and Cost Optimization using Cadence Voltus on AWS ParallelCluster

For this post, we performed a benchmarking exercise on a small block-level test case (VL5) with 1.2B nodes and a large full-chip test case (XL2) with 10B nodes. We used the AWS x2idn.16xlarge compute instance, based on the Intel Xeon Platinum 8375C processor at 2.90 GHz.

Performance scalability is a critical factor to consider in both small and large designs, as it directly impacts the tool’s ability to perform within a small or large computing infrastructure. Whether designing for a compact or expansive environment, the ability of the Voltus solution to scale performance gives designers the flexibility to manage the tradeoff between productivity (speed) and cost to meet project timelines. Figure 3 demonstrates exceptional scalability on both our large and small design test cases. Running Voltus on AWS maximized cost efficiency at peak performance: because runs finish sooner, instances are held for less time, which helps reduce overall running costs.
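
The speed-versus-cost tradeoff described above can be quantified from a handful of measured runs. The sketch below is a minimal example under stated assumptions: the `Run` record, the runtimes, and the hourly rate are hypothetical and are not the benchmark data behind Figure 3. It computes speedup, parallel efficiency, and cost relative to a baseline run.

```python
from dataclasses import dataclass


@dataclass
class Run:
    instances: int   # number of compute instances used
    hours: float     # wall-clock runtime of the analysis


def speedup(base: Run, run: Run) -> float:
    """How many times faster the run is than the baseline."""
    return base.hours / run.hours


def efficiency(base: Run, run: Run) -> float:
    """Speedup divided by the increase in instance count (1.0 means linear scaling)."""
    return speedup(base, run) / (run.instances / base.instances)


def cost(run: Run, hourly_rate: float) -> float:
    """Instance-hours consumed, priced at a flat hourly rate."""
    return run.instances * run.hours * hourly_rate


if __name__ == "__main__":
    rate = 10.0                            # hypothetical price per instance-hour
    base = Run(instances=4, hours=8.0)     # hypothetical baseline measurement
    for run in (Run(8, 4.0), Run(16, 2.5)):
        print(
            f"{run.instances} instances: speedup {speedup(base, run):.2f}x, "
            f"efficiency {efficiency(base, run):.2f}, cost {cost(run, rate):.0f}"
        )
```

When efficiency stays close to 1.0, adding instances shortens turnaround time at roughly constant cost; as efficiency drops, each additional hour saved costs more.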

Figure 3: Performance Scalability – Flat Analysis

Hierarchical Analysis

Engineers can use Voltus to perform hierarchical PI analysis with Voltus-XM (Extreme Modeling) technology. They can build xPGV models for IP blocks that precisely represent each block’s electrical parasitics and demand current. Because these xPGV models have fewer nodes, substituting them for the fully extracted blocks in the chip-level analysis can save run time and memory.

Figure 4: Voltus-XM Technology

In this experiment, we created xPGV models for VL5 and substituted them for the corresponding blocks, which reduced the total node count of the analysis from 10B to 4.3B. The hierarchical analysis flow with Voltus-XM was intended primarily to reduce the compute resource requirements for the analysis of the large design. It achieved roughly a 2x reduction in overall compute resources and a 2.5x reduction in cost while improving performance by about 2.5x, as demonstrated in Figure 5.

Figure 5: Performance/Cost Improvement with Hierarchical Analysis

Conclusion

The shift toward EDA on the cloud is accelerating. By leveraging AWS ParallelCluster cloud infrastructure, organizations can dynamically allocate resources to meet fluctuating demands and optimize compute usage, which reduces the need for expensive on-premises hardware. Moreover, AWS HPC infrastructure provides advanced computing capabilities that help teams increase productivity and innovate, prototype, design, and verify complex systems on chip (SoCs) faster, resulting in a shorter time to market for their products.

Cadence and AWS collaborated to offer the optimal configuration for the Voltus solution to run on AWS. For Voltus PI workloads, AWS offers linear scalability, resulting in increased productivity and turnaround time (TAT) flexibility. AWS gives mutual Cadence customers access to on-demand, elastic compute capacity without requiring a significant investment in computer hardware. Thanks to Voltus-XM hierarchical analysis, significant performance gains at greatly reduced costs are possible on AWS.

For more information, contact your local Cadence or AWS account teams.

For more information on Cadence Voltus, go to https://www.cadence.com/en_US/home/tools/digital-design-and-signoff/silicon-signoff/voltus-ic-power-integrity-solution.html

For more information on EDA workloads on AWS, go to https://aws.amazon.com/semiconductor