AWS HPC Blog
Running finite element analysis using Simcenter Nastran on AWS
This post was written by Dnyanesh Digraskar, Sr. Partner Solutions Architect for HPC at AWS, and co-authored by Wei Zhang and Ravi Gupta, Sr. Software Engineers for Simcenter Nastran at Siemens.
Introduction
In this blog post, we demonstrate the deployment, performance, and price comparisons of Simcenter Nastran for three finite element analysis (FEA) use cases on Amazon Web Services (AWS) high performance computing (HPC) clusters. Simcenter Nastran is an FEA application by Siemens, used across engineering disciplines such as aerospace, automotive, electronics, and medical devices to solve problems in linear and nonlinear structural analysis, acoustics, aeroelasticity, thermal analysis, and optimization.
FEA simulations are compute- and memory-intensive, and engineers have traditionally run these simulations using on-premises workstations. However, using on-premises resources can limit engineers in terms of available capacity and scale. By running these types of workloads on AWS, engineers can take advantage of on-demand pricing, elastic capacity, and the latest technology to help them maximize their investment in FEA codes, without the challenges of managing on-premises infrastructure.
For this blog post, we perform FEA simulations using Simcenter Nastran on models consisting of 2–6 million Degrees of Freedom (DoF). We run the analyses on memory-optimized (R5) and compute-optimized (C5) Amazon Elastic Compute Cloud (Amazon EC2) instances and highlight the instance with the best price-performance for each model. We also discuss the cluster architecture and the recommended AWS services.
Overview: Solution architecture
Cluster configuration
The Simcenter Nastran application is Message Passing Interface (MPI) enabled, and runs on an HPC system built from the AWS services highlighted below. FEA analyses involve large matrix operations, and hence typically require hardware with high clock speed processors, large memory, and high I/O throughput. For that reason, we have chosen Amazon EC2 memory-optimized R5 and compute-optimized C5 On-Demand Instances for this workload. Amazon EC2 R5d instances, with up to 32 physical cores, are based on 3.1 GHz Intel Skylake-SP or Cascade Lake processors. Amazon EC2 c5d.18xlarge instances, with 36 physical cores, are based on 3.4 GHz Intel Skylake-SP processors. These instances are powered by the AWS Nitro System, an advanced hypervisor technology, and support the high memory requirements of FEA solvers, resulting in increased performance and reduced latency. An Amazon EBS volume with 50 GB of storage is attached to the head node for storing the application files.
Deployment of Simcenter Nastran on AWS is described in the following architecture diagram:
AWS services and HPC solution components
AWS ParallelCluster, an AWS-supported open-source cluster management tool, is used to deploy and manage HPC clusters. You specify the desired cluster components, such as instance types and storage, in a single text file, and can deploy an HPC cluster within 15 minutes. Version 2.10 is used for this blog post.
AWS Cloud9, a cloud-based integrated development environment (IDE) that lets you collaborate to write, run, and debug code in a browser, is used for securely accessing the HPC cluster via secure shell (SSH). Note that any IDE should work for this purpose.
Amazon Elastic Block Store (Amazon EBS) is an easy-to-use, durable, high-performance block-storage service that suits FEA workloads, which are both throughput and data intensive.
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. Simulation inputs and results are transferred between the HPC cluster and Amazon S3 buckets, where results are stored for further use or for archival to Amazon S3 Glacier.
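To make the "single text file" concrete, the following is a minimal sketch that uses Python's standard configparser module to generate an INI-style cluster definition of the kind ParallelCluster 2.x reads. It is not the configuration used for the results in this post. The section and key names follow our understanding of the ParallelCluster 2.x format and should be checked against the documentation for your version. The Slurm scheduler, the c5.xlarge head node, the /apps mount point, the key pair name, and the VPC and subnet IDs are all illustrative assumptions. Only the 50 GB EBS volume, the r5d.8xlarge compute instances, the us-east-1 Region, and disabled hyper-threading come from the text.

```python
import configparser
import io


def build_config(compute_instance="r5d.8xlarge", max_nodes=2):
    """Build an illustrative ParallelCluster 2.x style configuration."""
    cfg = configparser.ConfigParser()
    cfg["aws"] = {"aws_region_name": "us-east-1"}
    cfg["global"] = {"cluster_template": "default"}
    cfg["cluster default"] = {
        "key_name": "my-key-pair",            # placeholder, replace with your key pair
        "base_os": "alinux2",                 # assumption
        "scheduler": "slurm",                 # assumption, the post does not name one
        "master_instance_type": "c5.xlarge",  # assumption for the head node
        "vpc_settings": "default",
        "ebs_settings": "apps",
        "queue_settings": "fea",
    }
    cfg["vpc default"] = {
        "vpc_id": "vpc-REPLACE_ME",           # placeholder
        "master_subnet_id": "subnet-REPLACE_ME",  # placeholder
    }
    # 50 GB volume on the head node for the application files (from the post).
    cfg["ebs apps"] = {
        "shared_dir": "/apps",                # assumption
        "volume_type": "gp2",                 # assumption
        "volume_size": "50",
    }
    # Hyper-threading is disabled for all runs described in the post.
    cfg["queue fea"] = {
        "compute_resource_settings": "feanodes",
        "disable_hyperthreading": "true",
    }
    cfg["compute_resource feanodes"] = {
        "instance_type": compute_instance,
        "min_count": "0",
        "max_count": str(max_nodes),
    }
    return cfg


def render(cfg):
    """Return the configuration as INI text."""
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


print(render(build_config()))
```

Generating the file from a function makes it easy to emit one variant per instance type, for example build_config("c5d.18xlarge", 1), when comparing price and performance across instances.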
Now that you have an overview of the HPC cluster components and the AWS services used, let us review Simcenter Nastran’s performance on Amazon EC2 instances in terms of turnaround times and incurred costs.
Analysis: Application performance and associated costs
The test cases described in this blog post include 0.4–5.4 million elements and 2.2–6.0 million DoF, which covers the typical range of problem sizes analyzed by Simcenter Nastran customers. Cases are run on 16 to 36 physical cores, on single and multiple nodes of Amazon EC2 R5d and C5d On-Demand Instances, with hyper-threading disabled. Parallel processing is performed using a combination of Shared Memory Parallel (SMP) and Distributed Memory Parallel (DMP) algorithms with both in-core and out-of-core settings. SMP refers to a single machine with multiple processors sharing common memory, while DMP refers to multiple processes, possibly on multiple machines, communicating over MPI. Performance results are summarized in the charts below. Case turnaround time in hours (blue bars) is plotted on the left vertical axis, and the simulation cost (solid orange curve) is plotted on the right vertical axis. The instances used for running the test cases are shown on the horizontal axis.
Test case 1 involves computing mode sets for a block of solid elements. The model has 5.4 million elements and 6.0 million DoF, and is computed with the Lanczos method. The best performance was achieved on an Amazon EC2 r5d.8xlarge instance with SMP = 16 (Figure 2).
Test case 2 involves analyzing vibro-acoustic response with structural force excitation. The model has about 2.2 million DoF. The best performance was achieved on Amazon EC2 r5d.8xlarge instances with SMP = 4 and DMP = 8 on two nodes (Figure 3).
Test case 3 involves a non-linear static analysis for a pump model. The model has 0.4 million elements and about 4.6 million DoF. There are two subcases: the first calculates the bolt preload, and the second is the non-linear static analysis based on the bolt preload result. The best performance occurred on Amazon EC2 c5d.18xlarge instances (Figure 4).
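The SMP and DMP settings quoted for test cases 1 and 2 map onto physical cores as sketched below. This is an illustrative calculation, not part of Simcenter Nastran. It assumes one SMP thread per physical core (hyper-threading is disabled in these runs) and DMP processes spread evenly across nodes. The function name and the DMP = 1 value for test case 1 are our assumptions.

```python
def parallel_layout(smp, dmp, nodes):
    """Return core usage for a hybrid SMP x DMP run.

    smp   -- threads per DMP process (shared memory)
    dmp   -- number of MPI processes (distributed memory)
    nodes -- number of instances the DMP processes are spread over
    """
    if smp < 1 or dmp < 1 or nodes < 1:
        raise ValueError("smp, dmp, and nodes must all be positive")
    if dmp % nodes != 0:
        raise ValueError("this sketch assumes DMP processes divide evenly across nodes")
    dmp_per_node = dmp // nodes
    return {
        "total_cores": smp * dmp,
        "dmp_per_node": dmp_per_node,
        "cores_per_node": smp * dmp_per_node,
    }


# Test case 1: SMP = 16 on one r5d.8xlarge (DMP = 1 is assumed).
print(parallel_layout(smp=16, dmp=1, nodes=1))
# Test case 2: SMP = 4 and DMP = 8 on two r5d.8xlarge nodes.
print(parallel_layout(smp=4, dmp=8, nodes=2))
```

Under these assumptions, both configurations fill all 16 physical cores of each r5d.8xlarge node, giving 16 and 32 cores in total.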
Additional information
While the test cases described above represent the typical problem sizes run by our customers, we also performed FEA simulations on a relatively large model representing an airplane wing with 148 million elements and 600 million DoF. We selected an Amazon EC2 high-memory instance, the x1.32xlarge with 64 physical cores and 1.9 TB of memory, for this workload. For this large case, Simcenter Nastran ran about 12% faster on the x1.32xlarge instance than on an Amazon EC2 r5d.16xlarge instance. Given the considerably higher instance cost, we recommend reserving the x1.32xlarge instance for urgent, time-sensitive simulations.
Simulation costs highlighted in this post reflect the On-Demand Instance costs in the N. Virginia (us-east-1) Region, and you should consider license costs separately. We also performed simulations on Amazon EC2 Spot Instances, which let us take advantage of unused capacity at discounts of up to 90% compared to On-Demand Instance pricing. We encourage you to run non-critical workloads on Spot Instances to take advantage of the substantially lower price.
The following table summarizes the HPC cluster, case details, elapsed time, and incurred costs for running the test cases described in this blog:
Case Size (DoF) | Amazon EC2 Instance | Number of Nodes | Number of Physical Cores | Memory (GiB) | Instance Storage (GiB) | Elapsed Time (hrs) | Instance Cost for Simulation ($) |
5,976,792 | r5d.8xlarge | 1 | 16 | 256 | 500 | 2.09 | 4.82 |
2,186,107 | r5d.8xlarge | 2 | 32 | 512 | 100 | 0.20 | 0.96 |
4,560,102 | c5d.18xlarge | 1 | 36 | 144 | 500 | 0.54 | 1.96 |
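As a rough cross-check of the table, the sketch below derives three quantities from the published rows: the implied cost per node-hour (cost divided by nodes times elapsed hours), the core-hours consumed, and a lower bound on cost if the maximum quoted Spot discount of 90% applied. The helper names are ours. Because the elapsed times are rounded to two decimals, the implied rates are approximate and are not a price list. Actual Spot discounts vary and are often smaller than the maximum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    dof: int
    instance: str
    nodes: int
    cores: int            # total physical cores across all nodes
    elapsed_hours: float
    cost_usd: float       # On-Demand instance cost from the table


# Rows copied from the summary table.
RUNS = [
    Run(5_976_792, "r5d.8xlarge", 1, 16, 2.09, 4.82),
    Run(2_186_107, "r5d.8xlarge", 2, 32, 0.20, 0.96),
    Run(4_560_102, "c5d.18xlarge", 1, 36, 0.54, 1.96),
]


def implied_rate(run):
    """Cost per node-hour implied by the table row (approximate)."""
    return run.cost_usd / (run.nodes * run.elapsed_hours)


def core_hours(run):
    """Physical core-hours consumed by the run."""
    return run.cores * run.elapsed_hours


def spot_cost_floor(run, max_discount=0.90):
    """Lowest possible cost if the full quoted Spot discount applied."""
    return run.cost_usd * (1.0 - max_discount)


for run in RUNS:
    print(
        f"{run.instance:>13} x{run.nodes}: "
        f"{implied_rate(run):.2f} $/node-hr, "
        f"{core_hours(run):.2f} core-hr, "
        f"Spot floor ${spot_cost_floor(run):.2f}"
    )
```

Comparing core-hours rather than elapsed time alone shows how much compute each analysis consumed, which is useful when deciding whether adding nodes shortens turnaround enough to justify the cost.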
Summary
Engineers can efficiently perform Simcenter Nastran FEA simulations on Amazon EC2 instances, and take full advantage of the elasticity, scale, advanced technology, and on-demand usage of the AWS Cloud.
This post demonstrates the price and performance characteristics of an HPC cluster running FEA simulations on three real-world problems using Siemens' commercial software Simcenter Nastran. Amazon EC2 r5d.8xlarge is the best-performing instance for the test cases involving modal and vibro-acoustic analyses, and Amazon EC2 c5d.18xlarge is the best-performing instance for the test case involving non-linear static analysis. AWS ParallelCluster is used as the orchestration tool for deploying the HPC cluster and submitting jobs.
If you are interested in running FEA or other HPC workloads on AWS, more information can be found on the AWS HPC solutions page. For detailed HPC-specific best practices based on the five pillars of the AWS Well-Architected Framework, download the whitepaper, High Performance Computing Lens.