AWS for Industries
The Business and Engineering Benefits of FSx for NetApp ONTAP for Product Engineering Workloads
In the competitive landscape of modern engineering, businesses continually seek ways to enhance productivity, streamline operations, and reduce costs. Product engineering involves complex simulations and computations, requiring robust and efficient data storage solutions for Computer Aided Engineering (CAE) applications. FSx for NetApp ONTAP, a fully managed service that provides high-performance, scalable file storage, addresses all these challenges effectively.
This blog post explores the business and engineering benefits of using FSx for NetApp ONTAP and its integration with Scale-Out Computing on AWS (SOCA) for CAE simulations.
CAE Product Lifecycle and Data Transfer Challenges
Multi-protocol access is crucial in engineering environments where different stages of the product lifecycle require different operating systems. Typically, initial design and modeling occur on Windows, while the computational fluid dynamics (CFD) or finite element analysis (FEA) simulations run on a High Performance Computing (HPC) cluster running Linux. FSx for NetApp ONTAP supports both Linux and Windows clients, so a single filesystem can cover the whole product lifecycle.
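As a rough illustration of multi-protocol access, the sketch below builds the commands a Linux node and a Windows desktop might use to reach the same volume, assuming it is exported over NFS and shared over SMB. The DNS name, junction path, share name, mount point, and drive letter are hypothetical placeholders, and the helpers only build command strings; they do not run anything.

```python
def linux_nfs_command(svm_dns: str, junction_path: str, mount_point: str) -> str:
    """Build a standard Linux NFS mount command for the shared volume."""
    return f"sudo mount -t nfs {svm_dns}:{junction_path} {mount_point}"


def windows_smb_command(svm_dns: str, share: str, drive: str) -> str:
    """Build a standard Windows 'net use' command mapping the SMB share to a drive."""
    return f"net use {drive} \\\\{svm_dns}\\{share}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    svm = "svm.example.com"
    print(linux_nfs_command(svm, "/cae", "/fsx"))
    print(windows_smb_command(svm, "cae", "Z:"))
```

Both commands point at the same data, which is why results written by the Linux simulation nodes can be opened from the Windows desktop without a copy step.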
Modern CAE systems require access to a large number of CPU cores and a large amount of memory because of the complexity and scale of current engineering projects. These systems handle intricate models that involve millions of geometric elements and need extensive computational power to accurately simulate properties such as stress, strain, or heat distribution. Additionally, parametric simulations, which run multiple iterations with varying parameters to optimize design performance and efficiency, significantly increase the computational load, which makes traditional end-user workstations unsuitable for these workloads. Because of these challenges, offloading the simulation part of the CAE product lifecycle to a separate HPC cluster has become a necessity.
Using computational resources in the cloud, supported by fast replication capabilities such as data caching for geographically distributed resources, is a key enabler to scale simulation workloads seamlessly and push the limits of high-end CAE.
Figure 1: Typical scenario of a Computer Aided Engineering (CAE) lifecycle
CAE simulations generate a large number of output artifacts, each of which may exceed 50 GB. These files have to be transferred back and forth between the engineer's workstation and the HPC cluster, resulting in significant wait times that depend on network performance and output size. Seamless real-time access to simulation results is crucial for engineers to quickly validate their Computer Aided Design (CAD) models.
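To get a feel for the wait time involved, here is a back-of-the-envelope sketch of how long a single artifact takes to copy over a network link. The link speeds and the `efficiency` factor are illustrative assumptions, not measurements from the tests described in this post, and sizes use decimal units (1 GB = 8 Gbit).

```python
def transfer_seconds(size_gb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Estimate the one-way transfer time of a file, in seconds.

    size_gb:    file size in gigabytes (decimal)
    link_gbps:  nominal link speed in gigabits per second
    efficiency: assumed fraction of the nominal speed actually achieved (0 < e <= 1)
    """
    if size_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link speed > 0, efficiency in (0, 1]")
    size_gbit = size_gb * 8  # bytes to bits
    return size_gbit / (link_gbps * efficiency)


if __name__ == "__main__":
    # A 50 GB artifact over hypothetical 1 Gbps and 10 Gbps links.
    for gbps in (1, 10):
        seconds = transfer_seconds(50, gbps)
        print(f"{gbps} Gbps: {seconds / 60:.1f} minutes one way")
```

Even under ideal conditions, a 50 GB file needs 400 seconds on a 1 Gbps link, and a simulation that produces many such files multiplies that wait.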
Figure 2: Test Procedure
In this blog post, we compare the total duration of a standard CAE simulation pipeline in two configurations: one using FSx for NetApp ONTAP as a multi-protocol filesystem, which avoids data transfer between Windows and Linux, and one using a traditional parallel filesystem that cannot be mounted directly on Windows.
Architecture Overview
The following diagram displays a high-level architecture with the associated AWS services.
Figure 3: High-level architecture overview
The first step in the architecture is for the CAE engineers to access their SOCA environment via the built-in web interface. From there, they provision their Windows Virtual Desktop and prepare OpenFOAM models.
Once ready, the OpenFOAM simulation can be submitted directly from the Windows Virtual Desktop or via the web interface portal provided by SOCA.
The job is then queued by the OpenPBS orchestrator, while the number of nodes and the relevant hardware requirements are automatically calculated by the SOCA automation tools. Once all job resources have been determined, a new CloudFormation stack is created and the ephemeral EC2 capacity required to run the job is provisioned.
To compare total simulation time, we mount two filesystems on each HPC compute node:
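The node-count step can be pictured with a small sketch. This is not SOCA's actual logic, only an assumed ceiling division of the requested CPU count by the cores available per node; the function name and the per-node core counts are hypothetical.

```python
def nodes_required(cpus_requested: int, cores_per_node: int) -> int:
    """Return the smallest number of nodes that provides the requested CPUs."""
    if cpus_requested <= 0 or cores_per_node <= 0:
        raise ValueError("both arguments must be positive integers")
    # Integer ceiling division avoids floating-point rounding issues.
    return (cpus_requested + cores_per_node - 1) // cores_per_node


if __name__ == "__main__":
    # A 288-CPU job on hypothetical instance sizes with 36 or 64 cores per node.
    for cores in (36, 64):
        print(f"{cores} cores per node: {nodes_required(288, cores)} nodes")
```

With 36 cores per node the request fits exactly on 8 nodes, while 64 cores per node needs 5 nodes and leaves some cores idle, which is the kind of trade-off an automated sizing step has to resolve.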
- FSx for NetApp ONTAP that can be directly accessed by both Windows Virtual Desktop and Linux Simulation Nodes
- Standard Parallel Filesystem that can only be accessed by the Linux Simulation Nodes
Once the simulation is complete, a script will automatically copy the output artifact from the Standard Parallel Filesystem back to the Windows Virtual Desktop. The total simulation duration is then calculated via:
- Simulation Time (for FSx for NetApp ONTAP)
- Simulation Time + Transfer Data To and From (for Standard Filesystem)
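The two formulas above can be sketched as follows. The input durations are hypothetical values chosen for illustration, not the measurements from our tests, and the function names are our own.

```python
def total_duration(simulation_min: float,
                   transfer_in_min: float = 0.0,
                   transfer_out_min: float = 0.0) -> float:
    """Total pipeline duration in minutes: simulation plus any data transfers."""
    return simulation_min + transfer_in_min + transfer_out_min


def reduction_percent(baseline_min: float, improved_min: float) -> float:
    """Relative time saved versus the baseline, as a percentage."""
    if baseline_min <= 0:
        raise ValueError("baseline must be positive")
    return 100 * (baseline_min - improved_min) / baseline_min


if __name__ == "__main__":
    # Hypothetical inputs: a 30-minute simulation, 5 minutes to upload inputs,
    # and 15 minutes to copy results back to the Windows desktop.
    shared = total_duration(30)            # multi-protocol filesystem: no transfer
    standard = total_duration(30, 5, 15)   # Linux-only filesystem: transfer both ways
    print(f"shared: {shared} min, standard: {standard} min")
    print(f"reduction: {reduction_percent(standard, shared):.1f}%")
```

The sketch assumes the simulation itself takes the same time on both filesystems, so the whole difference comes from the transfer terms.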
Conclusion
Based on the tests, leveraging FSx for NetApp ONTAP resulted in a time reduction of up to 32%.
For an OpenFOAM simulation using 288 CPUs and a 300M mesh model, the total duration, from job submission to results visualization, is 37 minutes using FSx for NetApp ONTAP, compared to 56 minutes for a standard Linux-only filesystem. We also noticed that the time reduction increased with the complexity of the model, as complex simulations tend to output larger files than simpler ones.
In addition to the significant time savings, EC2 compute resources are used more efficiently, leading to a similar level of cost reduction, because you do not have to manage a separate filesystem. This gives engineers more freedom to further optimize product designs.
Part of the budget can then be allocated to implementing disruptive methods that leverage, for example, generative AI during the 3D design phase (generative design) or for assisted model preparation (automated meshing).