[SEO Subhead]
This Guidance shows how to use AWS services to host generation interconnection simulations, such as production cost modeling. Due to the variability and unpredictability of renewable energy sources, integrating them into the grid requires considerable analysis. While many simulation tools aid in grid planning, they often run on local servers, limiting their performance for increasingly complex simulations. By hosting simulations on scalable and reliable AWS infrastructure, you can reduce the run time of complex simulations, avoid interruptions and restarts, and meet dynamic demand to accelerate your renewable energy transition.
Please note: [Disclaimer]
Architecture Diagram
[Architecture diagram description]
Step 1
Use AWS Amplify to build a simple, full-stack web application and authenticate users with Amazon Cognito. Upload and download data stored in Amazon Simple Storage Service (Amazon S3). Invoke an AWS Lambda function to preprocess input data for the generation interconnection simulation and start an AWS Step Functions workflow.
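As a minimal sketch of how this hand-off could work (the environment variable, bucket, and key below are placeholder assumptions, not values defined by this Guidance), a Lambda function might read the uploaded input file from Amazon S3 and start a Step Functions execution with boto3:

```python
import json
import os

import boto3

s3 = boto3.client("s3")
sfn = boto3.client("stepfunctions")

# Placeholder environment variable assumed to be set on the Lambda function.
STATE_MACHINE_ARN = os.environ["SIMULATION_STATE_MACHINE_ARN"]


def handler(event, context):
    """Preprocess a simulation input file, then start the Step Functions workflow."""
    bucket = event["bucket"]  # S3 bucket used by the Amplify application (placeholder)
    key = event["key"]        # uploaded simulation input file

    # Minimal "preprocessing": fetch the object and check that it is non-empty.
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    if not body:
        raise ValueError(f"Input file s3://{bucket}/{key} is empty")

    # Hand off to the workflow, passing the input location as execution input.
    execution = sfn.start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=json.dumps({"bucket": bucket, "key": key}),
    )
    return {"executionArn": execution["executionArn"]}
```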
Step 2
Use Step Functions to create a workflow that submits the simulation job, automates job batching, and monitors job status.
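For illustration, a minimal workflow definition might use the Step Functions optimized integration with AWS Batch to submit a job and wait for completion. In the sketch below, the state machine name, job queue, job definition, and IAM role ARN are placeholders, not values defined by this Guidance:

```python
import json

import boto3

sfn = boto3.client("stepfunctions")

# Minimal Amazon States Language definition: submit a simulation job to AWS Batch
# and wait for it to finish (the .sync integration handles status monitoring).
definition = {
    "StartAt": "SubmitSimulationJob",
    "States": {
        "SubmitSimulationJob": {
            "Type": "Task",
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "generation-interconnection-sim",
                "JobQueue": "hpc-simulation-queue",     # placeholder queue name
                "JobDefinition": "simulation-job-def",  # placeholder job definition
                "Parameters": {"inputKey.$": "$.key"},  # pass the S3 key to the job
            },
            "End": True,
        }
    },
}

response = sfn.create_state_machine(
    name="GenerationInterconnectionWorkflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsBatchRole",  # placeholder
)
print(response["stateMachineArn"])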
Step 3
Configure AWS ParallelCluster with the required software and dependencies to run generation interconnection simulations with job schedulers. Administrators can interact with the high performance computing (HPC) cluster using the pcluster command line interface (CLI) and the ParallelCluster UI (available from ParallelCluster version 3.5.0). NICE DCV is also included in ParallelCluster.
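As a rough sketch only, a minimal ParallelCluster 3 configuration with a Slurm scheduler could be generated as follows; the Region, subnet IDs, key pair, and instance types are placeholder assumptions you would replace with your own values:

```python
import yaml  # PyYAML

# Minimal ParallelCluster 3 configuration with a Slurm scheduler. Every identifier
# below (Region, subnets, key pair, instance types) is a placeholder.
cluster_config = {
    "Region": "us-east-1",
    "Image": {"Os": "alinux2"},
    "HeadNode": {
        "InstanceType": "c5.xlarge",
        "Networking": {"SubnetId": "subnet-0123456789abcdef0"},
        "Ssh": {"KeyName": "my-key-pair"},
    },
    "Scheduling": {
        "Scheduler": "slurm",
        "SlurmQueues": [
            {
                "Name": "simulation",
                "ComputeResources": [
                    {
                        "Name": "compute",
                        "InstanceType": "c7i.4xlarge",
                        "MinCount": 0,   # scale to zero when no jobs are queued
                        "MaxCount": 16,
                    }
                ],
                "Networking": {"SubnetIds": ["subnet-0123456789abcdef0"]},
            }
        ],
    },
}

with open("cluster-config.yaml", "w") as f:
    yaml.safe_dump(cluster_config, f, sort_keys=False)

# The resulting file can then be passed to the pcluster CLI, for example:
#   pcluster create-cluster --cluster-name sim-cluster \
#       --cluster-configuration cluster-config.yaml
```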
Step 4
Use a job scheduler with a built-in queue to optimize generation interconnection simulation tasks based on the job attributes (such as the number of tasks or priority) and the compute environment. AWS Batch and Slurm are natively supported; alternatives include Terascale Open-Source Resource and Queue Manager (TORQUE) and HTCondor.
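If AWS Batch is the scheduler, a batched submission might look like the following sketch, where the queue and job definition names are placeholders and the array size stands in for the number of simulation scenarios:

```python
import boto3

batch = boto3.client("batch")

# Illustrative only: the job queue and job definition names are placeholders.
# An array job fans a single submission out into many simulation tasks, and the
# scheduler places them based on priority and the compute environment.
response = batch.submit_job(
    jobName="interconnection-study",
    jobQueue="hpc-simulation-queue",
    jobDefinition="simulation-job-def",
    arrayProperties={"size": 50},  # one child job per simulation scenario
    containerOverrides={
        "resourceRequirements": [
            {"type": "VCPU", "value": "16"},
            {"type": "MEMORY", "value": "32768"},
        ]
    },
)
print(response["jobId"])
```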
Step 5
Schedulers distribute jobs across multiple nodes of a compute fleet. Amazon EC2 Auto Scaling is configured to scale compute capacity dynamically according to the number of jobs scheduled. Compute-optimized instances are recommended for the compute nodes (for example, Amazon EC2 C7i or C7a instances).
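In practice, ParallelCluster and the scheduler manage this scaling for you; the sketch below only illustrates the underlying EC2 Auto Scaling call that sizes the compute fleet to a job backlog, with a placeholder Auto Scaling group name:

```python
import boto3

autoscaling = boto3.client("autoscaling")


def scale_compute_fleet(asg_name: str, pending_jobs: int, max_nodes: int = 16) -> None:
    """Size the compute fleet to the current job backlog (illustrative only)."""
    desired = min(pending_jobs, max_nodes)
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=asg_name,  # placeholder Auto Scaling group name
        DesiredCapacity=desired,
        HonorCooldown=True,
    )


scale_compute_fleet("sim-compute-fleet", pending_jobs=12)
```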
Step 6
Use Amazon FSx for NetApp ONTAP or Amazon FSx for OpenZFS as a high-performance file system to process and store intermediate results generated by the generation interconnection simulation software. Amazon S3 can be used to store the output files.
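The following sketch illustrates this split, assuming the FSx file system is mounted at /fsx on the compute nodes and that an output bucket named sim-results exists (both are placeholder assumptions):

```python
from pathlib import Path

import boto3

s3 = boto3.client("s3")

# Placeholder assumptions: the FSx file system is mounted at /fsx on the compute
# nodes, and "sim-results" is the S3 bucket used for final outputs.
SCRATCH_DIR = Path("/fsx/scratch/scenario-042")
OUTPUT_BUCKET = "sim-results"


def store_results(intermediate: bytes, final_report: Path) -> None:
    # Intermediate results stay on the shared, low-latency FSx file system...
    SCRATCH_DIR.mkdir(parents=True, exist_ok=True)
    (SCRATCH_DIR / "intermediate.bin").write_bytes(intermediate)

    # ...while the final output files are persisted to Amazon S3.
    s3.upload_file(str(final_report), OUTPUT_BUCKET, f"outputs/{final_report.name}")
```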
Step 7
Use AWS DataSync to move a selected portion of data from Amazon FSx to Amazon S3 for output visualization.
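Assuming a DataSync task has already been configured with the FSx file system as its source location and the S3 bucket as its destination, a filtered transfer could be started as in this sketch (the task ARN and filter pattern are placeholders):

```python
import boto3

datasync = boto3.client("datasync")

# Placeholder task ARN: the DataSync task is assumed to be configured with the FSx
# file system as its source location and the S3 output bucket as its destination.
TASK_ARN = "arn:aws:datasync:us-east-1:123456789012:task/task-0123456789abcdef0"

# Transfer only the result files needed for visualization rather than the whole
# file system.
response = datasync.start_task_execution(
    TaskArn=TASK_ARN,
    Includes=[{"FilterType": "SIMPLE_PATTERN", "Value": "/results/*"}],
)
print(response["TaskExecutionArn"])
```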
Step 8
Use EC2 Image Builder and predefined AWS CloudFormation templates to manage the images for the cluster head node and compute nodes, enabling continuous integration and continuous delivery (CI/CD).
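As one possible CI/CD hook, an image pipeline execution could be triggered programmatically whenever the simulation software or its dependencies change; the pipeline ARN below is a placeholder:

```python
import uuid

import boto3

imagebuilder = boto3.client("imagebuilder")

# Placeholder pipeline ARN: the pipeline itself is assumed to be created by the
# predefined CloudFormation templates for the head node and compute node images.
PIPELINE_ARN = (
    "arn:aws:imagebuilder:us-east-1:123456789012:image-pipeline/hpc-compute-node"
)

# Start a new image build, for example from a CI/CD stage that runs after the
# simulation software or its dependencies change.
response = imagebuilder.start_image_pipeline_execution(
    imagePipelineArn=PIPELINE_ARN,
    clientToken=str(uuid.uuid4()),  # idempotency token
)
print(response["imageBuildVersionArn"])
```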
Step 9
Use Amazon Simple Notification Service (Amazon SNS) and Amazon CloudWatch to monitor the cluster and notify users of simulation job status changes, such as job start and completion.
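For example, a Lambda function subscribed to job state-change events could publish notifications to an SNS topic; the event fields and topic ARN in this sketch assume an AWS Batch "Job State Change" event and are placeholders:

```python
import boto3

sns = boto3.client("sns")

# Placeholder ARN for the topic that users subscribe to for job notifications.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:simulation-job-status"


def handler(event, context):
    """Publish a notification when a simulation job changes state.

    The event fields below assume an AWS Batch "Job State Change" event delivered
    through EventBridge; adapt them if a different scheduler or source is used.
    """
    detail = event.get("detail", {})
    job_name = detail.get("jobName", "unknown-job")
    status = detail.get("status", "UNKNOWN")

    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject=f"Simulation job {job_name}: {status}",
        Message=f"Job {job_name} transitioned to status {status}.",
    )
```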
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
Amplify lets you quickly and securely set up and manage a serverless UI for the HPC cluster, and Step Functions helps you visualize and control the workflow that orchestrates job steps. CloudWatch monitors the cluster’s performance through collected metrics, giving you insight into its operation. And by using CloudFormation, you can provision the environment as infrastructure as code, limiting human error and increasing the consistency of event responses. All of these services are fully managed by AWS.
-
Security
Cognito provides frictionless customer identity and access management for the frontend and supports user pools as well as federated login and access. Federated access lets you use existing identities and permissions and provide a uniform user experience with the same level of security used across the rest of your company. By scoping AWS Identity and Access Management (IAM) policies according to the least-privilege principle, you can limit unauthorized access to resources.
-
Reliability
EC2 Auto Scaling distributes Amazon EC2 instances evenly across multiple Availability Zones (AZs) to increase fault tolerance and availability. It can detect when an instance is unhealthy, terminate it, and launch a replacement. Additionally, if one AZ becomes unavailable, EC2 Auto Scaling can launch instances in another AZ to compensate. Amazon FSx, which supports the HPC application’s high input/output operations per second (IOPS) and large throughput requirements, can also be deployed across multiple AZs, providing enhanced durability by synchronously replicating data between them. It also enhances availability during both planned system maintenance and unplanned service disruption by failing over automatically to the standby AZ. This protects data against instance failure and AZ disruption. Finally, Amazon S3 provides persistent and reliable storage for input and output data.
-
Performance Efficiency
ParallelCluster uses an EC2 Auto Scaling group to spin instances up and down to meet demand dynamically, ensuring that resources are right-sized for the workload. Amazon FSx can process massive data sets with hundreds of gigabytes per second of throughput, millions of IOPS, and sub-millisecond latencies.
-
Cost Optimization
Step Functions and Lambda help minimize costs through their event-driven pattern: no costs are incurred when no jobs are submitted. Additionally, ParallelCluster uses an EC2 Auto Scaling group to launch only the instances needed, avoiding idle and wasted resources. ParallelCluster uses an EC2 Auto Scaling launch template to start instances for submitted jobs, and you can choose the most cost-effective instance type based on your performance benchmarking and resource utilization. CloudWatch monitors usage and delivers logs and insights that can help you right-size your fleet instances and operate cost-aware workloads.
-
Sustainability
Amazon S3 Intelligent-Tiering monitors access patterns and moves objects among tiers automatically, balancing cost, energy use, and access efficiency. EC2 Auto Scaling helps you dynamically scale the compute fleet of the HPC cluster to avoid idle resources, resulting in a more efficient and sustainable solution. Additionally, Step Functions and Lambda operate only in response to job submissions and don’t run during the HPC cluster’s idle time, thereby reducing the required resources and decreasing the environmental impact of your workloads.
Implementation Resources
A detailed guide is provided to experiment and use within your AWS account. It walks through each stage of the Guidance, including deployment, usage, and cleanup, so you can prepare it for deployment in your environment.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Related Content
[Title]
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.