Maxar Uses AWS to Deliver Forecasts 58% Faster Than Weather Supercomputer

2020

When weather threatens drilling rigs, refineries, and other energy facilities, oil and gas companies want to move fast to protect personnel and equipment. And for firms that trade commodity shares in oil, precious metals, crops, and livestock, the weather can significantly impact their buy-sell decisions. To limit damage, these companies need the earliest possible notice before a major storm strikes. That’s the challenge Maxar Technologies set out to solve.

Historically, many industries have relied on reports generated by the on-premises supercomputer operated by the National Oceanic and Atmospheric Administration (NOAA). However, that supercomputer takes an average of 100 minutes to process global data into weather predictions. Over time, many companies began to realize they would require much faster weather warnings to protect their interests. Similar to how NASA has expanded its partnerships with private firms to acquire commercial space hardware and services, the processing and delivery of critical weather data products could also be effectively commercialized.

“With the fast networking speed provided by AWS, we accomplished what many IT experts considered impossible.”

Stefan Cecelski
Data Scientist, Maxar Technologies

Accelerating Forecast Delivery

To resolve this issue, Maxar sought to significantly reduce the time needed to generate numerical weather predictions. Its data scientists, engineers, and DevOps team decided to build a high performance computing (HPC) solution to deliver forecasts in half the time of the NOAA supercomputer. “We first considered an effort that would involve building the system in an on-premises data center,” says Travis Hartman, director of analytics and weather at Maxar. “But we realized we needed a cloud environment to build a cost-effective solution that our DevOps team could easily manage and which would allow us to significantly reduce our timeline to get the results to market.”

So Maxar turned to Amazon Web Services (AWS). “We knew HPC on AWS could provide an environment that balances performance, cost, and manageability,” Hartman says. “The key AWS capabilities we wanted to leverage for our numerical weather prediction application included automatic environment builds and shutdowns, elastic compute resources, the necessary networking bandwidth to crunch the numbers quickly, and the ability to do so with the velocity required by our business and customer goals.”

Cloud HPC Achieves the “Impossible”

Maxar worked with AWS to create an HPC solution that includes four key technologies. The company relies on Amazon Elastic Compute Cloud (Amazon EC2) for highly secure, resizable compute resources and the ability to configure capacity with minimal friction. Maxar also uses the Elastic Fabric Adapter (EFA) network interface to run its application with a hardware bypass interface that speeds up inter-instance communications. To complement the enhanced computing and networking, the application uses Amazon FSx for Lustre to accelerate the read/write throughput of the application. Maxar also takes advantage of AWS ParallelCluster, an open source cluster management tool that makes it easy to deploy HPC clusters with a simple text file that automatically models and provisions resources.

Initially, Maxar designed a cloud HPC cluster with 234 Amazon EC2 instances capable of producing a numerical weather prediction forecast in roughly 53 minutes, just about half the 100 minutes that the NOAA supercomputer takes to complete the same forecast. This met Maxar’s initial performance goal, so the team set its sights on refining the design to reduce cost.

Using EFA networking, Maxar reduced that cluster from 234 c5.18xlarge instances to just 156 c5n.18xlarge instances, a change driven by the ability of the C5n instances to communicate at 100 Gbps network speeds. The EFA interconnect also shortened the forecast time even further, from 53 to 42 minutes, a decrease of roughly 21 percent. The team’s new configuration can now produce a forecast 58 percent faster than NOAA’s supercomputer. Additional testing and optimization with AWS revealed Maxar could complete a forecast in under 30 minutes. With further system tuning, Maxar projects it can cut its processing time by an additional 25 percent.
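The percentages in this section follow from the run times and instance counts quoted above. The short Python sketch below recomputes them as a sanity check. The helper name `percent_reduction` and the constant names are illustrative choices, not anything Maxar or AWS published, and the only inputs are the figures stated in the text.

```python
# Figures quoted in the case study text.
NOAA_MINUTES = 100        # average NOAA supercomputer run time
INITIAL_MINUTES = 53      # first cloud cluster (234 x c5.18xlarge)
EFA_MINUTES = 42          # EFA-enabled cluster (156 x c5n.18xlarge)
INITIAL_INSTANCES = 234
EFA_INSTANCES = 156


def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) * 100 / before


# Forecast time relative to the NOAA supercomputer.
initial_vs_noaa = percent_reduction(NOAA_MINUTES, INITIAL_MINUTES)
efa_vs_noaa = percent_reduction(NOAA_MINUTES, EFA_MINUTES)

# Improvement of the EFA cluster over the first cloud cluster.
efa_vs_initial = percent_reduction(INITIAL_MINUTES, EFA_MINUTES)

# Reduction in the number of clustered instances.
fewer_instances = percent_reduction(INITIAL_INSTANCES, EFA_INSTANCES)

print(f"Initial cluster vs NOAA: {initial_vs_noaa:.1f}% faster")
print(f"EFA cluster vs NOAA:     {efa_vs_noaa:.1f}% faster")
print(f"EFA vs initial cluster:  {efa_vs_initial:.1f}% less time")
print(f"Instance count:          {fewer_instances:.1f}% fewer")
```

Measuring every reduction against its own starting value keeps the comparisons consistent: the 58 percent figure is relative to NOAA's 100 minutes, while the instance reduction is relative to the original 234-instance cluster.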

“Prior to using AWS, no one thought any cloud environment was capable of outperforming an on-premises supercomputer in generating numerical weather predictions,” says Stefan Cecelski, a data scientist at Maxar. “But with the fast networking speed provided by AWS, we accomplished what many IT experts considered impossible.”

Optimizing Compute Costs to Compete against a Free Service

Having achieved its performance goal, Maxar next focused on delivering the service profitably. Maxar needed to keep the cost of its weather application as low as possible to compete with the free, yet slower, service that NOAA provides. Maxar realized this objective by reducing the number of servers and optimizing the cost of the system—without negatively impacting performance. By using AWS ParallelCluster with Amazon EC2 C5n instances and EFA, Maxar generates the same computing power while decreasing the number of clustered servers by 33 percent.

The environment automatically spins up when weather data becomes available and then quickly shuts down until a new dataset is available, using numerous AWS services to orchestrate a highly scalable, redundant, and fault-tolerant workflow. The overall cost-optimization measures applied by AWS—including the integration of Amazon EC2 C5n instances with EFA—have enabled Maxar to reduce compute cost by approximately 45 percent. “We need the AWS compute resources for only about 45 minutes each day to run our numerical weather prediction application, so it is a huge benefit to have an AWS environment that we can use only when required,” says Cecelski.
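To illustrate why running the cluster only when needed matters, the Python sketch below compares the daily instance-hours of a 156-instance cluster that runs for about 45 minutes against the same cluster left on around the clock. This is a back-of-the-envelope illustration, not the way Maxar or AWS arrived at the 45 percent cost figure. It assumes a single 45-minute run per day, as in the quote above, and it leaves out instance prices because the text does not state them.

```python
# Figures quoted in the case study text.
INSTANCES = 156              # c5n.18xlarge instances in the EFA cluster
RUN_MINUTES_PER_DAY = 45     # approximate daily compute time, per the quote
MINUTES_PER_DAY = 24 * 60


def instance_hours(instances: int, minutes: float) -> float:
    """Return total instance-hours for `instances` running `minutes` each."""
    return instances * minutes / 60


# Cluster that spins up only for the daily forecast run.
on_demand_hours = instance_hours(INSTANCES, RUN_MINUTES_PER_DAY)

# Hypothetical comparison: the same cluster left running all day.
always_on_hours = instance_hours(INSTANCES, MINUTES_PER_DAY)

# Fraction of the always-on capacity that is actually used.
utilization = on_demand_hours / always_on_hours

print(f"Run-on-demand: {on_demand_hours:.0f} instance-hours per day")
print(f"Always-on:     {always_on_hours:.0f} instance-hours per day")
print(f"Share of always-on capacity used: {utilization:.1%}")
```

Under these assumptions the on-demand cluster consumes only a small fraction of the instance-hours an always-on cluster would, which is the benefit Cecelski describes.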

The comprehensive tools, utilities, and the overall AWS technology stack not only allowed Maxar to optimize the solution for cost and performance, but also to get to market more quickly. “In the past, it was typically cost-prohibitive for any non-government or non-academic entity to go through the procurement and investment activities to research, buy, build, configure, and then set up a traditional on-premises, bare-metal HPC environment,” says Hartman. “However, with AWS, the barrier for commercial solutions has truly been eliminated. Plus, given the experience our team has gained through setting up our cloud HPC programs and offerings, we are well-positioned to help numerical weather prediction users, and even the core authors of numerical weather prediction models like NOAA and ECMWF (European Centre for Medium-Range Weather Forecasts), better understand and leverage commercial solutions for numerical weather prediction applications as well as other HPC needs for all areas of Earth Intelligence.”

Shaping the Future of High Performance Computing

Thanks to the success of the application, Maxar clients can now take proactive measures earlier when assets and personnel are threatened by extreme weather. “Our clients can better protect equipment and evacuate personnel sooner,” says Hartman. “And if weather threatens a commodity, our financial clients now have more time to make buy-sell decisions.”

In addition, Hartman says, “There are a number of new programs and funding vehicles being appropriated by the US government as well as international organizations that want to leverage HPC in the cloud. We believe Maxar’s experience and recent achievements should allow us to extend this technology into these same organizations.”

Cecelski concludes, “We look forward to taking advantage of new services as AWS continues to expand its offerings, shapes the future of HPC in the cloud, and helps enable us to deliver high-performing, cost-effective services to our clients.”

To learn more, visit aws.amazon.com/hpc.


About Maxar Technologies

Maxar delivers Earth Intelligence and space infrastructure and currently has more than 90 geo-communication satellites in orbit and five robotic arms on Mars. The company collects data across more than 3 million square kilometers of satellite imagery per day and has an archive of over 110 petabytes of satellite images spanning the globe.

Benefits of AWS

  • Generates weather forecasts 58% faster
  • Provides clients with more time to react to extreme weather
  • Reduces required server instances by 33%
  • Automatically spins 156 server instances up and down
  • Decreases compute costs by 45%

AWS Services Used

Amazon EC2 C5 Instances

Amazon EC2 C5 instances deliver cost-effective high performance at a low price-per-compute ratio for running advanced compute-intensive workloads.

Elastic Fabric Adapter

Elastic Fabric Adapter (EFA) is a network interface for Amazon EC2 instances that enables customers to run applications requiring high levels of inter-node communications at scale on AWS.

AWS ParallelCluster

AWS ParallelCluster is an AWS-supported open source cluster management tool that makes it easy for you to deploy and manage High Performance Computing (HPC) clusters on AWS.

Amazon FSx for Lustre

Amazon FSx for Lustre makes it easy and cost effective to launch and run the world’s most popular high-performance file system.
