AWS HPC Blog
Building a Scalable Predictive Modeling Framework in AWS – Part 1
Predictive models have powered the design and analysis of real-world systems such as jet engines, automobiles, and power plants for decades. These models provide insights into system performance and support simulations at a fraction of the cost of experiments with physical hardware. Our aim with the predictive modeling framework is to enable a broad group of end users to build and deploy predictive models without worrying about the underlying techniques, orchestration mechanisms, or the infrastructure required.
However, the use of such models for making operational decisions has been limited by a lack of scalable tools and by the difficulty of using distributed computing systems. For example, a power plant manager using a physics-based model (with hundreds of parameters) for performance prediction may want to update the model with new observations. For this task, the power plant manager would, at a minimum, need to:
- understand the model (e.g., which parameters in the model need to be updated),
- know of the techniques that may be useful for updating the model (e.g., Kalman filters),
- deploy the model in a way that the technique can communicate with the model, and
- deploy the technique at scale to update the model.
The adoption of these techniques has thus been limited to small-scale problems, given the complexity of the process for even a task as basic as “updating models”.
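To make the "updating models" task concrete, here is a minimal sketch of a scalar Kalman update for a single model parameter. It assumes a toy linear model `y = theta * x + noise` with a Gaussian belief about `theta`; the names `ParameterBelief`, `kalman_update`, and `update_with_observations` are hypothetical and are not part of aws-do-pm. A real physics-based model with hundreds of parameters would need a multivariate (and likely nonlinear) variant.

```python
from dataclasses import dataclass


@dataclass
class ParameterBelief:
    """Gaussian belief about one model parameter (hypothetical helper)."""
    mean: float
    variance: float


def kalman_update(belief, x, y, noise_variance):
    """One scalar Kalman update for the assumed model y = theta * x + noise."""
    # Difference between the observation and the current prediction.
    innovation = y - belief.mean * x
    # Variance of that difference: parameter uncertainty plus sensor noise.
    innovation_variance = x * belief.variance * x + noise_variance
    # Kalman gain: how strongly the observation pulls the estimate.
    gain = belief.variance * x / innovation_variance
    return ParameterBelief(
        mean=belief.mean + gain * innovation,
        variance=(1.0 - gain * x) * belief.variance,
    )


def update_with_observations(belief, observations, noise_variance):
    """Fold a sequence of (x, y) observations into the belief, one at a time."""
    for x, y in observations:
        belief = kalman_update(belief, x, y, noise_variance)
    return belief


# Illustrative, made-up data generated from theta = 2.0 without noise.
prior = ParameterBelief(mean=1.0, variance=1.0)
observations = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
posterior = update_with_observations(prior, observations, noise_variance=0.1)
print(posterior)
```

Each observation moves the estimate toward the value that best explains the data and shrinks its variance, which is the sense in which the update is probabilistic: the result is a distribution over the parameter rather than a single number.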
Introducing the aws-do-pm framework
We have adapted the open source AWS DevOps for Docker (aws-do-docker) for predictive modeling and are now making it available as a separate open source project, the aws-do-pm framework. The framework allows users to deploy predictive models at scale across a distributed computing architecture, with the capability to probabilistically update models using real-world data, while maintaining the full history of user actions.
The aws-do-pm framework caters to both end users and advanced developers. End users can use the framework to accomplish individual tasks such as updating models or quantifying uncertainty without the burden of understanding the underlying techniques or infrastructure. The framework is also built to be extensible for advanced users. For example, users can register data, models, and techniques built outside of aws-do-pm in the aws-do-pm framework. Once registered, they can be used along with existing techniques to build new applications. The aws-do-pm framework is organized as shown below.
The architecture of aws-do-pm (Fig. 1) consists of three layers: Services, Entities, and Interfaces. The architecture is containerized and implemented as an AWS DevOps for Docker project. It can run locally or on the cloud. It supports EKS and Docker/docker-compose as target container orchestrators. Users and applications interact with the aws-do-pm framework through its CLI and SDK interfaces. Additional details can be found here.
The aws-do-pm project is designed to spin up the infrastructure just-in-time (on demand as needed), deploy the services required to run the specific task (such as model building or sensitivity analysis) requested by the user, complete the task, save the state of the entire system, and then gracefully spin down all services and infrastructure before exiting.
At the end of each run, the entire system state can thus be automatically reduced to compressed storage (if the user so desires), keeping the footprint to a bare minimum. The framework is designed to run both on AWS and on the user’s local resources (e.g., a laptop). The blocks shown above (Data, Techniques, Asset, and Model) are extensible by the user, without any need to modify the underlying infrastructure. In subsequent posts, we will demonstrate the aws-do-pm framework’s capabilities using data and models from a fleet of simulated electric vehicles.
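The just-in-time lifecycle described above (spin up, deploy, run, save state, spin down) can be pictured with a small sketch. This is only an illustration of the ordering of steps, assuming a Python context manager as the orchestration pattern; `just_in_time_platform` and `run_task` are hypothetical names and do not reflect the actual aws-do-pm CLI or SDK.

```python
from contextlib import contextmanager


@contextmanager
def just_in_time_platform(events):
    """Hypothetical stand-in for the lifecycle; it only records each step."""
    events.append("spin up infrastructure")
    events.append("deploy services")
    try:
        yield events
        # Reached only if the task finished without raising an exception.
        events.append("save system state")
    finally:
        # Teardown always runs, so nothing is left running after exit.
        events.append("spin down services and infrastructure")


def run_task(task_name):
    """Run one task inside the just-in-time lifecycle and return the step log."""
    events = []
    with just_in_time_platform(events) as log:
        log.append(f"run task: {task_name}")
    return events


history = run_task("sensitivity analysis")
for step in history:
    print(step)
```

Placing the teardown in a `finally` clause is one way to express the "gracefully spin down before exiting" behavior: resources are released whether or not the task succeeds. In this sketch, state is saved only after a successful run.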
In this first post of three, we described the motivation and general architecture of the open-source aws-do-pm framework project. In our second post, we will show you how to use the framework to create a sample application for predicting the life of batteries in a fleet of electric vehicles. In the final part of this series, we will use the synthetic dataset and the models generated in the second part to showcase how to update the model and perform a sensitivity analysis using aws-do-pm.