AWS HPC Blog

Amazon’s renewable energy forecasting: continuous delivery with Jupyter Notebooks

You might associate the phrase ‘Jupyter Notebooks in production’ with a scrappy startup short on engineers or a hobbyist tinkering in their free time. However, this story unfolds at Amazon, where a team transitioned from requiring software engineers to replicate scientists’ work for production, to enabling scientists to seamlessly deploy Jupyter Notebooks into production.

Amazon’s Renewable Energy Optimization team produces software to maximize the effectiveness of our portfolio of wind and solar farms. The team develops and runs machine learning models that forecast the state of the electricity grid in the next few days. As Amazon’s portfolio of wind and solar farms has expanded, the techniques that our scientists and software engineers use to create and run these machine learning models have evolved, too.

In this post, we’ll walk you through a science-to-production workflow that’s probably familiar to you, explain the logic that supported our novel approach and – finally – let you benefit from what we learned implementing it all.

The problem

Initially, scientists and software engineers were separated. Scientists developed models and optimization techniques using Jupyter Notebooks. They’d pass these to a software engineer who would translate them into ‘production code’. Scientists never wrote production code.

Although this worked for our initial deployments, the process had several limitations:

It was slow. Once a scientist settled on a model they wanted to use in production, it took two or three months for an engineer to translate that code into production code. This created a bottleneck for the scientists and ran counter to the model of continuous delivery.

Nobody understood the production code. The engineer translating the notebook didn’t fully understand the science being used in the code. Once the code had been translated, it was difficult for the scientist to understand because the engineer changed the code structure. We ended up with code that no one understood. This made it difficult to debug errors and issues in the production code.

There were inconsistencies between environments. Scientists developed the notebooks using Amazon SageMaker or on their local machines, with dependencies they had installed manually. The engineer built the code using an Amazon internal build system. This frequently meant different versions of libraries were in use, and subtle differences led to large gaps between what scientists expected and what happened in production.

We needed engineers and scientists to use the same platform. This meant either getting scientists to write production-level code or we had to run Jupyter Notebooks in production (a little heretical).

What’s a Jupyter Notebook and why do scientists like them?

A Jupyter Notebook is a JSON document that contains both the source code and the output of the code’s execution. Users execute cells of the notebook, and the output of each cell run is saved into the JSON notebook itself. Every time a notebook is run, the file is mutated and overwritten with the results of the latest run.
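To make this concrete, here’s a minimal sketch (the file name is illustrative) using the nbformat library to open a notebook and inspect its cells, showing that the file really is just JSON with source and saved outputs side by side:

```python
import nbformat

# A notebook file is just JSON: a list of cells, each carrying its source
# and, for code cells, the outputs saved from the last run.
nb = nbformat.read("forecast.ipynb", as_version=4)

for cell in nb.cells:
    if cell.cell_type == "code":
        # execution_count and outputs are rewritten every time the cell runs,
        # which is why the file mutates on every execution.
        print(cell.execution_count, repr(cell.source[:40]), len(cell.outputs))
```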

Figure 1 – Jupyter Server reads the JSON notebook file on disk and displays it to the user. The user requests commands to be executed, which the Jupyter Server sends to the Jupyter Kernel to perform. The Jupyter Server receives the output of the commands from the Jupyter Kernel and updates the notebook on disk.

Jupyter Notebooks have a lot of advantages for scientists. They allow for quick iteration of ideas because a single operation can be run multiple times without having to run the entire program again. They’re also great for visualizing and analyzing the output from code. Tables, charts, or images can be displayed in the notebook, right next to the source code that generated them.

Figure 2 – Information displayed as a graph in a Jupyter Notebook vs the same information in CloudWatch Logs.

However, Jupyter Notebooks are not designed to run in production. There’s no built-in functionality to programmatically run a notebook with different parameters. They’re also not easy to test, and they’re constantly evolving documents that change every time they’re run, which makes collaboration and reproducibility difficult. So despite Jupyter Notebooks offering desirable features, they weren’t compatible with our software engineering best practices for production deployments.

Why did we change our mind?

Our views changed when we discovered the framework Papermill and its usage at Netflix and other industry leaders. Papermill is a library for executing notebooks programmatically. Specifically, it allowed us to turn a Jupyter Notebook into an immutable object, and then parametrize it.

Papermill replaces the user and UI. It acts as another client to the Jupyter Kernel, using the same protocol as the Jupyter Server. However, rather than overwriting the source notebook each time it runs, it writes a new output notebook for each run. The notebook has gone from being a mutable document to immutable source code.

Additionally, Papermill allows each run to have different parameters. This allows a single notebook to perform a solar forecast for different locations based on a parameter that’s passed to the notebook.
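Here’s a minimal sketch of what that looks like (the paths, parameter names, and site identifiers are illustrative assumptions, not our actual setup; Papermill picks up parameters from a cell tagged `parameters` in the source notebook):

```python
import papermill as pm

# One immutable source notebook, many runs: each execution writes a brand
# new output notebook instead of mutating the source.
for site in ["wind-farm-tx-01", "solar-farm-ca-07"]:
    pm.execute_notebook(
        "forecast.ipynb",                # source notebook (never modified)
        f"runs/forecast-{site}.ipynb",   # fresh output notebook per run
        parameters={"site_id": site, "horizon_days": 3},
    )
```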

Figure 3 – With Papermill, the Jupyter Server is replaced. Papermill is responsible for sending the commands to the Jupyter Kernel and reading and writing the output notebooks to disk.

We now had a way to programmatically trigger a parametrized notebook. The next step was to ensure the scientists’ development environment matched the production environment.

Keeping the science environment consistent with production

When we had to explain why one of our wind farms had been underperforming at one of Amazon’s infamous “Correction of Errors” meetings, it wasn’t the first time someone said, “It worked on my machine!”. This was not a new problem, and we knew technologies existed to solve it.

Initially, scientists used the pre-built SageMaker images to manage the dependencies for their notebooks. These images are useful for experimentation; however, we didn’t have control over the exact version of each library we were using. Specifically, some scientific Python libraries had the same version in both environments but were linked against different linear algebra libraries, which affected the end results of our forecasts. We therefore decided to bring our own custom image into SageMaker.

SageMaker custom images allow you to import an image from Amazon Elastic Container Registry (Amazon ECR) that you have built yourself. This could be built in AWS CodeBuild or in another build system outside of AWS. We used this feature of SageMaker to build our own custom images, with all the dependencies pinned by our internal AWS CodeArtifact repository.
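As a rough sketch, registering such an image with SageMaker looks something like the following (the image name, ARNs, and kernel spec here are illustrative, not our actual configuration):

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Make the ECR image we built available as a custom image in SageMaker Studio.
sagemaker.create_image(
    ImageName="forecasting-notebooks",
    RoleArn="arn:aws:iam::123456789012:role/sagemaker-image-role",
)
sagemaker.create_image_version(
    ImageName="forecasting-notebooks",
    BaseImage="123456789012.dkr.ecr.us-east-1.amazonaws.com/forecasting:latest",
)
# Tell Studio which Jupyter kernel inside the image to launch.
sagemaker.create_app_image_config(
    AppImageConfigName="forecasting-notebooks-config",
    KernelGatewayImageConfig={
        "KernelSpecs": [{"Name": "python3", "DisplayName": "Forecasting"}]
    },
)
```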

The image imported into SageMaker is exactly the same image that’s deployed to production and used on our development machines. This ensures that when scientists use a third-party dependency, like NumPy or pandas, they have confidence it’s exactly the same version that will be used in production.

Once the scientist is happy with the notebook they’ve been experimenting with, they’ll submit it to our internal build system. The end result is a Docker image that contains the notebook as well as all of its dependencies.

Figure 4 – Scientists use Amazon SageMaker Studio for experimentation using the same Docker image that is deployed into production.

How do the notebooks run in production?

Once a notebook has been built into the image, it can be run in production. The image has the notebook as well as a framework we built that runs the notebook using the Papermill library. An Amazon EventBridge rule triggers the notebook to run at the appropriate time with the appropriate parameters.
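A sketch of that wiring with boto3 follows; the rule name, cron expression, and ARNs are all illustrative assumptions:

```python
import boto3

events = boto3.client("events")

# Schedule the notebook run; the job definition points at our notebook image.
events.put_rule(
    Name="solar-forecast-daily",
    ScheduleExpression="cron(0 4 * * ? *)",  # every day at 04:00 UTC
)
events.put_targets(
    Rule="solar-forecast-daily",
    Targets=[{
        "Id": "solar-forecast-batch-job",
        "Arn": "arn:aws:batch:us-east-1:123456789012:job-queue/notebook-runs",
        "RoleArn": "arn:aws:iam::123456789012:role/eventbridge-batch-role",
        "BatchParameters": {
            "JobDefinition": "solar-forecast",
            "JobName": "solar-forecast",
        },
    }],
)
```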

AWS Batch runs the notebook in Amazon Elastic Container Service (Amazon ECS) using the image we pushed to the ECR repository. After the notebook finishes executing, we save the raw JSON notebook and the HTML representation of the notebook to Amazon Simple Storage Service (Amazon S3). These output notebooks can also be viewed in a web browser via an Amazon CloudFront distribution in front of the Amazon S3 bucket.
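The post-run archiving step can be sketched like this (bucket and key names are illustrative); nbconvert renders the executed notebook to HTML so it can be served straight through CloudFront:

```python
import boto3
import nbformat
from nbconvert import HTMLExporter

# Render the executed notebook to a standalone HTML page.
nb = nbformat.read("output.ipynb", as_version=4)
html, _resources = HTMLExporter().from_notebook_node(nb)

s3 = boto3.client("s3")

# Archive both the raw JSON notebook and its HTML rendering.
s3.upload_file("output.ipynb", "notebook-run-archive",
               "solar-forecast/2024-06-01/output.ipynb")
s3.put_object(
    Bucket="notebook-run-archive",
    Key="solar-forecast/2024-06-01/output.html",
    Body=html.encode("utf-8"),
    ContentType="text/html",
)
```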

Figure 5 – Notebooks are run on Amazon ECS using AWS Batch. The Batch jobs are triggered by schedules in Amazon EventBridge.

We use Amazon DynamoDB to store the status of each notebook run, as well as additional metadata about the run. Engineers and scientists can use this information to debug a notebook run. Even better, non-technical users can view the status of our forecasting and see key outputs in the notebook itself.
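A run-status record might look like this (the table and attribute names are hypothetical):

```python
import boto3
from datetime import datetime, timezone

table = boto3.resource("dynamodb").Table("NotebookRuns")

# One item per notebook execution; the website in Figure 6 lists these items.
table.put_item(Item={
    "notebook": "solar-forecast",
    "run_id": "2024-06-01T04:00Z",
    "status": "SUCCEEDED",
    "output_html": "solar-forecast/2024-06-01/output.html",
    "finished_at": datetime.now(timezone.utc).isoformat(),
})
```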

Figure 6 – Internal website that displays the list of notebook runs that have been saved into DynamoDB.

We set up the infrastructure and build pipeline using AWS CDK, which lets us treat our infrastructure as code. This simplifies spinning up new environments, minimizes configuration errors, and lets us leverage high-level constructs for faster, more reliable work.
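For a flavor of what that looks like, here’s a minimal CDK (v2, Python) sketch; the construct names and schedule are illustrative, and the AWS Batch target wiring is elided:

```python
from aws_cdk import Stack, aws_ecr as ecr, aws_events as events
from constructs import Construct

class NotebookPipelineStack(Stack):
    """Illustrative stack: an image repository plus a run schedule."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Repository for the notebook-plus-dependencies image.
        ecr.Repository(self, "NotebookImages")

        # Daily trigger; in a stack like ours a rule like this would target
        # the AWS Batch job queue (see aws_events_targets.BatchJob).
        events.Rule(
            self, "DailyForecastSchedule",
            schedule=events.Schedule.cron(minute="0", hour="4"),
        )
```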

Conclusion

We started this project with the aim of allowing scientists to deploy more frequently without relying on engineers. We achieved that, but we also improved how both scientists and engineers interact with the production system. They can debug production notebooks through a web server that displays the output notebooks. This led to the use of these notebooks for reports and metrics presentations – accessible to others via shared links.

The infrastructure choices also increased our velocity. Previously, our system was computationally intensive, and evaluating it sequentially on a single very large machine took two weeks. AWS Batch allows us to run more than a thousand much smaller instances in parallel, bringing our model evaluation down to three hours. Solving these problems allowed our small two-pizza team to dramatically increase our iteration speed while also increasing the reliability of our system.

At the outset of this work, scientists couldn’t tell their containers from their images, and the only thing engineers knew about Jupyter Notebooks was that they “were not for production!”. But by taking the time to understand each other’s tools and the problems they were solving, we built a system that gives us consistent environments and continuous delivery of new science models.

Alec Hewitt

Alec is a Senior Software Engineer on the AWS Renewable Energy Optimization team. He is fascinated with the role software can play in transitioning the world to clean energy. When he is not programming, he can be found riding with his local cycling club.

Will Sorenson

Will is a Senior Applied Scientist in the AWS Renewable Energy Optimization team, and spends his working days accelerating the transition to clean energy for AWS.