Now available in Amazon SageMaker: DeepAR algorithm for more accurate time series forecasting

Today we are launching Amazon SageMaker DeepAR as the latest built-in algorithm for Amazon SageMaker. DeepAR is a supervised learning algorithm for time series forecasting that uses recurrent neural networks (RNNs) to produce both point and probabilistic forecasts. We’re excited to give developers access to this scalable, highly accurate forecasting algorithm that drives mission-critical decisions within Amazon. Just as with other Amazon SageMaker built-in algorithms, the DeepAR algorithm can be used without the need to set up and maintain infrastructure for training and inference.

Forecasting is everywhere

Forecasting is an entry point to applying machine learning across many industries. Whether it’s optimizing the supply chain through better product demand forecasts, allocating computing resources more effectively by predicting web server traffic, or saving lives by staffing hospitals to meet patient needs, there are few domains where investments in accurate forecasts don’t pay for themselves quickly.

Within Amazon, we use forecasting to drive business decisions across a variety of domains. These include forecasting the product and labor demand in our fulfillment centers, in particular for key dates such as Prime Day, Black Friday, and Cyber Monday, and making sure that we can elastically scale AWS compute and storage capacity for all AWS customers. Scientists at Amazon develop algorithms such as DeepAR to solve these types of real-world business applications at Amazon scale with high accuracy.

DeepAR algorithm highlights

The DeepAR forecasting algorithm can provide more accurate forecasts than classical forecasting techniques such as Autoregressive Integrated Moving Average (ARIMA) or Exponential Smoothing (ES), both of which are implemented in many open-source and commercial software packages for forecasting. The DeepAR algorithm also supports other features and scenarios that make it particularly well-suited for real-world applications.

Cold start forecasting

A cold start scenario occurs when we want to generate a forecast for a time series with little or no existing historical data. This occurs frequently in practice, such as in scenarios where new products are introduced or new AWS Regions are launched. Traditional methods such as ARIMA or ES rely solely on the historical data of an individual time series, and as such they are typically less accurate in the cold start case. Consider the example of forecasting clothing items such as sneakers. A neural network-based algorithm such as DeepAR can learn typical behavior of new sneaker sales based on the sales patterns of other types of sneakers when they were first released. By learning relationships from multiple related time series within the training data, DeepAR can provide more accurate forecasts than the existing alternatives.

Probabilistic forecasts

DeepAR produces both point forecasts (e.g., the number of sneakers sold in a week is X) and probabilistic forecasts (e.g., the number of sneakers sold in a week is between X and Y with Z% probability). The latter are particularly well-suited for business applications such as capacity planning, where specific forecast quantiles are more important than the most likely outcome. For example, a system that automatically places orders for sneakers based on the forecast might want to generate order quantities such that the warehouse is stocked to satisfy customer demand with P% probability. With probabilistic forecasts this can easily be achieved by basing the order quantity on the P% quantile of the forecast. Customers can leverage this functionality by choosing an appropriate likelihood function hyperparameter for training and specifying the desired quantiles at inference time.
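
To make the quantile-based ordering rule concrete, here is a minimal sketch in plain Python. It does not call DeepAR; it assumes you already have a set of sampled demand values for one week (the numbers below are made up for illustration), and the helper name `empirical_quantile` is hypothetical.

```python
def empirical_quantile(samples, percent):
    """Return the smallest sample value that at least `percent`% of samples do not exceed.

    `samples` stands in for demand values sampled from a probabilistic forecast;
    `percent` is an integer between 1 and 100.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic avoids floating-point rounding surprises.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical sampled weekly demand for one sneaker model (illustrative values only).
weekly_demand_samples = [80, 95, 100, 105, 110, 120, 125, 130, 150, 170]

# Stock enough to cover demand in roughly 90% of the sampled scenarios.
order_quantity = empirical_quantile(weekly_demand_samples, 90)
print(order_quantity)
```

Ordering at a high quantile rather than at the mean or median deliberately trades some excess stock for a lower chance of running out.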

The graphs below showcase both of these forecasting scenarios using example demand forecasts produced by DeepAR for products sold on Amazon. The first figure shows a cold start scenario. Since the model shares information across items, predictions are reasonable even with limited historic data. The second and third figures show that DeepAR can produce probabilistic forecasts for products with different magnitudes by using an appropriate likelihood function for this setting (negative binomial).

The DeepAR algorithm also comes with a number of other features:

Support for different types of time series: real numbers, counts, and values in an interval
Automatic evaluation of model accuracy in a backtest after training
Engineered to use either GPU or CPU hardware to train its long short-term memory (LSTM) based RNN model quickly and flexibly
Scales up to datasets comprising 100,000+ time series
Support for training data in JSON Lines or Parquet format

Getting started

DeepAR trains a model based on observed historical data, which is then used to perform predictions. Just like other Amazon SageMaker algorithms, it relies on Amazon Simple Storage Service (Amazon S3) to store the training data and the resulting model. Amazon SageMaker automatically starts and stops Amazon Elastic Compute Cloud (Amazon EC2) instances on behalf of customers during training. After the model is trained, it can be deployed to an endpoint that will compute predictions when requested. For a general, high-level overview of the Amazon SageMaker workflow, please refer to the SageMaker documentation. Here we will give a quick overview on how to perform these steps specifically with DeepAR.

Data formatting

The first step is to collect and format historical data on the processes you want to forecast. DeepAR supports two types of data files: JSON Lines (one JSON object per line) and Parquet. The DeepAR documentation describes both options in detail.

For example, a JSON Lines file containing data to train on could look as follows:

 {"start": "2016-01-16", "cat": 1, "target": [4.962, 5.195, 5.157, 5.129, 5.035, ...]}
 {"start": "2016-01-01", "cat": 1, "target": [3.041, 3.190, 3.462, 3.655, 4.114, ...]}
 {"start": "2016-02-03", "cat": 2, "target": [4.133, 4.222, 4.332, 4.216, 4.256, ...]}
 {"start": "2016-02-08", "cat": 2, "target": [2.517, 2.818, 3.043, 3.144, 3.293, ...]}
...

Each object could represent daily sales (in thousands) of a particular type of shoes, with "cat": 1 indicating sneakers and "cat": 2 indicating snow boots. Note that each time series has its own starting point in time; the data does not need to be aligned in this sense.

After the data is correctly formatted, you will need to push it to an S3 bucket so DeepAR can access it during training. There are multiple ways to do this, including the low-level AWS SDK for Python (Boto3) and the AWS Command Line Interface (CLI).
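
As a rough illustration of the formatting step, the following sketch serializes a few in-memory time series into the JSON Lines layout shown above, using only the Python standard library. The function name `to_json_lines` is hypothetical, and the series are shortened to three values each purely for readability.

```python
import json


def to_json_lines(series):
    """Serialize time series records to JSON Lines: one JSON object per line."""
    lines = []
    for item in series:
        record = {
            "start": item["start"],          # first timestamp of this series
            "cat": item["cat"],              # category (1 = sneakers, 2 = snow boots in the example)
            "target": list(item["target"]),  # observed values, in time order
        }
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


# Shortened, illustrative series; real training data would be much longer.
example_series = [
    {"start": "2016-01-16", "cat": 1, "target": [4.962, 5.195, 5.157]},
    {"start": "2016-02-03", "cat": 2, "target": [4.133, 4.222, 4.332]},
]

payload = to_json_lines(example_series)
print(payload, end="")
```

The resulting text can be written to a local file and then copied to Amazon S3 with whichever upload tool you prefer.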

Training job setup

To start a training job, customers can use the low-level AWS SDK for Python (Boto3), the high-level Amazon SageMaker Python SDK, or the AWS Management Console. In this example we will illustrate how to do so from the AWS Management Console.

To start setting up a DeepAR training job, select “DeepAR forecasting” in the Algorithm section of the Job settings page of the Amazon SageMaker console. Here customers are also able to customize other properties of the job, such as the type and number of EC2 instances to use during training.

After you select “DeepAR forecasting,” a form with customizable hyperparameters appears. For example, you can customize the time granularity of the time series with the time_freq parameter, set a minimum for how much data from the past the model should use to make a prediction with the context_length parameter, and set the length of the output forecast using the prediction_length parameter. DeepAR uses an LSTM recurrent neural network (RNN), so other customizable hyperparameters for the network topology and training procedure are also available for advanced tuning. Refer to the DeepAR documentation for a full description of the available hyperparameters and their corresponding ranges.
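
If you set up the job programmatically instead of through the console form, the same three hyperparameters could be collected in a dictionary along these lines. This is only a sketch: the helper `build_hyperparameters` is hypothetical, the values are arbitrary, and the frequency string "D" (daily) is an assumption to verify against the DeepAR documentation. Values are converted to strings because the low-level SageMaker API takes hyperparameters as a string-to-string map.

```python
def build_hyperparameters(time_freq, context_length, prediction_length):
    """Collect the three hyperparameters named above as a string-to-string map."""
    if context_length < 1 or prediction_length < 1:
        raise ValueError("context_length and prediction_length must be positive")
    return {
        "time_freq": time_freq,                       # granularity of the time series
        "context_length": str(context_length),        # how much past data the model looks at
        "prediction_length": str(prediction_length),  # number of time steps to forecast
    }


# Arbitrary illustrative values: daily data, 30 days of context, 14-day forecast.
hyperparameters = build_hyperparameters("D", 30, 14)
print(hyperparameters)
```

This is not a complete hyperparameter set; the documentation lists the remaining required and optional entries.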

After the hyperparameters are specified, you need to specify the location of the training data on Amazon S3. You can specify the following two channels: the “train” channel (required) contains the training data, while the “test” channel (optional) can be used to validate model performance. Note that if a “test” channel is not specified, DeepAR will not validate model performance on a hold-out dataset.
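
For readers who prefer the low-level SDK over the console, the two channels could be described roughly as follows. The sketch only builds the channel list in the shape used by the `InputDataConfig` argument of Boto3's `create_training_job`; it makes no AWS calls. The helper name and the S3 URIs are hypothetical placeholders.

```python
def build_input_data_config(train_s3_uri, test_s3_uri=None):
    """Build the channel list for a training job; the "test" channel is optional."""

    def channel(name, s3_uri):
        return {
            "ChannelName": name,
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": s3_uri,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }

    channels = [channel("train", train_s3_uri)]  # required
    if test_s3_uri is not None:
        channels.append(channel("test", test_s3_uri))  # optional hold-out data
    return channels


# Placeholder bucket and prefixes; substitute your own locations.
input_data_config = build_input_data_config(
    "s3://example-bucket/deepar/train/",
    "s3://example-bucket/deepar/test/",
)
print([c["ChannelName"] for c in input_data_config])
```

Leaving out the second argument produces a configuration with only the "train" channel, in which case no hold-out validation takes place.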

When using the “test” channel, Amazon SageMaker will automatically log error metrics at the end of the training job that you can then inspect from Amazon CloudWatch.

The last step is to specify where to write the trained model in Amazon S3. After you complete these specifications, choose Create training job to start the job.

Deploying the model to an endpoint for inference

As soon as the training job is completed, a model will appear in the Models page of the Amazon SageMaker console. By selecting it and choosing Create endpoint, you will be able to customize the endpoint name and configuration, such as the type and number of EC2 instances that should be used for it.

After the endpoint is deployed, you can make requests to it by providing time series as JSON objects and obtaining forecasts for each of them as a result. This can be done using the Amazon SageMaker Python SDK.

A request for a single time series can ask the model to return the mean of 50 predictions sampled from the trained model. The response is returned as follows:

{"predictions":[{"mean":[4.0912151337,4.5771856308,4.7047519684, ... ]}]}

The DeepAR documentation fully describes the syntax and all the options that can be included in requests.
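
As a hedged sketch of what such an exchange might look like in code, the snippet below builds a request body for one time series and extracts the means from a response shaped like the one above. The request layout (an "instances" list plus a "configuration" object with "num_samples" and "output_types") reflects our understanding of the DeepAR JSON inference format and should be checked against the documentation; the helper names are hypothetical, the input series is shortened, and no endpoint is actually called.

```python
import json


def build_request(start, target, cat, num_samples=50):
    """Build a request asking for the mean over `num_samples` sampled predictions."""
    return {
        "instances": [{"start": start, "target": list(target), "cat": cat}],
        "configuration": {"num_samples": num_samples, "output_types": ["mean"]},
    }


def extract_means(response_body):
    """Return one list of mean forecasts per time series in the response."""
    parsed = json.loads(response_body)
    return [prediction["mean"] for prediction in parsed["predictions"]]


# Shortened, illustrative input series.
request_body = json.dumps(build_request("2016-01-16", [4.962, 5.195, 5.157], 1))

# A response in the shape shown above, limited to the three values that are visible there.
response_body = '{"predictions":[{"mean":[4.0912151337,4.5771856308,4.7047519684]}]}'
means = extract_means(response_body)
print(means[0])
```

In practice `request_body` would be sent to the deployed endpoint, for example through the Amazon SageMaker Python SDK, and `response_body` would be whatever the endpoint returns.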

Further documentation

For more information, see the DeepAR documentation, or for a hands-on walkthrough using this new algorithm from a SageMaker Notebook Instance, see the DeepAR example notebook. For more details on the mathematics behind DeepAR, see the academic paper written by several Amazon machine learning scientists as well as related work on RNN forecasting methods.

Additional reading

Learn how to build Amazon SageMaker notebooks backed by Spark in Amazon EMR.

About the Authors

Tim Januschowski, Valentin Flunkert, David Salinas, Lorenzo Stella, Jan Gasthaus, and Paul Vazquez work in the Core Machine Learning group in Berlin, Germany. They work on a variety of forecasting problems within Amazon. David Arpin is AWS’s AI Platforms Selection Leader and has a background in managing Data Science teams and Product Management.