AWS for Industries
Power Trading Corporation of India uses Amazon Forecast for day-ahead and intraday electricity demand forecasting
Introduction
This post shows one way to automate data extraction, transformation, and the building of an accurate power-demand forecasting pipeline using Amazon Web Services (AWS) services such as Amazon Forecast, a fully managed time-series forecasting service based on machine learning (ML). It describes how Power Trading Corporation of India (PTC) used Amazon Forecast for electricity-demand forecasting and achieved 98 percent accuracy for intraday forecasts. This approach lets utilities build accurate short-term electricity load forecasting (ST-ELF) models without significant investment in artificial intelligence (AI) and ML or in data scientists.
Business challenge and importance of electricity-demand forecasting
Electricity is a difficult commodity to work with. Unlike other commodities, it cannot easily be procured in advance and stored for later use; it has to be consumed as soon as it is generated. The production rate must also match the consumption rate to maintain the integrity of the electrical grid.
An electrical grid is a network of interconnected electricity generators (GENCO), electricity-distribution companies (DISCOM), and transmission equipment. It is the responsibility of the grid operator to make sure that the production and consumption rates match and to act if they diverge.
The Indian grid operates at a frequency of 50 Hz with three-phase alternating current. If GENCOs inject more electricity than DISCOMs can consume, the grid frequency goes up; if they inject less, the grid frequency goes down. To avoid this, the grid operator requires both DISCOMs and GENCOs to provide a schedule for the next day in 15-minute blocks (96 blocks in a day). This block size varies from country to country. A schedule is essentially a promise to generate or consume a fixed amount of electricity in a given block of the day. Deviating from the promised amount incurs penalties from the grid operator.
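To make the block arithmetic concrete, the following minimal sketch maps a time stamp to its scheduling block and back. The 1-based numbering (blocks 1 through 96) and the function names are assumptions for illustration; the source only states that a day is divided into 96 blocks of 15 minutes.

```python
from datetime import datetime, timedelta

BLOCK_MINUTES = 15
BLOCKS_PER_DAY = 24 * 60 // BLOCK_MINUTES  # 96 blocks in a day


def block_number(ts: datetime) -> int:
    """Return the 1-based block (1..96) that contains ts (numbering is an assumption)."""
    minutes_since_midnight = ts.hour * 60 + ts.minute
    return minutes_since_midnight // BLOCK_MINUTES + 1


def block_start(day: datetime, block: int) -> datetime:
    """Return the start time of the given 1-based block on the given day."""
    if not 1 <= block <= BLOCKS_PER_DAY:
        raise ValueError(f"block must be between 1 and {BLOCKS_PER_DAY}")
    midnight = day.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(minutes=(block - 1) * BLOCK_MINUTES)
```

A schedule for one day is then a list of 96 values, one per block, and a deviation is the difference between the scheduled and the actual value for the same block.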
Currently, DISCOMs all over India pay millions of dollars every year in deviation penalties, which reduces their profit margins and, in some cases, results in losses. To reduce these penalties, DISCOMs need a forecasting tool that produces accurate forecasts based on previous load patterns, weather history, and calendar events. These forecasts not only help DISCOMs avoid penalties but also support better procurement, development, and expansion decisions.
Solution approach
PTC evaluated deploying demand-forecasting models in the cloud and on premises and, considering the advantages of scalability and manageability, decided on a cloud-based deployment. The next architectural consideration was which AWS service to use to generate demand forecasts: develop and deploy demand-forecasting models on Amazon SageMaker (used to build, train, and deploy ML models), or use Amazon Forecast, which does not require model development or data science expertise.
Amazon Forecast puts the power of Amazon’s extensive forecasting experience into the hands of all developers, without requiring ML expertise. It is a fully managed service that delivers highly accurate forecasts, up to 50 percent more accurate than traditional methods.
Given its lean team with AI/ML and data science experience, PTC decided to go with Amazon Forecast because
- the time to implement would be far less than developing and deploying custom models,
- the management effort would be much lower,
- Amazon Forecast can select the right algorithms by inspecting the data, and
- Amazon Forecast scales automatically to provision the compute required to generate forecasts.
With Amazon Forecast as the core service for generating day-ahead and intraday electricity-demand forecasts, PTC identified the other services needed to build the forecast pipeline.
- Amazon Simple Storage Service (Amazon S3), an object storage service, is used to store the input data, for example, CSV files with historical demand (MW) at 15-minute intervals for 96 time blocks per day. The historical data covered 18 months.
- AWS Step Functions, a visual workflow service, is used to build the workflow.
- Input data from Amazon S3 is transformed into the input data format (item metadata) required by Amazon Forecast and stored in a separate bucket in Amazon S3.
- Amazon Forecast generates forecasts, and the output is exported to another bucket in Amazon S3.
- Amazon QuickSight, unified business intelligence at hyperscale, is used to visualize actual and forecasted demands.
As shown in the following diagram, historical electricity-demand data can be sent to Amazon Forecast. The service then automatically sets up a data pipeline, ingests the data, trains a model on the historical data, provides accuracy metrics, and generates forecasts. It identifies features, applies the most appropriate algorithm for the data, and automatically tunes hyperparameters. Amazon Forecast then hosts the models so that they can be queried when needed. With all this work done behind the scenes, utilities save the time and effort of building their own team of ML experts or maintaining in-house models. In addition, the Amazon Forecast Weather Index is a built-in feature that incorporates historical and projected weather information into the model. It is available for locations in the United States (excluding Hawaii and Alaska), Canada, South America, Central America, Asia Pacific, Europe, and Africa & Middle East, with additional ones planned over time. When the Weather Index is in use, Amazon Forecast automatically applies the weather feature only to the time series where it finds improvements in accuracy during predictor training.
The solution includes three broad steps:
- preprocessing and ingesting raw data
- training and incrementally retraining the predictor model
- generating forecasts and building dashboards for comparison with actuals
Step 1. Preprocessing and ingesting raw data
Input data can be provided as CSV files in an Amazon S3 bucket. Some preprocessing, such as converting the data to the required format, can be done before uploading. However, an important aspect of preprocessing, the handling of missing values, can be done by Amazon Forecast during ingestion itself.
Input data can be of three types:
- target-time-series data: input data for power-demand forecasting at state granularity includes the time stamp, the id/state, and the actual demand (target value). The representation below depicts this at 15-minute granularity.
- To forecast for multiple cities or substations instead of a state, replace id/state with a city or substation identifier and provide data at every time stamp for each city or substation to be forecast.
- related-time-series data: this is data other than the target value that provides additional information for each item and time stamp combination, such as holidays, weather, and seasonality information.
- item metadata: this is additional information about the item (in this case state, city, or substation) that generally does not change with time. It is more relevant in other domains, such as retail, and was not used in this power-demand use case.
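As a rough illustration of the target-time-series input, the sketch below builds the request parameters for the Amazon Forecast `CreateDataset` API and formats rows as CSV in the same column order as the schema. The dataset name, the item identifier `STATE_A`, and the demand values are hypothetical; the attribute names follow the `CUSTOM` domain convention (`timestamp`, `item_id`, `target_value`), and the CSV is written without a header row here as an assumption.

```python
import csv
import io
from datetime import datetime

# Column order in the CSV must match the attribute order in the schema.
TARGET_SCHEMA = {
    "Attributes": [
        {"AttributeName": "timestamp", "AttributeType": "timestamp"},
        {"AttributeName": "item_id", "AttributeType": "string"},
        {"AttributeName": "target_value", "AttributeType": "float"},
    ]
}


def target_dataset_request(name: str) -> dict:
    """Keyword arguments for boto3.client("forecast").create_dataset(...)."""
    return {
        "DatasetName": name,  # hypothetical name supplied by the caller
        "Domain": "CUSTOM",
        "DatasetType": "TARGET_TIME_SERIES",
        "DataFrequency": "15min",  # one row per 15-minute block
        "Schema": TARGET_SCHEMA,
    }


def target_rows_to_csv(rows) -> str:
    """Format (timestamp, item_id, demand_mw) tuples as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for ts, item_id, demand_mw in rows:
        writer.writerow([ts.strftime("%Y-%m-%d %H:%M:%S"), item_id, demand_mw])
    return buffer.getvalue()
```

For example, `target_rows_to_csv([(datetime(2021, 6, 1, 0, 0), "STATE_A", 1234.5)])` produces one line per block, ready to upload to Amazon S3. A related-time-series dataset would use the same two leading columns followed by the additional features.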
Handling missing values: Amazon Forecast provides several options, both for where in the dataset to fill and for which values to fill with. The choices below were the most representative for our specific use case.
Our historical dataset had some missing values in the middle because of some gaps in recording the historical data. Therefore, we chose the middle fill for both target-time-series and related-time-series datasets. Further, we chose to use median as an appropriate approximation technique to arrive at missing values. The median was calculated based on a rolling window of 64 data entries for it to be more representative of that segment of the day.
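To show the effect of this choice, the following sketch mimics a middle fill with a rolling median locally. It is an illustration only and not the service's implementation: in Amazon Forecast the behavior is requested through the `filling` featurization method (for example with the parameter `"middlefill": "median"`), and the exact windowing rules are described in the missing-values guide listed in the references. The function name and the use of `None` for a missing reading are assumptions.

```python
from statistics import median
from typing import List, Optional, Sequence

WINDOW = 64  # number of prior data entries considered, as described above

# Parameter the service-side "filling" method would receive (sketch).
FILLING_PARAMS = {"middlefill": "median"}


def middle_fill_median(
    values: Sequence[Optional[float]], window: int = WINDOW
) -> List[Optional[float]]:
    """Fill gaps between the first and last observed value with a rolling median.

    Each missing entry is replaced by the median of up to `window` observed
    values that precede it. Leading and trailing gaps are left untouched,
    because they are front fill and back fill cases, not middle fill.
    """
    filled = list(values)
    observed = [i for i, v in enumerate(values) if v is not None]
    if not observed:
        return filled
    first, last = observed[0], observed[-1]
    for i in range(first, last + 1):
        if filled[i] is None:
            prior = [v for v in values[:i] if v is not None][-window:]
            filled[i] = median(prior)
    return filled
```

Because the window only looks at recent entries, the filled value reflects the demand level of that part of the day instead of a whole-day or whole-history median.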
Step 2. Train and incrementally retrain predictor model
Power-demand forecasting is a use case in which we continually have to create day-ahead (next-day) and intraday (next 2 hours) forecasts, and we consistently need accuracy of 95 percent and 98 percent, respectively.
We created two separate predictors, one for day-ahead (next-day) forecasting and one for intraday (2 hours ahead) forecasting, because each had a different number of data points to forecast. In day-ahead forecasting, we had to forecast 96 data points, each covering a 15-minute interval; in intraday forecasting, we had to forecast only 8 data points, each covering a 15-minute interval. Because the accuracy requirements were very stringent and the forecast depends heavily on the immediately preceding data points, training separate predictors on the same input datasets helped maintain accuracy.
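The two horizons follow directly from the 15-minute frequency: 24 hours is 96 steps and 2 hours is 8 steps. The sketch below derives them and builds request parameters in the shape accepted by the `CreatePredictor` API. The predictor names and the dataset group ARN are placeholders, and the use of `PerformAutoML` is an assumption based on the statement that Amazon Forecast selects the algorithm; the source does not show the actual configuration.

```python
BLOCK_MINUTES = 15


def horizon_in_blocks(hours_ahead: float) -> int:
    """Number of 15-minute steps needed to cover the given number of hours."""
    return int(hours_ahead * 60 // BLOCK_MINUTES)


def predictor_request(name: str, hours_ahead: float, dataset_group_arn: str) -> dict:
    """Keyword arguments for boto3.client("forecast").create_predictor(...)."""
    return {
        "PredictorName": name,
        "ForecastHorizon": horizon_in_blocks(hours_ahead),
        "PerformAutoML": True,  # assumption: let the service choose the algorithm
        "InputDataConfig": {"DatasetGroupArn": dataset_group_arn},
        "FeaturizationConfig": {"ForecastFrequency": "15min"},
    }


# Placeholder ARN; both predictors train on the same dataset group.
DATASET_GROUP_ARN = "arn:aws:forecast:REGION:ACCOUNT_ID:dataset-group/EXAMPLE"

day_ahead = predictor_request("day_ahead_predictor", 24, DATASET_GROUP_ARN)
intraday = predictor_request("intraday_predictor", 2, DATASET_GROUP_ARN)
```

Keeping the horizon as the only difference between the two requests mirrors the design described above: the same input datasets, with one predictor per forecasting task.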
Also, to consistently deliver accurate predictions, it is very important to retrain the model on newer, more recent data. The intraday predictor was retrained more frequently than the day-ahead predictor.
However, retraining the model has its own cost in training time and infrastructure.
To balance accuracy against this cost, we used two features of Amazon Forecast:
- ingest additional, incremental data and use the existing predictor to generate day-ahead and intraday forecasts. No retraining was done in this case. Note that the full dataset, including the incremental data points, needs to be reimported, not just the incremental CSV. This approach shortens the time to start making new predictions and saves the cost of retraining.
- ingest additional, incremental data, retrain the predictor on the new dataset, and then use the newly trained predictor to forecast the power demand. Here too, the full dataset, including the incremental data points, needs to be reimported, not just the incremental CSV.
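One way to choose between these two paths on each run is a simple age check on the predictor, sketched below. The step names and the retraining intervals are hypothetical; the source states only that the intraday predictor was retrained more often than the day-ahead predictor.

```python
from datetime import datetime, timedelta
from typing import List


def refresh_steps(
    last_trained: datetime, now: datetime, retrain_every: timedelta
) -> List[str]:
    """Return the ordered pipeline steps for one refresh run.

    The full dataset is always reimported. The predictor is retrained only
    when it is older than `retrain_every`; otherwise the existing predictor
    is reused to create the new forecast.
    """
    steps = ["reimport_full_dataset"]
    if now - last_trained >= retrain_every:
        steps.append("retrain_predictor")
    steps.append("create_forecast")
    return steps


# Hypothetical cadences: the intraday predictor is retrained more often.
INTRADAY_RETRAIN_EVERY = timedelta(days=1)
DAY_AHEAD_RETRAIN_EVERY = timedelta(days=7)
```

In a workflow built with AWS Step Functions, each returned step name could correspond to one state, with the retraining state skipped when it is not in the list.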
Step 3. Generate forecasts and build dashboards for comparison with actuals
After the predictors were trained, we used them to generate forecasts. Forecast generation ran on a schedule, using a scheduled trigger from Amazon CloudWatch, which collects and visualizes near-real-time data in automated dashboards, and a function in AWS Lambda, a serverless, event-driven compute service. The code in the AWS Lambda function that generates forecasts was written using the AWS SDK for Python (Boto3). The AWS Lambda function could also be invoked on demand by passing the start and end times of the interval to forecast. The forecasts were also saved as a CSV file in Amazon S3 for dashboarding and comparison with the actuals.
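The source does not include the function code, so the following is only a minimal sketch of what such a handler might look like. It calls `query_forecast` on the Boto3 `forecastquery` client; the event field names (`forecast_arn`, `item_id`, `start`, `end`) and the returned structure are assumptions. The optional `client` argument exists so that the handler can be exercised without AWS credentials.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%S"  # timestamp format expected by QueryForecast


def build_query(forecast_arn: str, item_id: str, start: datetime, end: datetime) -> dict:
    """Keyword arguments for the forecastquery client's query_forecast call."""
    return {
        "ForecastArn": forecast_arn,
        "StartDate": start.strftime(TS_FORMAT),
        "EndDate": end.strftime(TS_FORMAT),
        "Filters": {"item_id": item_id},
    }


def lambda_handler(event, context, client=None):
    """Return the median (p50) forecast for the requested interval."""
    if client is None:
        import boto3  # imported lazily so the sketch can be tested offline

        client = boto3.client("forecastquery")
    params = build_query(
        event["forecast_arn"],  # hypothetical event fields
        event["item_id"],
        datetime.fromisoformat(event["start"]),
        datetime.fromisoformat(event["end"]),
    )
    response = client.query_forecast(**params)
    predictions = response["Forecast"]["Predictions"]
    return {
        "item_id": event["item_id"],
        "p50": [
            {"timestamp": p["Timestamp"], "demand_mw": p["Value"]}
            for p in predictions.get("p50", [])
        ],
    }
```

A scheduled invocation would pass the next forecasting window in the event, while an on-demand invocation would pass whatever start and end times the caller needs.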
At the end of every day, the actual demand for that day was also ingested as a separate CSV file. Amazon QuickSight was then used to develop forecast-versus-actual dashboards and share them with stakeholders.
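The source does not state which metric underlies the reported accuracy figures. One common convention, used here purely as an assumption, is to report accuracy as 100 minus the mean absolute percentage error (MAPE) over the blocks being compared. The function name is hypothetical.

```python
from typing import Sequence


def accuracy_percent(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Return 100 minus MAPE (in percent) for paired actual and forecast values.

    Blocks with an actual value of zero are skipped, because the percentage
    error is undefined for them.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast) if a != 0]
    if not errors:
        raise ValueError("no non-zero actual values to compare against")
    mape = 100.0 * sum(errors) / len(errors)
    return 100.0 - mape
```

Applied to the 96 blocks of a day (or the 8 blocks of an intraday window), this yields a single figure per forecast that can be plotted next to the forecast and actual curves in a dashboard.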
Results
PTC applied Amazon Forecast to electricity-demand forecasting using 2 years of consumption history, with 96 blocks of data for each day. Using Amazon Forecast, PTC achieved 98 percent accuracy for intraday forecasting and 95 percent for day-ahead forecasting. This was an improvement over its prior methods, which provided only 85 percent and 82 percent accuracy, respectively. An additional benefit was that, by using the AWS Cloud, PTC could deploy electricity-demand forecasting based on Amazon Forecast in a matter of weeks (versus months on premises) and with virtually unlimited compute and storage capacity, something not possible with on-premises solutions.
References
1. https://aws.amazon.com/forecast/
2. https://docs.aws.amazon.com/forecast/?icmpid=docs_homepage_ml
3. https://github.com/aws-samples/amazon-forecast-samples/tree/main/ml_ops
4. https://docs.aws.amazon.com/forecast/latest/dg/howitworks-missing-values.html
About PTC
PTC is a pioneer in starting a power market in India and undertakes trading activities that include long-term trading of power generated from large power projects, as well as short-term trading arising from supply-and-demand mismatches that inevitably arise in various regions of the country. Since July 2001, when it started trading power on a sustainable basis, PTC has provided optimal value to both buyers and sellers while ensuring optimum usage of resources. PTC has grown from strength to strength, surpassing expectations of growth and has evolved into a Rs.3,496.60 Crore as of March 2020, with a client base that covers all the state utilities of the country, as well as some power utilities in the neighboring countries. https://www.ptcindia.com/
To learn more about how AWS is helping transform the energy industry and optimize businesses, visit the AWS for Energy page.