AWS for Industries
Automating Wind Farm Maintenance Using Drones and AI
Turbine maintenance is an expensive, high-risk task. According to a recent analysis from the news website, wind farm owners are expected to spend more than $40 billion on operations and maintenance over a decade. Another recent study finds that using drone-based inspection instead of traditional rope-based inspection can reduce operational costs by 70% and decrease revenue lost to downtime by up to 90%.
This blog post presents how drones, machine learning (ML), and the Internet of Things (IoT) can be used at the edge and in the cloud to make turbine maintenance safer and more cost-effective. First, we trained the ML model in the cloud to detect hazards on the turbine blades, including corrosion, wear, and icing. You can find the details of the machine learning part in our previous blog post. The model is then deployed at the edge to enable safer and quicker inspections that also work offline, without requiring an uninterrupted connection to the cloud. Uniquely, you can bring in your existing maintenance team, which engages with the findings through Amazon Augmented AI based on your custom alerts and conditions. This approach lets the experts work as you specify and keeps the ML model in a continuous learning process. This blog post is a continuation of the re:Invent 2020 builder fair project, Automating Wind Turbine Maintenance Using Drones and AI.
The solution uses AWS managed services to minimize your operational overhead. It makes extensive use of AWS serverless technologies, which scale automatically from zero to peak demand and save cost because you never over-provision resources. The solution is also fully automated, including the human review workflows. The AWS services used, and the reasons for using them, are as follows:
- AWS Lambda, a serverless compute service, implements the business logic in the process.
- Amazon EventBridge, a serverless event bus solution, is used for managing the events and scheduling activities such as maintenance intervals for syncing ML models between the edge and the cloud.
- Amazon SNS, a fully managed messaging service, performs the application-to-person (A2P) communication to the teams in the field or office (Field Teams in Figure 1). We did not need Amazon SNS for application-to-application communication: the services either communicate natively or require business logic, in which case the AWS Lambda function that implements the logic also performs the communication.
- Amazon S3, an object storage service, stores important data such as images and machine learning models.
- Amazon SageMaker, a machine learning service, is the centerpiece of the machine learning technology in this solution. We also employed several Amazon SageMaker features: Amazon SageMaker Ground Truth to quickly label the training images, Amazon Augmented AI to perform human review processes, Amazon SageMaker Neo to optimize the ML model for inference at the edge, and Amazon SageMaker Edge Manager to manage and monitor the ML model at the edge.
- AWS IoT Greengrass, an open-source edge runtime and cloud service for IoT, stays at the center of the machine learning inference at the edge.
- AWS IoT Core, a service to securely and reliably connect billions of devices, routes the messages between AWS and the edge devices. In our case, these devices are wind turbines.
- AWS IoT SiteWise, a managed service that collects, stores, organizes, and monitors data from industrial equipment at scale, is used for monitoring the wind turbines.
- AWS IoT Analytics, a fully managed service to perform sophisticated analytics on massive volumes of IoT data, conducts the analytical operations based on the telemetry data coming from wind turbines.
- AWS IoT Events, a service for building complex event-monitoring applications, is used for detecting possible failures and maintenance patterns for the wind turbines monitored.
- Amazon QuickSight, a scalable, serverless, embeddable, ML-powered business intelligence (BI) service, is used for business intelligence reporting, from turbine monitoring to machine learning model behavior.
Figure 1 – The Solution Architecture.
We grouped the steps in the architecture shown in Figure 1 into two parts: AI/ML and IoT.
Steps Incorporating AI/ML
The steps for the machine learning part start with step-0, where you perform the initial model training. Step-0 is a one-time activity, so we numbered it “0.”
0-a. You can start by uploading your reference turbine pictures and turbine defect pictures to the Amazon S3 bucket called Image Pool in Figure 1. These pictures will be your initial ground truth. Generally, our customers use their historically collected image database, which is more than enough to train a high-confidence model. Please see this blog post for more information.
0-b. You can now use Amazon SageMaker Ground Truth to label the uploaded ground truth images. Amazon SageMaker Ground Truth has a user-friendly interface that does not require AI/ML expertise to perform the labeling. You can also build your own user interface, if desired. In our proposal, you employ your existing expert teams as the labeling workforce. However, you can also use Amazon Mechanical Turk or third-party vendor-managed workforces.
1. Since Amazon SageMaker natively works with Amazon SageMaker Ground Truth, you can train directly on the manifest file produced by the Amazon SageMaker Ground Truth labeling job. You can use the image classification, object detection, or semantic segmentation algorithms available on SageMaker, which are discussed in our previous blog post. Please note that Amazon Rekognition models cannot be deployed on AWS IoT Greengrass as of today. Hence, to perform inference at the edge, you need to use Amazon SageMaker.
2. The trained model files are then stored in an Amazon S3 bucket (ML Models). Amazon SageMaker Neo then compiles this model into a hardware-optimized model (for NVIDIA hardware in our case). The compiled model is then packaged by Amazon SageMaker Edge Manager for model management on the edge devices.
3. Amazon SageMaker Edge Manager integrates with AWS IoT Greengrass V2 to make deployment and maintenance of the model and the Edge Manager agent easier. You use the agent to make predictions on the devices. You can follow the instructions to use SageMaker Edge Manager on a Greengrass core device.
4. The gateway device in our use case was an NVIDIA Jetson Nano board with AWS IoT Greengrass V2 installed. AWS IoT Greengrass serves multiple purposes. First, it allows you to collect telemetry data from turbines and send it to the cloud in batches (via step-11 in Figure 1). It also allows the gateway to operate in offline mode, using local device shadows and locally run Lambda functions. This is especially important for remote wind turbine fields where a reliable internet connection cannot be guaranteed. Finally, AWS IoT Greengrass with the SageMaker Edge Manager agent also allows you to run inference on the images collected from drones and detect potential issues through the local AWS Lambda function called Assessment at Edge. With this Lambda function, you can apply any business logic to the results of the inference at the edge.
Figure 2. Scenes taken from the re:Invent 2020 builder fair video showing (a) the drone flying in front of the turbine, (b) an example of corrosion, represented by brown color, identified as corrosion by the Amazon SageMaker model.
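The business logic that the Assessment at Edge function applies to an inference result can be as simple as a routing rule. The following is a minimal Python sketch, not the code used in the project: the `Prediction` type, the `needs_human_review` function, the 0.80 threshold, and the set of labels treated as critical are all hypothetical and would be replaced by your own conditions.

```python
from dataclasses import dataclass

# Hypothetical set of defect classes that always warrant an expert look.
CRITICAL_LABELS = {"corrosion", "icing"}


@dataclass(frozen=True)
class Prediction:
    label: str         # class predicted at the edge, e.g. "wear" or "healthy"
    confidence: float  # model score in [0, 1]


def needs_human_review(pred: Prediction, min_confidence: float = 0.80) -> bool:
    """Return True when the finding should be escalated to a human expert.

    Escalate when the model is unsure (score below the threshold) or when
    the predicted class is one of the critical findings.
    """
    if not 0.0 <= pred.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    low_confidence = pred.confidence < min_confidence
    critical = pred.label in CRITICAL_LABELS
    return low_confidence or critical
```

With a rule like this, confident predictions of a healthy blade stay at the edge, and only uncertain or critical findings cause an image upload and an expert review.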
5. We employed the AWS Lambda function Assessment at Edge to initiate the human review process when a desired condition is met. For example, this Lambda function can assess the confidence level of the classification and, if the confidence is low enough or the finding is critical, initiate the human review loop. AWS IoT Greengrass then securely uploads the image to the Amazon S3 Image Pool bucket, which raises an Amazon S3 event notification.
6. The Amazon S3 event notification triggers an AWS Lambda function, called Augmented AI, to start the Amazon Augmented AI human loop.
7. Now the Amazon Augmented AI loop kicks in. You can create custom human review task templates for your expert teams. The team can review the inference performed at the edge together with the actual image. Even though this step is customizable, we assume that, at minimum, the expert team can accept or supersede the inference performed by the ML model at the edge. Keeping human experts in the loop further improves the reliability of the overall system. Meanwhile, the expert teams are not notified of every inference, since the initial business logic passes on only the inference results that may require an expert human reviewer.
The best part of the Augmented AI workflow is that the ML model is automatically retrained on the human review results (step-1), so the model keeps learning from human experts over time. The newly trained model then follows steps 2 – 3 to be deployed at the edge again. Retraining does not need to happen after every review. Business logic in an AWS Lambda function (the Assessment on the Cloud Lambda function in Figure 1) can schedule retraining once an adequate batch of reviews has accumulated or at a maintenance interval scheduled by Amazon EventBridge. Together, steps 1 – 3 and steps 5 – 6 form the so-called Continuous Learning Loop: the architecture allows continuous learning from human reviewers in the cloud and synchronizes the result back to the edge.
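The retraining decision described above (a large enough batch of reviews, or a scheduled maintenance interval) can be expressed as a small predicate. This is a sketch under stated assumptions rather than the project's actual Lambda code: the function name `should_retrain`, the default batch size of 200, and the 30-day interval are hypothetical, as is the extra rule that an elapsed interval triggers retraining only when at least one new review exists.

```python
from datetime import datetime, timedelta, timezone


def should_retrain(
    reviews_since_last_training: int,
    last_training: datetime,
    now: datetime,
    min_batch_size: int = 200,                    # hypothetical batch threshold
    max_interval: timedelta = timedelta(days=30), # hypothetical maintenance interval
) -> bool:
    """Decide whether a new training job should be started.

    Retrain when enough human-reviewed images have accumulated, or when the
    maintenance interval has elapsed and there is at least one new review
    to learn from (an assumption, to avoid retraining on unchanged data).
    """
    if reviews_since_last_training >= min_batch_size:
        return True
    interval_elapsed = (now - last_training) >= max_interval
    return interval_elapsed and reviews_since_last_training > 0
```

A function like this could run in the cloud-side Lambda on each completed review and on each scheduled Amazon EventBridge invocation, with the counters kept wherever you track review results.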
8. Together with Amazon EventBridge, the Assessment on the Cloud Lambda function can take actions based on the human review decisions.
9. One possible action is to publish a message to an Amazon Simple Notification Service (Amazon SNS) topic, as shown in Figure 1.
10. The team in the field receives notifications on their mobile devices.
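Steps 8 through 10 can be sketched as a function that turns one human review decision into the parameters of an Amazon SNS publish call. The shape of the `review` record (`turbine_id`, `image_uri`, `model_label`, `expert_label`), the `"healthy"` label, and the function name are hypothetical, since the output of a human loop depends on your own task template. Only `TopicArn`, `Subject`, and `Message` are actual parameters of the SNS Publish API.

```python
import json
from typing import Optional


def build_notification(review: dict, topic_arn: str) -> Optional[dict]:
    """Map a human review decision to keyword arguments for SNS Publish.

    Returns None when the expert found no defect, so that the field team
    is not notified about healthy blades.
    """
    expert_label = review["expert_label"]
    if expert_label == "healthy":  # hypothetical "no defect" label
        return None

    body = {
        "turbine_id": review["turbine_id"],
        "image_uri": review["image_uri"],
        "finding": expert_label,
        # True when the expert superseded the inference made at the edge.
        "model_overridden": expert_label != review["model_label"],
    }
    subject = f"Turbine {review['turbine_id']}: {expert_label} confirmed"
    return {
        "TopicArn": topic_arn,
        "Subject": subject[:100],  # SNS subjects are limited to 100 characters
        "Message": json.dumps(body),
    }
```

In a Lambda function, the returned dictionary could be passed on as `boto3.client("sns").publish(**params)` whenever it is not `None`.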
Steps Incorporating IoT
We now go back to the edge to focus on the IoT part of our solution. Since the process is continuous, we continue our numbering with step-11.
11. AWS IoT Greengrass sends the wind farm telemetry data to AWS IoT Core, and from there the data is fanned out to multiple services for various use cases. The data read from the turbines includes power generation (kWh), rotations per minute (rpm), and torque (Nm).
12. The AWS IoT Core rules engine sends the data to AWS IoT Events, where a detector model for the turbines detects various error states and alerts maintenance staff via Amazon SNS. You can follow these steps to build your detector model.
13. The turbine data is also sent to AWS IoT SiteWise Monitor, a feature of AWS IoT SiteWise that provides portals in the form of managed web applications. Domain experts can use these portals to get insights into the data and build operational dashboards for site supervisors to monitor wind turbines.
14. Finally, the data is sent to AWS IoT Analytics for batch analytics. The output of this step is used for building business intelligence reporting via Amazon QuickSight.
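A detector model in AWS IoT Events is configured in the service itself rather than written as application code, but the idea behind step 12 is a state machine driven by telemetry. The following local Python illustration shows that idea only. The class name, the `rpm` and `torque_nm` field names, the threshold values, and the rule of three consecutive breaches are hypothetical and are not taken from the solution.

```python
from dataclasses import dataclass, field


@dataclass
class TurbineDetector:
    """Two-state detector: NORMAL until several consecutive readings breach a limit."""

    max_rpm: float = 20.0          # hypothetical rotor speed limit
    max_torque_nm: float = 5.0e6   # hypothetical torque limit
    breaches_to_alarm: int = 3     # consecutive breaches required to raise the alarm
    state: str = "NORMAL"
    _breaches: int = field(default=0, init=False, repr=False)

    def update(self, reading: dict) -> str:
        """Consume one telemetry reading and return the resulting state."""
        breached = (
            reading["rpm"] > self.max_rpm
            or reading["torque_nm"] > self.max_torque_nm
        )
        if breached:
            self._breaches += 1
            if self._breaches >= self.breaches_to_alarm:
                self.state = "ALARM"
        else:
            # A single in-range reading clears the counter and the alarm.
            self._breaches = 0
            self.state = "NORMAL"
        return self.state
```

Requiring several consecutive breaches is one way to avoid alerting maintenance staff on a single noisy reading. In the actual service, the transition into the alarm state would be the point where the Amazon SNS action is attached.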
In this post, we proposed a solution to automate wind turbine visual inspection using AWS services. The solution has a serverless architecture that uses AWS IoT and AI/ML technologies to perform drone imaging-based inspection at the edge. It brings complete automation while still involving your existing expert employees in the inspection process when necessary. The machine learning model keeps learning from the human reviewers through a continuous learning loop, and each retrained model is deployed back to the edge. Moreover, the solution can work in offline mode using AWS IoT Greengrass.