Advancing Maintenance Maturity of Distributed IoT Applications with AWS Greengrass and AWS Step Functions

Shane Baldacchino is a Solutions Architect at Amazon Web Services.

Customers have been asking how compute at the edge can be coupled with AWS services to advance maintenance maturity in their organization.

In this blog post, we will examine maintenance maturity through the lens of something very common: elevators. We will show how you can use AWS Greengrass and AWS IoT in combination with other AWS services to build an architecture that will help your organization predict, model, and identify a performance issue or impending failure before it occurs.

The following diagram shows a standard maintenance maturity model. Predictive maintenance (PdM) increases operational efficiency, safety, and customer satisfaction.

Providing compute at the edge

AWS Greengrass is software that provides local compute, messaging, data caching, and sync capabilities for connected devices in a secure way. With AWS Greengrass, connected devices can run AWS Lambda functions, keep device data in sync, and communicate with other devices securely, even when not connected to the internet. Using Lambda, Greengrass ensures your IoT devices can respond quickly to local events, operate with intermittent connections, and minimize the cost of transmitting IoT data to the cloud.

Most elevator systems operate autonomously and provide limited device-to-device interaction. Because the maintenance operator has little or no visibility into the system, the total cost of ownership (TCO) is higher than it needs to be.

In this example, we use Greengrass to control our IoT things, the elevators. The Greengrass core is the heart of Greengrass. It runs on both x86 and ARM architectures and has modest requirements. The core provides command and control of our elevators through local long-running Lambda functions. It also aggregates and filters data and performs iterative learning. Because elevators contain hundreds of sensors, we can use Greengrass to monitor everything from motor temperature to cab speed, and we can feed this information into local Lambda functions to drive maturity in our maintenance practices.

For the purpose of this blog post, we use Raspberry Pi 3 devices to simulate Greengrass control of the elevators. The Greengrass core runs on one Raspberry Pi 3. Our two elevators are each simulated by a Raspberry Pi 3 with a combination of a Raspberry Pi Sense HAT, local Lambda functions, and device code.
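
To make the aggregation and filtering role of the core more concrete, here is a minimal Python sketch of how a local Lambda function might collapse a window of raw readings into a single summary record before anything is sent to the cloud. The function name `summarize_window`, the field name `motor_temperature`, and the window contents are assumptions made for illustration; they are not taken from the elevator demo itself.

```python
from statistics import mean
from typing import Dict, Iterable, List


def summarize_window(readings: Iterable[Dict[str, float]], key: str) -> Dict[str, float]:
    """Collapse a window of raw readings into one summary record for `key`."""
    # Skip readings that do not carry the requested sensor value.
    values: List[float] = [r[key] for r in readings if key in r]
    if not values:
        return {}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": round(mean(values), 2),
    }


# Hypothetical window of readings collected on the core.
window = [
    {"motor_temperature": 61.0},
    {"motor_temperature": 63.5},
    {"motor_temperature": 62.5},
]
summary = summarize_window(window, "motor_temperature")
```

Sending one summary per window instead of every raw reading is one way to keep the volume of data transmitted to the cloud down, which is the cost benefit described above.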

Let’s look at our physical architecture:

Key elevator elements

1. Greengrass core
  
  The core is responsible for local Lambda execution, messaging, device shadows, and security, and for interacting directly with the AWS Cloud. It controls our elevators through local MQTT messages and local Lambda functions. It also sends elevator-specific metrics to the AWS Cloud.
  
  You can invoke a Lambda function on the Greengrass core through device shadow updates or as a subscriber to a local MQTT topic. We are using a Lambda function for the following:
  - To publish elevator telemetry to a local MQTT topic. The Greengrass core aggregates the telemetry before it is streamed back to the AWS Cloud for more processing and aggregation.
  - To evaluate sensor data locally. This can result in an action being performed on the elevator. For example, if the elevator motor overheats, the elevator is taken out of service and placed into maintenance.
  - To track elevator floor status and availability through device shadow updates. The device shadow updates are consumed by a custom web interface, which provides a means to visualize our elevators.
  - To determine, based on the local device shadow, if the elevators are available for use. If available, the Lambda function sends the elevators to a random floor by setting the desired state. Because a Lambda function can run for an unlimited amount of time on Greengrass, this Lambda function sleeps for 30 seconds before it sends the elevators to another random floor.
2. Device-specific code
  
  On initial startup, the elevators use the Greengrass Discovery API to locate and connect securely to the Greengrass core. Our device-specific code publishes elevator-specific data, such as motor temperature, shaft vibration, door speed, current floor, and availability, to the local MQTT queue. The device-specific code can also receive messages from the local MQTT queue, which provides a channel for duplex messaging.
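
The local evaluation of sensor data described in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from the demo: the 80 °C threshold, the `elevator_id` and `motor_temperature` field names, and the `status` shadow attribute are all hypothetical, and `publish` stands in for whatever messaging client the Lambda function uses on the core.

```python
import json
from typing import Any, Callable, Dict, Optional

# Assumed threshold for illustration; a real limit would come from the motor's specification.
MAX_MOTOR_TEMP_C = 80.0


def evaluate_reading(reading: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a shadow update that requests maintenance, or None if the reading is healthy."""
    if reading.get("motor_temperature", 0.0) > MAX_MOTOR_TEMP_C:
        return {"state": {"desired": {"status": "maintenance"}}}
    return None


def handle_telemetry(event: Dict[str, Any], publish: Callable[[str, str], None]) -> bool:
    """Evaluate one telemetry message and publish a shadow update if action is needed."""
    update = evaluate_reading(event)
    if update is None:
        return False
    # Device shadow update topic for the elevator that sent the reading.
    topic = "$aws/things/{}/shadow/update".format(event["elevator_id"])
    publish(topic, json.dumps(update))
    return True


# Usage with a stand-in publisher that records what would have been sent.
sent = []
handle_telemetry(
    {"elevator_id": "elevator-1", "motor_temperature": 95.0},
    lambda topic, payload: sent.append((topic, payload)),
)
```

Keeping the decision in a pure function such as `evaluate_reading` makes the rule easy to test off-device, while the thin `handle_telemetry` wrapper is the only part that touches messaging.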

Providing local visualization

We can consume the device shadows over MQTT to visualize the status of our elevators in a web application. We can also perform remote command and control by updating the desired state of the device shadows. This, in turn, synchronizes the AWS IoT device shadow with the Greengrass device shadow, which is then interpreted by our device-specific code to carry out the command.

Using this pattern of remote command and control, we can optimize the morning rush so that the elevators are relocated to the ground floor to minimize waiting. Or we can take inputs from a building control system so that when a VIP uses a key card to enter the parking garage, an elevator is prioritized and dispatched.

This pattern also allows an operations center to remotely move stuck elevators or take elevators in and out of service during maintenance events.

All of these scenarios are accomplished by reading the reported state of the device in AWS IoT and setting the desired state. Changes to the desired state flow through AWS IoT to the Greengrass core and on to the elevators.
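
As a rough illustration of the desired/reported pattern, the following Python sketch builds a desired-state update and works out which attributes a device still has to act on. The shadow document layout (`state.desired`, `state.reported`) and the `$aws/things/<thingName>/shadow/update` topic follow the AWS IoT device shadow conventions; the thing name `elevator-1` and the `floor` and `status` attributes are assumptions made for this example.

```python
import json
from typing import Any, Dict


def shadow_update_topic(thing_name: str) -> str:
    """MQTT topic used to publish updates to a thing's device shadow."""
    return "$aws/things/{}/shadow/update".format(thing_name)


def desired_state_payload(**desired: Any) -> str:
    """Build the JSON payload that sets the desired state of a shadow."""
    return json.dumps({"state": {"desired": desired}})


def pending_changes(shadow: Dict[str, Any]) -> Dict[str, Any]:
    """Return the desired attributes that the device has not yet reported."""
    state = shadow.get("state", {})
    desired = state.get("desired", {})
    reported = state.get("reported", {})
    return {k: v for k, v in desired.items() if reported.get(k) != v}


# Morning rush: ask a hypothetical elevator to return to the ground floor.
topic = shadow_update_topic("elevator-1")
payload = desired_state_payload(floor=0)

# The elevator is on floor 7 and in service, so only the floor still differs.
shadow = {
    "state": {
        "desired": {"floor": 0, "status": "in_service"},
        "reported": {"floor": 7, "status": "in_service"},
    }
}
todo = pending_changes(shadow)
```

Here `todo` contains only the `floor` attribute, which is the part of the desired state the device-specific code would still need to act on.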

Increasing our capability maturity

By using multiple AWS services, we can build an architecture that will help us achieve PdM maturity.

Step 1: Providing detective capabilities

Synchronizing elevator telemetry with AWS IoT opens up a world of possibility. The AWS IoT rules engine provides a machinelearning_predict function that can evaluate IoT messages against an Amazon Machine Learning (Amazon ML) model.

As an example, we can create a model to predict motor reliability and evaluate data from messages being synchronized with AWS IoT against this model. The following rule publishes the predicted label as a CloudWatch metric:

{
	"sql": "SELECT * FROM 'elevator/+/motor_temperature'",
	"ruleDisabled": false,
	"actions": [{
		"cloudwatchMetric": {
			"roleArn": "arn:aws:iam::XXXXXXXXXXXX:role/iam_role",
			"metricName": "maintenance-status",
			"metricNamespace": "pm-metrics",
			"metricValue": "${machinelearning_predict('motor_reliability', 'arn:aws:iam::XXXXXXXXXXXX:role/iam_role', *).predictedLabel}",
			"metricUnit": "None"
		}
	}]
}

We are now moving into predictive territory.

Step 2: Notifying and orchestrating

Let’s assume the AWS IoT rules engine used a predict function against an Amazon Machine Learning model and an anomaly or leading indicator for failure has been detected. This information can be used to orchestrate a workflow that will get the elevator repaired.

By using Amazon CloudWatch alarms and Amazon Simple Notification Service (Amazon SNS), you can use an approach like the one described in the “Fanout” section of Common Amazon SNS Scenarios in the Amazon SNS Developer Guide. In this approach, you publish a message to an SNS topic not only to provide notification, but also to start a workflow through a Lambda function.
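
A Lambda function subscribed to the SNS topic could look roughly like the Python sketch below. It reads the CloudWatch alarm notification from the SNS record and starts a Step Functions execution for each alarm that has entered the ALARM state. The state machine ARN and the shape of the execution input are placeholders invented for this example, and the `sfn_client` parameter exists only so that the function can be exercised without AWS credentials.

```python
import json
from typing import Any, Dict, List

# Placeholder ARN for illustration; replace with the ARN of your own state machine.
STATE_MACHINE_ARN = (
    "arn:aws:states:us-east-1:XXXXXXXXXXXX:stateMachine:elevator-maintenance"
)


def handler(event: Dict[str, Any], context: Any, sfn_client: Any = None) -> List[str]:
    """Start one Step Functions execution per CloudWatch alarm delivered through SNS."""
    if sfn_client is None:
        # Imported lazily so the module can be loaded where boto3 is not installed.
        import boto3
        sfn_client = boto3.client("stepfunctions")

    started = []
    for record in event.get("Records", []):
        # CloudWatch delivers the alarm details as a JSON string in the SNS message.
        alarm = json.loads(record["Sns"]["Message"])
        if alarm.get("NewStateValue") != "ALARM":
            continue  # Ignore OK and INSUFFICIENT_DATA transitions.
        response = sfn_client.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps({
                "alarm": alarm.get("AlarmName"),
                "reason": alarm.get("NewStateReason"),
            }),
        )
        started.append(response["executionArn"])
    return started
```

Because other subscribers to the same topic (email, SMS, and so on) receive the message independently, the notification and the workflow stay decoupled, which is the point of the fanout approach.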

Step 3: Providing state coordination

We can use AWS Step Functions, a service that is part of the AWS serverless platform, to provide branch logic and coordination. Step Functions makes it simple to orchestrate Lambda functions for serverless applications.

Using Step Functions, we can construct a state machine that can take input in the form of JSON from our Lambda function. Our state machine consists of steps and transitions between each step that, depending on the issue detected, take a different path for resolution.
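
To give a feel for what such a state machine might look like, the Python sketch below assembles a small Amazon States Language definition with a Choice state that routes each issue type down a different path. The state names, issue values, and Lambda function names are hypothetical, the ARN prefix is a placeholder, and the approval step is shown as a plain Task for brevity rather than as a full manual approval implementation.

```python
import json
from typing import Any, Dict, Optional

# Placeholder prefix; the account ID and Region would be your own.
LAMBDA_ARN_PREFIX = "arn:aws:lambda:us-east-1:XXXXXXXXXXXX:function:"


def task(function_name: str, next_state: Optional[str] = None) -> Dict[str, Any]:
    """Build a Task state that invokes a Lambda function, then moves on or ends."""
    state: Dict[str, Any] = {"Type": "Task", "Resource": LAMBDA_ARN_PREFIX + function_name}
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "Comment": "Illustrative elevator maintenance workflow",
    "StartAt": "ClassifyIssue",
    "States": {
        # Branch on the issue type carried in the JSON input.
        "ClassifyIssue": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.issue", "StringEquals": "motor_overheat",
                 "Next": "RequestApproval"},
                {"Variable": "$.issue", "StringEquals": "door_slow",
                 "Next": "BookJobInErp"},
            ],
            "Default": "NotifyOperations",
        },
        "RequestApproval": task("request-approval", "BookJobInErp"),
        "BookJobInErp": task("book-erp-job"),
        "NotifyOperations": task("notify-operations"),
    },
}


def undefined_targets(machine: Dict[str, Any]) -> set:
    """Return any transition targets that do not name a defined state."""
    states = machine["States"]
    targets = {machine["StartAt"]}
    for state in states.values():
        targets.update(t for t in (state.get("Next"), state.get("Default")) if t)
        targets.update(choice["Next"] for choice in state.get("Choices", []))
    return targets - set(states)


definition_json = json.dumps(definition, indent=2)
```

The `undefined_targets` helper is a small sanity check that every transition points at a state that exists; `definition_json` is the string you would supply when creating the state machine.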

Each step correlates to a Lambda function. A step could be performing a manual approval process and waiting for the building supervisor to approve the work or booking the job in the ERP platform. Step Functions logs the state of each step, so if things go wrong, you can diagnose and debug problems quickly.

This example uses a manual approval process. The state machine is waiting for approval of the maintenance request. For more information about the manual approval process used in this example, see the Implementing Serverless Manual Approval Steps in AWS Step Functions and Amazon API Gateway blog post.

Summary

In this post, we described a pattern that you can use to advance PdM in your organization. AWS Greengrass brings compute to the edge, allowing you to respond to local events quickly and provide operational intelligence to your devices, even without access to the internet. Greengrass uses a Lambda programming model, which helps reduce the cost of developing IoT applications.

With local Lambda functions and the ability to use services such as Amazon Machine Learning and AWS Step Functions, you can drive maturity in your business, climb the PdM maturity curve, and ultimately save your organization time and money.

For a video of the content included in this post, see the AWS Developer Day video.

If you have questions or other feedback, please leave it in the comments below.

The Internet of Things on AWS – Official Blog
