The Internet of Things on AWS – Official Blog

Part 2/2: Building Reliable IoT Device Software Using AWS IoT Core Device Advisor

This post was co-written by David Walters, Sr Partner Solutions Architect, AWS IoT, and Pavan Kumar Bhat, Sr. Technical Product Manager, AWS IoT Device Ecosystem.

Introduction

This is the second post in a two-part series. In the first post, I explained the importance of testing IoT devices and how AWS IoT Core Device Advisor works. In this post, I share my experience using Device Advisor to debug the software on an IoT device, using a real-world industrial condition monitoring example.

Solution Overview

First, let’s examine an IoT solution built with AWS Partner solutions and AWS services to monitor and visualize the state of three water pumps controlled by Programmable Logic Controllers (PLCs). PLCs are ruggedized computers used to control industrial assets.

Figure 1: Solution architecture for water pump condition monitoring.

The solution highlighted in this blog post covers the following AWS Partner solutions and AWS services:

  • Everyware Software Framework (ESF) is a flexible application framework that runs on IoT gateways and integrates field protocol libraries. ESF has been qualified for AWS IoT Core by Eurotech, and devices running ESF are listed in the AWS Partner Device Catalog.
  • AWS IoT Core lets you connect IoT devices to the AWS Cloud without the need to provision or manage servers. AWS IoT Core can support billions of devices and trillions of messages, and can process and route those messages to AWS services and to other devices reliably and securely.
  • Amazon Timestream is a fast, scalable, and serverless time series database service for IoT and operational applications that makes it easy to store and analyze trillions of events per day.
  • Amazon Managed Service for Grafana (AMG) is a fully managed and secure data visualization service that enables customers to instantly query, correlate, and visualize operational metrics, IoT data, and traces for their applications from multiple data sources.

Each PLC generates data about the state of the water pumps and sends it over Modbus TCP, a standard communication protocol used by PLCs, to an IoT gateway running Everyware Software Framework. An ESF Wires application collects the data over Modbus TCP and publishes it to AWS IoT Core using the AWS IoT Core Connector for ESF.

Once the data arrives on AWS IoT Core, an IoT Rule is triggered for each message and moves data into Amazon Timestream. Finally, the data is visualized in Amazon Managed Service for Grafana for operators to see the real-time status of the water pumps. If the IoT device is not reliable while deployed in the field, critical condition monitoring data may not be seen by an operator, leading to pump downtime and possible flooding.
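To make the rule step more concrete, here is a minimal sketch of what a topic rule payload for this flow might look like. The post does not show the actual rule, so the SQL statement, the dimension names, and the role, database, and table names below are all assumptions for illustration. The dictionary follows the shape that boto3's `create_topic_rule` accepts as `topicRulePayload`; the sketch only builds and prints it, without calling AWS.

```python
import json


def build_timestream_rule_payload(role_arn, database, table):
    """Build a topic rule payload that routes every 'ws/#' message to Timestream.

    The SQL and the dimension layout are assumptions; the post only states
    that the rule is triggered on 'ws/#'.
    """
    return {
        "sql": "SELECT * FROM 'ws/#'",
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        "actions": [
            {
                "timestream": {
                    "roleArn": role_arn,
                    "databaseName": database,
                    "tableName": table,
                    # topic(n) is 1-indexed: for 'ws/zone1/pump1/data',
                    # topic(2) is 'zone1' and topic(3) is 'pump1'.
                    "dimensions": [
                        {"name": "zone", "value": "${topic(2)}"},
                        {"name": "pump", "value": "${topic(3)}"},
                    ],
                }
            }
        ],
    }


payload = build_timestream_rule_payload(
    role_arn="arn:aws:iam::123456789012:role/PumpTimestreamRole",  # hypothetical
    database="PumpMonitoring",  # hypothetical
    table="PumpTelemetry",  # hypothetical
)
print(json.dumps(payload, indent=2))
```

Taking the zone and pump from the topic levels keeps the device payload small, but that is a design choice of this sketch rather than something the original solution is known to do.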

To see the condition and operating status of the water pumps, I open up the Amazon Managed Service for Grafana dashboard.

Figure 2: Amazon Managed Grafana dashboard with no data available.

Using Device Advisor to Debug the Device Software

As Figure 2 shows, no data is currently appearing on the Grafana dashboard. This can indicate that data is not arriving on AWS IoT Core due to connectivity or reliability issues. AWS IoT Core Device Advisor can help debug such issues. To start debugging, I create a test suite in the AWS IoT Core Device Advisor console, specific to my solution, with the following prebuilt test cases:

  • Security_Device_Policies – The AWS IoT Policy associated with the device should not allow the device to perform more actions than needed for the application. This test ensures that best practices are followed and that the device does not include wildcard policy statements.
  • TLS_Connect – The TLS Connect test case ensures that the device and its software meet the TLS 1.2 requirements and implement certificate-based mutual authentication. If the device cannot establish TLS mutual authentication with AWS IoT Core, the device will not be able to connect.
  • MQTT_Connect – Once the TLS connection has been established, the device must use the MQTT 3.1.1 application protocol with AWS IoT Core. Any errors in the device’s implementation of the MQTT client protocol can cause the device connection to fail. This test checks that the device can authenticate, establish a connection to the MQTT broker, and send a valid MQTT CONNECT packet.
  • MQTT_Publish – This test case ensures that the device correctly implements an MQTT PUBLISH packet and publishes on the correct topics for your IoT application. I configured three test cases to check that the device publishes to ‘ws/zone1/pump1/data’, ‘ws/zone1/pump2/data’ and ‘ws/zone1/pump3/data’ to ensure the IoT Rule is triggered on ‘ws/#’.
  • Shadow_Publish_Reported_State – This test case ensures that the device implements the Device Shadow protocol. The device should publish each water pump’s reported motor state to a Classic Shadow when it first connects to AWS IoT Core.
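To illustrate why the publish topics matter for the ‘ws/#’ rule, the following is a small sketch of MQTT topic filter matching. It is a simplified stand-in, not the broker's or Device Advisor's implementation, and it ignores special handling of topics that begin with ‘$’. The function name is hypothetical.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a topic filter.

    Supports the single-level wildcard '+' and the multi-level wildcard '#'.
    Simplified: no special handling of '$'-prefixed topics.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' matches all remaining levels, but only as the last filter level.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


for topic in (
    "ws/zone1/pump1/data",
    "ws/zone1/pump2/data",
    "enmonitor/zone1/pump2/data",
):
    print(topic, topic_matches("ws/#", topic))
```

A message published under a different first level, such as ‘enmonitor/...’, never matches ‘ws/#’, so a rule on that filter would not fire for it.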

Figure 3: Test suite creation from the AWS IoT Core Device Advisor console.

Once the test suite is set up, I configure Everyware Software Framework to connect to the Device Advisor endpoint.

Figure 4: The MQTT broker endpoint is changed to point to the Device Advisor endpoint under the ‘Broker-url’ configuration parameter in ESF.

I start the Device Advisor test suite from the AWS IoT Core Management Console and select PumpGateway-001 as the test device. When each test case moves from Pending to In Progress, I connect ESF to the Device Advisor endpoint.

Once the Test Suite has completed, results are displayed in the Device Advisor console.

Figure 5: AWS IoT Core Device Advisor console displays the test run results indicating multiple test failures.

According to the Device Advisor test report, the Policy Test failed due to a policy that contains a wildcard (‘*’). The policy should be changed to ensure that the device is only allowed to connect with its Client ID, publish on specified topics, and update its Device Shadow. I replaced the device’s policy with a more restrictive policy, and attached it to the device certificate. For more information on IoT Policy best practices, see the documentation.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iot:Connect"
      ],
      "Resource": [
        "arn:aws:iot:<region>:<accountID>:client/PumpGateway-001"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "iot:Publish"
      ],
      "Resource": [
        "arn:aws:iot:<region>:<accountID>:topic/ws/zone1/pump1/data",
        "arn:aws:iot:<region>:<accountID>:topic/ws/zone1/pump2/data",
        "arn:aws:iot:<region>:<accountID>:topic/ws/zone1/pump3/data",
        "arn:aws:iot:<region>:<accountID>:topic/$aws/things/PumpGateway-001/shadow/update"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "iot:UpdateThingShadow"
      ],
      "Resource": [
        "arn:aws:iot:<region>:<accountID>:thing/PumpGateway-001"
      ]
    }
  ]
}
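As a rough local pre-check before running the test suite, a policy document can be scanned for wildcards. The sketch below is only an approximation of the idea behind the policy test; it is not how Device Advisor evaluates policies, and the function name and the example account ID and region are hypothetical.

```python
def find_wildcards(policy: dict) -> list:
    """List (statement index, key, value) for every Allow entry containing '*'."""
    findings = []
    for index, statement in enumerate(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        for key in ("Action", "Resource"):
            values = statement.get(key, [])
            if isinstance(values, str):
                # IAM-style policies allow a single string instead of a list.
                values = [values]
            for value in values:
                if "*" in value:
                    findings.append((index, key, value))
    return findings


permissive = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "iot:*", "Resource": "*"}],
}
restrictive = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["iot:Connect"],
            # Hypothetical region and account ID.
            "Resource": ["arn:aws:iot:us-east-1:123456789012:client/PumpGateway-001"],
        }
    ],
}
print(find_wildcards(permissive))
print(find_wildcards(restrictive))
```

An empty result does not prove that a policy is least-privilege; it only shows that no Allow statement contains a literal wildcard.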

The TLS_Connect test passed because ESF has successfully implemented TLS 1.2 using the AWS IoT Device SDK for Java v2.

The MQTT_Connect test failed, and subsequent tests also failed because they depend on the device connecting to Device Advisor using the MQTT protocol.

The System Message for the MQTT_Connect test, ‘Test case time out’, indicates that the timeout was reached before the device sent an MQTT CONNECT packet. To investigate in more detail, I click on the ‘Test case log’ link to open Amazon CloudWatch Logs.

Figure 6: Amazon CloudWatch Logs displaying the TLS handshake

The CloudWatch log shows that the TLS handshake succeeded, but the device did not send a CONNECT packet to the Device Advisor endpoint. The device is able to establish a connection to the Device Advisor endpoint, but the server closes the connection before the device sends the MQTT CONNECT packet.

Next, I check the registered certificate on AWS IoT Core to ensure that it is valid and active.

Figure 7: The certificate on AWS IoT has been revoked.

The certificate has been revoked, which causes the MQTT connection to fail. Certificates can be revoked by administrators or from automated mitigation actions configured in AWS IoT Device Defender. A certificate may be revoked to prevent devices utilizing that certificate from connecting to AWS IoT Core due to a security threat or misbehavior. In order to ensure my device remains secure and can connect to AWS IoT Core, I generate a new certificate and rotate the certificate in ESF.
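The certificate check can also be scripted instead of done in the console. The sketch below assumes a boto3 `iot` client and uses its `describe_certificate` call; the helper names are hypothetical, and the certificate ID is something you would supply yourself.

```python
def certificate_status(iot_client, certificate_id: str) -> str:
    """Return the status string AWS IoT reports for a certificate.

    iot_client is expected to behave like boto3.client("iot").
    """
    response = iot_client.describe_certificate(certificateId=certificate_id)
    return response["certificateDescription"]["status"]


def can_connect(status: str) -> bool:
    """Only an ACTIVE certificate can be used to connect (REVOKED and INACTIVE cannot)."""
    return status == "ACTIVE"


# Example usage (requires AWS credentials; not executed here):
#   import boto3
#   status = certificate_status(boto3.client("iot"), "<certificateId>")
#   print(status, can_connect(status))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub in a unit test.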

Once the certificate has been rotated, I re-run the Device Advisor tests and view the results in the AWS IoT Core Device Advisor console.

Figure 8: Device Advisor test report shows test case failures.

The MQTT_Connect test now passes. The CloudWatch log for this test case shows two events after the TLS handshake, indicating that the device under test (DUT) sent a CONNECT packet and Device Advisor responded with a CONNACK:

Figure 9: CONNECT and CONNACK messages exchanged with the device and Device Advisor
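For readers curious about what that exchange looks like on the wire, here is a minimal sketch that builds an MQTT 3.1.1 CONNECT packet and parses a CONNACK. It only illustrates the packet layout; ESF relies on its MQTT client library for this, and the clean-session flag and keep-alive value are assumptions.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode the MQTT variable-length 'remaining length' field."""
    encoded = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80  # continuation bit: more length bytes follow
        encoded.append(digit)
        if length == 0:
            return bytes(encoded)


def build_connect_packet(client_id: str, keep_alive: int = 60) -> bytes:
    """Build a CONNECT packet with clean session set and no username, password, or will."""
    variable_header = (
        b"\x00\x04MQTT"  # length-prefixed protocol name
        + b"\x04"  # protocol level 4 means MQTT 3.1.1
        + b"\x02"  # connect flags: clean session only
        + keep_alive.to_bytes(2, "big")
    )
    client_id_bytes = client_id.encode("utf-8")
    payload = len(client_id_bytes).to_bytes(2, "big") + client_id_bytes
    body = variable_header + payload
    return b"\x10" + encode_remaining_length(len(body)) + body


def parse_connack(packet: bytes):
    """Return (session_present, return_code) from a 4-byte CONNACK packet."""
    if len(packet) != 4 or packet[0] != 0x20 or packet[1] != 0x02:
        raise ValueError("not a CONNACK packet")
    return bool(packet[2] & 0x01), packet[3]


connect = build_connect_packet("PumpGateway-001")
print(connect.hex())
print(parse_connack(b"\x20\x02\x00\x00"))  # return code 0 means connection accepted
```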

Two other test cases failed because of the configuration of my ESF application.

The MQTT Publish Test Pump 2 test case failed, and Device Advisor reported ‘Test case time out. Did not receive MQTT PUBLISH message with topic: ws/zone1/pump2/data.’ To debug this issue, I review the CloudWatch Logs for the test case.

Figure 10: Amazon CloudWatch Logs showing published messages to AWS IoT Core Device Advisor.

The recorded PUBLISH events show that the device published to an incorrect topic.

I changed the topic from ‘enmonitor/zone1/pump2/data’ to the correct topic ‘ws/zone1/pump2/data’ in the ‘Topic Id’ field of the ESF Publisher component for Pump 2.

Figure 11: Pump 2 Publisher component in Everyware Software Framework.

The Device Shadow test failed with the following error message: ‘Shadow document reported by device does not match expected reported state. Test case received the following shadow document from the device: {"state":{"reported":{"MOTOR_RUNNING":true}},"thingName":"PumpGateway-001"}’

The expected reported state is defined in the Test Suite creation script:

"REPORTED_STATE": {
                    "PUMP1_MOTOR_RUNNING": True
                }

In my application, I chose to use a single Classic Shadow for all water pumps to reduce the MQTT messaging costs of updating multiple small Named Shadow documents, one for each water pump. To keep each reported motor state distinguishable, the pump name must be included in the reported parameter. To change the name of the value, I edit the JavaScript Filter ESF Wire component that extracts the motor state and transforms the message before publishing to the AWS IoT Device Shadow service.
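The renaming can be sketched as follows. This is a Python stand-in for the idea, not the JavaScript Filter used in ESF, and the comparison helper is only an approximation of what the test case checks. Both function names are hypothetical.

```python
import json


def build_shadow_update(motor_states: dict) -> str:
    """Build a Classic Shadow update whose reported keys carry the pump name."""
    reported = {
        f"{pump.upper()}_MOTOR_RUNNING": bool(running)
        for pump, running in motor_states.items()
    }
    return json.dumps({"state": {"reported": reported}})


def matches_expected(shadow_json: str, expected_reported: dict) -> bool:
    """Check that every expected key is reported with the expected value."""
    reported = json.loads(shadow_json)["state"]["reported"]
    return all(reported.get(key) == value for key, value in expected_reported.items())


update = build_shadow_update({"pump1": True})
print(update)
print(matches_expected(update, {"PUMP1_MOTOR_RUNNING": True}))
```

With one shadow document for all pumps, the prefix is what distinguishes one pump's motor state from another's.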

Figure 12: AWS IoT Device Shadow implementation as a JavaScript Filter in Everyware Software Framework

After reconfiguring my ESF application, I run the Device Advisor test suite one more time to ensure all failed test cases are now passing.

Figure 13: All test cases are now passing in AWS IoT Core Device Advisor

Now that all the test cases in the AWS IoT Core Device Advisor test suite are passing, I change the endpoint in ESF to the AWS IoT Core ATS endpoint and connect the device. Data is now arriving in the Grafana dashboard and the application is running as expected.

Figure 14: Data is now arriving on the Grafana dashboard.

About Eurotech

Eurotech is among the first AWS Partner Network partners to use Device Advisor to qualify their IoT devices for the AWS IoT Core Qualification Program. Using Device Advisor, Eurotech is able to test their software framework, Everyware Software Framework, to ensure their customers build reliable IoT device software. Everyware Software Framework passed all test cases in the Qualification Test Suite and is qualified for AWS IoT Core. Devices running ESF, such as the BoltGATE 20-31, are listed in the AWS Partner Device Catalog.

Conclusion

In this blog post, we examined a real-world use case where several common user errors, such as using a revoked certificate and publishing on incorrect topics, were spotted using AWS IoT Core Device Advisor. I was able to quickly test, debug, and resolve device software issues using Device Advisor, enabling me to spend more time on gaining insights from the data that my device produces. You can use Device Advisor to test your device’s connectivity, security, and use-case-specific requirements through the management console, or from a Continuous Integration/Continuous Deployment pipeline using the AWS CLI or AWS SDKs.
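As a rough sketch of the pipeline option, the helper below polls a suite run until it reaches a terminal status. It assumes a boto3 `iotdeviceadvisor` client and its `get_suite_run` call; the helper name, the polling defaults, and the set of terminal statuses (as I understand the API) are assumptions to verify against the SDK documentation.

```python
import time

# Statuses after which a suite run is assumed to make no further progress.
TERMINAL_STATUSES = {"PASS", "PASS_WITH_WARNINGS", "FAIL", "ERROR", "CANCELED", "STOPPED"}


def wait_for_suite_run(client, suite_definition_id, suite_run_id,
                       poll_seconds=10.0, max_polls=180):
    """Poll a Device Advisor suite run and return its final status.

    client is expected to behave like boto3.client("iotdeviceadvisor").
    Raises TimeoutError if no terminal status is seen within max_polls.
    """
    for _ in range(max_polls):
        run = client.get_suite_run(
            suiteDefinitionId=suite_definition_id,
            suiteRunId=suite_run_id,
        )
        status = run["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("suite run did not finish within the polling budget")


# Example usage in a pipeline step (requires AWS credentials; not executed here):
#   import boto3, sys
#   client = boto3.client("iotdeviceadvisor")
#   status = wait_for_suite_run(client, "<suiteDefinitionId>", "<suiteRunId>")
#   sys.exit(0 if status == "PASS" else 1)
```

Exiting non-zero on anything other than PASS lets the pipeline block a release when a device test regresses.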