AWS Mobile Blog

Analyze device-generated data with AWS IoT and Amazon Elasticsearch Service

Sending device-generated data from AWS IoT to Amazon Elasticsearch Service enables several analytics and monitoring use cases such as performing full-text search for device error codes and visualizing device metrics in near real-time with Kibana. This blog post walks you through an end-to-end process of sending data to AWS IoT, indexing it in Amazon Elasticsearch, and visualizing it in Kibana.

We will begin by configuring an Amazon Elasticsearch domain to store and index device-generated data. We will then configure an AWS IoT rule to route inbound device-generated data to the Elasticsearch domain. We will use an AWS Lambda function to simulate an Internet-connected delivery truck sending location and performance metrics to AWS IoT. Lastly, we will use Kibana to visualize the data in near real-time. You need introductory knowledge of AWS IoT and Elasticsearch to get the most out of this blog post. You can find an introduction to AWS IoT here, and an introduction to Amazon Elasticsearch here.


Configure an Amazon Elasticsearch domain

First, you need to create and configure an Amazon Elasticsearch domain to store and index the delivery truck data. Go to the Amazon Elasticsearch console and create a new domain named “delivery-fleet”.

In the domain creation wizard, you will be asked to set the domain access policy. You need to specify an access policy that: (1) allows AWS IoT to put data into the Elasticsearch domain; (2) allows intended clients (such as your desktop) to query data from the Elasticsearch domain. The access policy you choose would look similar to the example below:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::xxxxxxxxxxxx:role/iot-es-action-role"
      },
      "Action": "es:ESHttpPut",
      "Resource": "arn:aws:es:us-east-1:xxxxxxxxxxxx:domain/delivery-fleet/*"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "*"
      },
      "Action": "es:*",
      "Resource": "arn:aws:es:us-east-1:xxxxxxxxxxxx:domain/delivery-fleet/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": ["xxx.xxx.xxx.xxx", "xxx.xxx.xxx.xxx"]
        }
      }
    }
  ]
}

Specify the public IP addresses or address ranges of your intended clients (such as your desktop) in the "aws:SourceIp" list. One way to look up a client's public IP address is to navigate to "https://www.google.com/#q=what+is+my+public+ip+address" on that client.


Create the Elasticsearch domain. It will take a few minutes for the “Domain status” to change to “Active”. When the domain is active, note the endpoint generated for the newly created domain. The endpoint will look similar to “search-delivery-fleet-xxxxxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.es.amazonaws.com”.

Next, create an “index” in the Elasticsearch domain via an HTTP request, using an HTTP client such as curl or DHC. An “index” is the top-level logical structure in Elasticsearch for storing and indexing your data. For this walkthrough, create an index named “trucks” to store the delivery truck data. While creating the index, specify a “mapping” in the HTTP request body to help Elasticsearch correctly interpret geo-location and time in the sample data. Your HTTP PUT request would look similar to the example below, and you should get an HTTP 200 in response:

curl -i -X PUT \
  -H 'Content-Type: application/json' \
  -d '{
  "mappings": {
    "truck": {
      "properties": {
        "timestamp": {
          "type": "long",
          "copy_to": "datetime"
        },
        "datetime": {
          "type": "date",
          "store": true
        },
        "location": {
          "type": "geo_point"
        }
      }
    }
  }
}' \
  'https://search-delivery-fleet-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.es.amazonaws.com/trucks'


Configure an AWS IoT rule to route data to Elasticsearch

Now that your Elasticsearch index is ready to receive data, configure an AWS IoT rule that can route the inbound data from connected trucks to the Elasticsearch index. You can use the AWS IoT console or the AWS CLI to create a rule. Your AWS IoT rule would look similar to the example below:

{
    "sql": "SELECT *, timestamp() AS timestamp FROM 'trucks/#'",
    "actions": [
        {
            "elasticsearch": {
                "roleArn": "arn:aws:iam::xxxxxxxxxxxx:role/iot-es-action-role",
                "endpoint": "search-delivery-fleet-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.es.amazonaws.com",
                "index": "trucks",
                "type": "truck",
                "id": "${newuuid()}"
            }
        }
    ]
}


The example rule forwards any data received on the MQTT topic tree ‘trucks/#’ to the Elasticsearch index ‘trucks’. The AWS IoT service needs to assume an AWS IAM role in your AWS account to obtain the privileges required to insert data into your Elasticsearch domain. In particular, the IAM role must be allowed to call the “es:ESHttpPut” action.
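For reference, the permissions policy attached to that role (named “iot-es-action-role” in the rule example above; the account ID is a placeholder) might look similar to the following sketch. The role also needs a trust relationship that allows the “iot.amazonaws.com” service principal to assume it.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "es:ESHttpPut",
      "Resource": "arn:aws:es:us-east-1:xxxxxxxxxxxx:domain/delivery-fleet/*"
    }
  ]
}
```

If you prefer the AWS CLI over the console, a rule payload saved to a local file can be registered with a command such as “aws iot create-topic-rule --rule-name deliveryFleet --topic-rule-payload file://rule.json”.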


Simulate an Internet-connected truck sending data to AWS IoT

You are now ready to send data to AWS IoT and have AWS IoT index the data into Amazon Elasticsearch. Create an AWS Lambda function using this sample code to simulate an Internet-connected truck sending location and performance metrics to AWS IoT. The code is designed to send messages such as the one below to AWS IoT at regular intervals:

    {
        "nms": 1412638168724,
        "location": "39.09972,-94.57853",
        "geoJSON": {
            "type": "Point",
            "coordinates": [
                -94.57853,
                39.09972
            ]
        },
        "pressure": 111,
        "engine_temperature": 213,
        "cargo_temperature": 41,
        "rpm": 2216,
        "speed": 18,
        "battery": 12.3
    }
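The linked sample code implements this simulator. As a rough illustration of the same idea, here is a minimal Python sketch using boto3; the topic name ‘trucks/truck1’, the helper names, and the randomized metric values are assumptions for illustration, not the sample code itself. Note that GeoJSON orders coordinates as [longitude, latitude].

```python
import json
import random
import time


def build_message(lat, lon):
    """Build one truck telemetry message in the shape shown above.

    Metric values are randomized purely for illustration.
    """
    return {
        "nms": int(time.time() * 1000),         # epoch milliseconds
        "location": "{},{}".format(lat, lon),   # "lat,lon" display string
        "geoJSON": {"type": "Point", "coordinates": [lon, lat]},
        "pressure": random.randint(100, 120),
        "engine_temperature": random.randint(180, 230),
        "cargo_temperature": random.randint(35, 45),
        "rpm": random.randint(1500, 3000),
        "speed": random.randint(0, 75),
        "battery": round(random.uniform(11.5, 12.9), 1),
    }


def run_simulation(endpoint, count=20, interval=5):
    """Publish `count` messages to the hypothetical topic 'trucks/truck1'.

    `endpoint` is the account-specific AWS IoT endpoint returned by
    `aws iot describe-endpoint`.
    """
    import boto3  # imported here so build_message() works without the SDK

    client = boto3.client("iot-data", endpoint_url="https://" + endpoint)
    for _ in range(count):
        client.publish(
            topic="trucks/truck1",
            qos=0,
            payload=json.dumps(build_message(39.09972, -94.57853)),
        )
        time.sleep(interval)
```

Calling run_simulation with your account-specific endpoint publishes 20 messages at 5-second intervals, which is why the Lambda timeout needs to be raised above its default.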


AWS IoT generates an account-specific endpoint for your devices to send data to and receive data from your AWS account. You can look up this endpoint with the AWS CLI command “aws iot describe-endpoint”. When you create the Lambda function, update the code with your account-specific AWS IoT endpoint.

You also need to ensure that the Lambda execution role has permission to invoke the “iot:Publish” action. You can edit the role’s policy in the AWS IAM console to grant the required permission.
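For example, a statement similar to the following, added to the execution role’s policy, would grant that permission; the topic ARN pattern shown is an assumption matching the ‘trucks/#’ topic tree, and the account ID is a placeholder:

```json
{
  "Effect": "Allow",
  "Action": "iot:Publish",
  "Resource": "arn:aws:iot:us-east-1:xxxxxxxxxxxx:topic/trucks/*"
}
```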

Lastly, increase the execution timeout for the Lambda function from the default 3 seconds to 5 minutes; this allows the function to pause between consecutive messages. The timeout can be configured under the function’s advanced settings. (Later, when you trigger the function via the console, you will get the warning “We are unable to display results and logs for invocations that take longer than 60 seconds. You can view the results and logs for the function in CloudWatch once the function completes executing”, which is expected.)

You are just one click away from running the simulation. Before triggering it, optionally configure one or both of the following data-flow debugging tools. First, use an MQTT client to confirm the delivery of truck data in and out of AWS IoT; a browser-based MQTT client is available at the top right of the AWS IoT console. Connect and subscribe to the topic ‘trucks/#’ to view the messages the truck simulator Lambda function publishes to the AWS IoT message broker. Second, configure AWS IoT to send logs to Amazon CloudWatch Logs; viewing the logs in CloudWatch is useful if you need to debug issues such as authentication failures or rule execution failures.

Click the ‘Test’ button in the Lambda console to run the truck simulator Lambda function. You should see messages arriving at all destinations, including the browser-based MQTT client (if you are using it) and the Elasticsearch domain.


Explore and visualize the data in Kibana

You are now ready to explore the connected truck data. First, confirm that your Elasticsearch domain has received and indexed the truck data by querying Elasticsearch via an HTTP GET request, using an HTTP client such as curl or DHC. Your HTTP GET request would look similar to the example below, and you should get an HTTP 200 with a non-zero number of ‘hits’ in the response body:

curl -i -X GET \
  'https://search-delivery-fleet-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.es.amazonaws.com/trucks/_search'


The next step is to start using Kibana. Go to the Amazon Elasticsearch console and look up the Kibana endpoint for your Elasticsearch domain. It will look similar to “search-delivery-fleet-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.us-east-1.es.amazonaws.com/_plugin/kibana/”. Click through or load it in a browser.

On the “Configure an index pattern” page that appears first, enter “trucks” as the “Index name or pattern” and select “datetime” under “Time-field name”. Kibana will then list the fields found under the “trucks” index.


Go to the “Discover” tab to start exploring the data. By default, Kibana sets the timeframe to “Last 15 minutes”. You may need to widen it to “Last 1 hour” or another timeframe that includes the time when the truck simulator Lambda function published the data. Kibana will then load and display the indexed documents.


Next, use the ‘Visualize’ tab for more interesting visualizations, such as a line chart of the average battery voltage over time.


Viewing data in near real-time

The most important aspect of AWS IoT and Amazon Elasticsearch integration is the ability to analyze device-generated data in near real-time. Now that you have walked through the end-to-end configuration, let’s look at the data in near real-time. Configure Kibana to refresh every 5 seconds and rerun the truck simulator Lambda function. You will see the battery voltage graph updating in near real-time.


You just completed a walkthrough of sending data to AWS IoT, indexing it in Elasticsearch, visualizing it in Kibana, and doing it all in near real-time! We hope that the newly available AWS IoT and Amazon Elasticsearch integration enables several IoT analytics and monitoring use cases for you. Try it out and let us know your feedback.