The Internet of Things on AWS – Official Blog

Integrate open source InfluxDB and Grafana with AWS IoT to visualize time series data

Across numerous types of implementations, a large portion of IoT applications collect large volumes of telemetry data. From industrial use cases to healthcare, and from consumer goods to logistics, IoT telemetry data points are highly time-dependent. In most IoT solutions, when the data is collected and reported matters for several reasons. For instance, in attribution analyses for anomaly detection or predictive maintenance scenarios, the sequence of events that have caused or are about to cause failures needs to be accurately stored, precisely documented, and understood.

Time-variant systems are important not only at an individual IoT device level but also at IoT application levels. For instance, on a factory floor the speed of a conveyor belt, considered together with its power drive and the weight of the parts it carries at any given moment, may provide better indicators of belt failure than data captured from the belt drive alone. Additionally, the sequence of events and data preceding a particular failure is usually best understood when charted against time.

In time-variant IoT applications it is imperative that the time drift between devices and/or between sensors and gateway software (such as AWS IoT Greengrass) be explainable and managed. An effective way to manage the time drift across IoT application elements is to attach a timestamp to each telemetry data payload at ingestion into AWS IoT Core. A key point to remember is that AWS IoT Core does not guarantee the order of ingested data. For this reason, even if you add a timestamp at ingestion, it is best practice for IoT devices/sensors to include a monotonically increasing sequence number in each payload (or a device-side timestamp, if possible).
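The ordering point above can be illustrated with a minimal sketch. It assumes a hypothetical payload shape in which each device attaches its own counter (`seq`) and the ingestion side attaches a timestamp (`ingestTs`); neither field name comes from AWS IoT Core, and the sample values are made up.

```javascript
// Hypothetical payload shape: { deviceid, seq, ingestTs, pressure }.
// `seq` is a per-device counter set on the device (assumed name).
// `ingestTs` is a timestamp attached at ingestion (assumed name).

// Group messages by device and restore per-device order using `seq`.
function orderByDeviceSequence(messages) {
  const byDevice = new Map();
  for (const msg of messages) {
    if (!byDevice.has(msg.deviceid)) {
      byDevice.set(msg.deviceid, []);
    }
    byDevice.get(msg.deviceid).push(msg);
  }
  for (const list of byDevice.values()) {
    list.sort((a, b) => a.seq - b.seq);
  }
  return byDevice;
}

// Report sequence numbers missing between consecutive ordered messages.
function findGaps(orderedMessages) {
  const gaps = [];
  for (let i = 1; i < orderedMessages.length; i++) {
    const previous = orderedMessages[i - 1].seq;
    const current = orderedMessages[i].seq;
    for (let s = previous + 1; s < current; s++) {
      gaps.push(s);
    }
  }
  return gaps;
}

// Messages in arrival order, which differs from the order they were sent in.
const received = [
  { deviceid: 'dev-a', seq: 3, ingestTs: 1000, pressure: 910 },
  { deviceid: 'dev-a', seq: 1, ingestTs: 1001, pressure: 905 },
  { deviceid: 'dev-b', seq: 1, ingestTs: 1002, pressure: 700 },
  { deviceid: 'dev-a', seq: 4, ingestTs: 1003, pressure: 915 },
];

const ordered = orderByDeviceSequence(received);
console.log(ordered.get('dev-a').map((m) => m.seq)); // [ 1, 3, 4 ]
console.log(findGaps(ordered.get('dev-a')));         // [ 2 ]
```

Sorting by the device-side sequence number rather than by the ingestion timestamp restores the order in which each device produced its readings, and the gap check makes lost payloads visible.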

This blog post details how to develop a time-variant IoT solution using basic AWS IoT components and a time series optimized InfluxDB instance to store telemetry data. It also sets up a time series visualization tool called Grafana. Both InfluxDB and Grafana are open source.

An AWS IoT device simulator generates high-frequency time series data. AWS IoT Core ingests the data, and the rules engine triggers a Lambda function that inserts the data into a database specialized for time series. In our case, we use an EC2 instance on which we installed the AWS CLI and InfluxDB. On the same EC2 instance we installed Grafana and developed time series visualizations and dashboards. A further use case is to export the InfluxDB data as time series data sets (CSV files), store them in S3, and use them to train and deploy an anomaly detection ML model with Amazon SageMaker.

The architecture diagram above consists of simulated edge IoT devices that publish JSON data to AWS IoT Core. An AWS Lambda function is triggered and inserts the IoT data into an InfluxDB database instance, which is used to develop dashboards with Grafana. The diagram also shows an optional path that uses AWS Step Functions to aggregate the data at particular time intervals (such as an average over 5-minute intervals) and to insert the aggregated data into InfluxDB. The InfluxDB data can also be exported as CSV files into Amazon S3 buckets and serve as the data source for anomaly detection or other ML modeling with Amazon SageMaker.
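As an illustration of the optional aggregation path, the sketch below averages readings over fixed 5-minute windows. It is a minimal example under stated assumptions, not the aggregation logic used in the architecture: the function name `averageByWindow` and the point shape `{ time, value }` (time in epoch milliseconds) are hypothetical.

```javascript
const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Average `value` per fixed time window.
// Each point is { time: <epoch milliseconds>, value: <number> } (assumed shape).
function averageByWindow(points, windowMs = FIVE_MINUTES_MS) {
  const buckets = new Map();
  for (const { time, value } of points) {
    // Align each point to the start of the window it falls into.
    const windowStart = Math.floor(time / windowMs) * windowMs;
    const bucket = buckets.get(windowStart) || { sum: 0, count: 0 };
    bucket.sum += value;
    bucket.count += 1;
    buckets.set(windowStart, bucket);
  }
  return [...buckets.entries()]
    .sort(([a], [b]) => a - b)
    .map(([windowStart, { sum, count }]) => ({
      windowStart,
      average: sum / count,
    }));
}

// Made-up readings: two in the first window, one in the second.
const sample = [
  { time: 0, value: 1000 },
  { time: 60000, value: 1100 },
  { time: 300000, value: 900 },
];

console.log(averageByWindow(sample));
// [ { windowStart: 0, average: 1050 }, { windowStart: 300000, average: 900 } ]
```

InfluxQL can compute comparable aggregates inside the database with a `GROUP BY time(5m)` clause, which may be preferable once the raw data is already stored in InfluxDB.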

This blog post details four steps to publish IoT data to a time series database and develop real-time dashboards.

  1. Set up the AWS IoT Device Simulator;
  2. Set up InfluxDB and Grafana on your own Amazon EC2 instances;
  3. Set up AWS IoT Core resources and AWS Lambda function needed to populate the time series database;
  4. Develop Grafana dashboard(s) with real-time visualizations to track IoT data.

Step 1. Set up the AWS IoT Device Simulator

The first step is to deploy the AWS IoT Device Simulator on the target AWS account. The AWS CloudFormation template deployment guide can be found at this link. It may take up to 15 minutes for all required resources to be provisioned. After we deploy the AWS IoT Device Simulator, we create a new device type called PressureDevice. The pressure device then shows up as a device type in the AWS IoT Device Simulator.

For this device type we add a few message attributes (pressure, viscosity, sensordatetime, deviceid, and clientid) and set the data frequency.

The configuration settings for each attribute are listed below. Please note that each device and attribute receives an internal ID separate from any other IDs we create.

{
  "name": "pressure",
  "_id_": "wzCHpAvdm",
  "min": 500,
  "type": "int",
  "max": 1500
}

{
  "name": "viscosity",
  "_id_": "NJJXwHTdW",
  "min": 25,
  "type": "int",
  "max": 100
}

{
  "name": "sensordatetime",
  "_id_": "QyKD1oCtd",
  "tsformat": "default",
  "type": "timestamp"
}

{
  "name": "deviceid",
  "_id_": "W4uk2jVHX",
  "static": false,
  "type": "shortid"
}

{
  "name": "clientid",
  "_id_": "nXHjO4oTL",
  "static": true,
  "type": "uuid"
}
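To make the attribute settings concrete, the following sketch generates one sample message from simplified copies of the definitions above. It is only an approximation for illustration and does not reproduce the simulator's own generator: the function names are hypothetical, the ISO 8601 timestamp format is an assumption about what "default" produces, and the short ID is a random hex stand-in.

```javascript
const crypto = require('crypto');

// Simplified copies of the attribute definitions listed above.
const attributes = [
  { name: 'pressure', type: 'int', min: 500, max: 1500 },
  { name: 'viscosity', type: 'int', min: 25, max: 100 },
  { name: 'sensordatetime', type: 'timestamp' },
  { name: 'deviceid', type: 'shortid' },
  { name: 'clientid', type: 'uuid' },
];

// Produce one value for a single attribute definition.
function generateValue(attr) {
  switch (attr.type) {
    case 'int':
      // Random integer between min and max, both inclusive.
      return Math.floor(Math.random() * (attr.max - attr.min + 1)) + attr.min;
    case 'timestamp':
      // Assumption: ISO 8601 text; the simulator's default format may differ.
      return new Date().toISOString();
    case 'shortid':
      // Stand-in for a short ID: nine random hex characters.
      return crypto.randomBytes(5).toString('hex').slice(0, 9);
    case 'uuid':
      // crypto.randomUUID() requires Node.js 14.17 or later.
      return crypto.randomUUID();
    default:
      throw new Error(`Unsupported attribute type: ${attr.type}`);
  }
}

// Build one message containing a value for every attribute.
function generatePayload(attrs) {
  const payload = {};
  for (const attr of attrs) {
    payload[attr.name] = generateValue(attr);
  }
  return payload;
}

console.log(JSON.stringify(generatePayload(attributes), null, 2));
```

A message shaped like this is what the rest of the walkthrough expects to arrive on the MQTT topic.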

In the Widgets section of the simulator's browser interface, create 20 new instances of the PressureDevice device type. Each device starts publishing data to AWS IoT Core (on the specified MQTT topic, pressure/data) as soon as it is started.

Let’s navigate to the AWS console, go to AWS IoT Core, and inspect the data we are receiving from our simulated devices: open the Test menu, then subscribe to the same topic we specified (pressure/data).

We can see how the data arrives from the device simulator in AWS IoT Core.

Step 2. Set up InfluxDB and Grafana on an Amazon EC2 instance

The next step is to set up an Amazon EC2 instance in a private subnet of your VPC. For a step-by-step tutorial, please see this link. You may also elect to install InfluxDB and Grafana on a stand-alone Amazon EC2 instance, i.e. not part of a VPC. When you set up your VPC or Amazon EC2 instance, add port 3000 (used by the Grafana web interface) to your security group for inbound access. For the purposes of this post, select a t2.micro instance type and an Ubuntu distribution. SSH into the Amazon EC2 instance you created and install InfluxDB and Grafana.

To install InfluxDB run the following commands.

wget https://dl.influxdata.com/influxdb/releases/influxdb_1.7.7_amd64.deb
 
sudo dpkg -i influxdb_1.7.7_amd64.deb

After the installation completes, start the InfluxDB engine.

sudo service influxdb start

You can validate the InfluxDB engine is running correctly by interacting with InfluxDB CLI.

influx

To exit the InfluxDB CLI, just type quit.

quit

The following steps install Grafana on the same EC2 instance. In production environments you may need to install it on a separate EC2 instance on the same subnet.

wget https://dl.grafana.com/oss/release/grafana_6.2.5_amd64.deb 
sudo apt-get update
sudo apt-get install libfontconfig1
sudo apt --fix-broken install

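sudo dpkg -i grafana_6.2.5_amd64.deb

# Start the Grafana server (the package install does not start it automatically)
sudo service grafana-server start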

Once both InfluxDB and Grafana are set up, let’s create a database and a database user. Create a new database and a new user using the following syntax, and use the quit command to exit the InfluxDB CLI.

influx

create database awsblog
 
create user awsblog with password 'YourPassword'

quit 

To complete our custom installs, let’s add Telegraf, a plugin-driven server agent for collecting and reporting metrics.

Note: you may first have to add the InfluxData repository to your package sources; further instructions can be found here.

sudo apt install telegraf -y

Let’s start it and enable it.

sudo systemctl start telegraf
 
sudo systemctl enable telegraf

Validate that Telegraf is running:

sudo systemctl status telegraf

Let’s edit and save the basic configuration in the file /etc/telegraf/telegraf.conf. Look for the following section in the file, and add the database, username, and password settings after the [[outputs.influxdb]] line.

Note: you may have to use sudo to save the file, depending on your OS and user setup. With nano, CTRL+O saves the file to disk and CTRL+X exits the editor.

sudo nano /etc/telegraf/telegraf.conf

###############################################################################
# OUTPUT PLUGINS #
###############################################################################
# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]

database = "awsblog"
username = "awsblog"
password = "YourPassword"

The final step for our Amazon EC2 instance is to set up Grafana (the graphics engine) to use InfluxDB as a data source. To do so, access the Grafana UI via the following URI.

http://<EC2-PublicDNS>:3000/

Change the default username and password from admin/admin and set up the data source. Ensure that the URL that points to the InfluxDB instance ends with the port number of the database instance (:8086).

We are now ready to test a few record inserts into our InfluxDB database using the InfluxDB CLI on the EC2 instance.

influx

use awsblog
 
INSERT pressure,sensor=client001sensor01 value=1001,viscosity=34
INSERT pressure,sensor=client002sensor01 value=2101,viscosity=37
INSERT pressure,sensor=client003sensor01 value=901,viscosity=38
INSERT pressure,sensor=client004sensor01 value=1201,viscosity=39
INSERT pressure,sensor=client005sensor01 value=1101,viscosity=60

quit
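Each INSERT statement above follows the InfluxDB line protocol: a measurement name, comma-separated tags, a space, and then comma-separated fields. The sketch below builds such a line from plain objects. The helper names are hypothetical, and the sketch handles numeric fields only and does not escape the measurement name or append a timestamp.

```javascript
// Escape commas, equals signs, and spaces in tag keys, tag values, and field keys.
function escapeKeyOrTag(text) {
  return String(text).replace(/([,= ])/g, '\\$1');
}

// Build one line: <measurement>,<tag>=<value>... <field>=<value>,...
// Assumption: all field values are numbers (string fields would need quoting).
function toLineProtocol(measurement, tags, fields) {
  const tagPart = Object.entries(tags)
    .map(([key, value]) => `,${escapeKeyOrTag(key)}=${escapeKeyOrTag(value)}`)
    .join('');
  const fieldPart = Object.entries(fields)
    .map(([key, value]) => `${escapeKeyOrTag(key)}=${value}`)
    .join(',');
  return `${measurement}${tagPart} ${fieldPart}`;
}

console.log(
  toLineProtocol(
    'pressure',
    { sensor: 'client001sensor01' },
    { value: 1001, viscosity: 34 }
  )
);
// pressure,sensor=client001sensor01 value=1001,viscosity=34
```

Tags are indexed and suit identifiers such as the sensor name, while fields hold the measured values.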

The next few steps enable the solution to take the sensor data from our AWS IoT rule and insert it into InfluxDB using a Lambda function.

Step 3. Set up a Lambda function and AWS IoT Core resources

Setting up a Lambda function

Let’s create a new Lambda function in Node.js 10.x, and paste the code below. We call it blogLambda2InfluxDB.

Code for index.js

const Influx = require('influx');

// This code writes data from an AWS IoT Core rule into InfluxDB via Lambda.

// Create the client once, outside the handler, so warm invocations reuse it.
const client = new Influx.InfluxDB({
    host: process.env.INFLUXDBHOST,
    port: Number(process.env.INFLUXDBPORT),
    database: process.env.INFLUXDB,
    username: process.env.INFLUXDBUSRNAME,
    password: process.env.INFLUXDBPWD,
    schema: [{
        measurement: 'pressure',

        fields: {
            pressureValue: Influx.FieldType.FLOAT,
            viscosity: Influx.FieldType.FLOAT
        },

        tags: ['sensorID']
    }]
});

exports.handler = async (event) => {

    // The rule delivers the selected attributes as properties of the event.
    const pressureInputValue = Number(event.pressure);
    const viscosityInputValue = Number(event.viscosity);

    // Build the sensor ID from the device ID and the client ID.
    const sensorInputName = String(event.deviceid) + String(event.clientid);

    // const sensordatetime = event.sensordatetime;

    // Wait for the write to finish before the Lambda invocation ends.
    await writeToInfluxDB(pressureInputValue, viscosityInputValue, sensorInputName);

    return 'OK';
};

// Returns a promise that resolves once the point has been written.
function writeToInfluxDB(pressureVar, viscosityVar, sensorVar) {
    console.log("Executing InfluxDB insert");

    return client.writePoints([{
        measurement: 'pressure',
        fields: { pressureValue: pressureVar, viscosity: viscosityVar },
        tags: { sensorID: sensorVar }
    }]).then(() => {
        console.log("Finished executing");
    });
}

Your Lambda console should look similar to the image below.

Then, let’s add a few environment variables as below (substitute the relevant values with your own). The function code reads INFLUXDB, INFLUXDBUSRNAME, INFLUXDBPWD, INFLUXDBPORT, and INFLUXDBHOST.

Notice how the Lambda function uses the environment variables we set up to perform the InfluxDB inserts. Select the execution role, and choose the same VPC, subnets, and security group as your Amazon EC2 instance (the InfluxDB server).

Let’s create an execution role for Lambda by following the steps found here (look under the heading Execution Role and User Permissions).

For testing purposes you can set up your local environment to have access to influx utilities.

  • Create the Lambda package on your laptop (where the AWS CLI is installed) or on an EC2 instance with the following commands.
npm init
npm install --save influx
  • Use the directions from this link: paste the AWS Lambda function code (see Code for index.js in Step 3 above) into the index.js file, zip it up together with the node_modules folder, and upload it in your AWS Lambda console.
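Before uploading the package, the mapping from the incoming event to an InfluxDB point can be checked locally without a database. The sketch below uses a hypothetical helper, `eventToPoint`, that mirrors the point structure passed to writePoints() in index.js; the sample event values are made up, and the plain concatenation of deviceid and clientid is an assumption about the intended sensor ID.

```javascript
// Hypothetical helper: converts the event delivered by the AWS IoT rule into
// a point object with the same structure that index.js passes to writePoints().
function eventToPoint(event) {
  const pressureValue = Number(event.pressure);
  const viscosity = Number(event.viscosity);
  if (!Number.isFinite(pressureValue) || !Number.isFinite(viscosity)) {
    throw new Error('pressure and viscosity must be numeric');
  }
  return {
    measurement: 'pressure',
    fields: { pressureValue, viscosity },
    // Assumption: the sensor ID is the device ID followed by the client ID.
    tags: { sensorID: `${event.deviceid}${event.clientid}` },
  };
}

// Sample event with made-up values, shaped like the simulator attributes.
const sampleEvent = {
  pressure: 1012,
  viscosity: 48,
  sensordatetime: '2019-07-01T12:00:00Z',
  deviceid: 'dev01',
  clientid: 'client01',
};

console.log(eventToPoint(sampleEvent));
```

Keeping this mapping in a pure function makes it possible to test payload handling on a laptop and to reserve the database connection for the deployed function.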

Setting up AWS IoT Core resources

Navigate to the AWS IoT console and then to the Act menu. Let’s create a new rule called awsblog IoT Rule, with the rule query statement below and an action that invokes an AWS Lambda function.

SELECT pressure AS pressure, viscosity as viscosity, sensordatetime as sensordatetime, deviceid as deviceid, clientid as clientid FROM 'pressure/data'

Then select the AWS Lambda function we created above, blogLambda2InfluxDB.

Step 4. Develop real-time Grafana dashboards

After the time series data from our simulated devices starts to stream into the InfluxDB instance, we can start developing visualizations using Grafana. To access the Grafana web interface, open a browser and navigate to http://ec2-xxx-xxx-xxx-xxx.compute-1.amazonaws.com:3000/login, where xxx-xxx-xxx-xxx is the Elastic IP address of the Amazon EC2 instance that hosts Grafana. Log in with the Grafana username and password you set during Step 2 above.

Then click on the plus sign to create a dashboard, and name it Pressure and Viscosity.

Then add a new panel with the settings below. Rename it Timeseries and Moving Average, and select the appropriate measurement and aggregations. Your screen should look as below.
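For readers unfamiliar with the moving average shown in this panel, the sketch below computes a simple moving average over a series of readings. It is only an illustration of the calculation with made-up values; in the dashboard itself the aggregation is performed by the query, for example with the InfluxQL MOVING_AVERAGE() function.

```javascript
// Simple moving average: each output value is the mean of the last
// `windowSize` input values. The first output appears once a full window exists.
function movingAverage(values, windowSize) {
  const result = [];
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
    if (i >= windowSize) {
      // Drop the value that just left the window.
      sum -= values[i - windowSize];
    }
    if (i >= windowSize - 1) {
      result.push(sum / windowSize);
    }
  }
  return result;
}

// Made-up pressure readings.
const readings = [1000, 1100, 900, 1300, 800];
console.log(movingAverage(readings, 3)); // [ 1000, 1100, 1000 ]
```

The moving average smooths out short spikes, which makes slower trends in pressure or viscosity easier to spot next to the raw series.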

You can add more visualization panels to create a dashboard like the one shown below, but yours may look different based on the panels and visualization types you decide to add.

Conclusion

Together, AWS IoT Core, InfluxDB, and Grafana provide an effective architecture to generate, collect, and persist IoT data and to develop rich real-time dashboards from it. Even though we used simulated devices to generate data in this blog post, you should be able to replicate it fairly quickly with real IoT devices, such as vibration, pressure, and temperature sensors.

In this blog we have demonstrated how you can use AWS IoT Core to ingest data in real-time and how you can visualize this data using an open source database engine (InfluxDB) and an open source visualizations platform (Grafana). In real life IoT applications you will most likely encounter time series data. Visualizing time series data is one of the basic requirements of both realizing the value of IoT and gaining an understanding of the data generated by IoT sensors.

This blog post shows how several basic components and native AWS features allow you to simplify both IoT application development and the delivery of time series data for highly customizable visualizations. From AWS IoT Core to the rules engine, and from AWS Lambda to InfluxDB with Grafana, you can have the solution set up and deployed with minimal lines of code. In addition, once the data is streamed into AWS IoT Core, the ease with which you can develop and customize your own dashboards will enable your teams to iterate quickly toward the desired level of data-driven insight.

About the authors

Syed Rehan is a Sr. Specialist Solutions Architect at Amazon Web Services. He is based in London and supports customers as an expert IoT Solutions Architect. Syed has in-depth knowledge of IoT and works in this role with customers ranging from startups to enterprises to enable them to build IoT solutions with the AWS ecosystem.

Catalin Vieru is a Sr. AWS Architect specialized in IoT. He is based in Northern California and guides IoT applications architecture and development for numerous enterprises, from start-ups to established enterprises across the Western United States and Canada.