The Internet of Things on AWS – Official Blog

Containerize your IoT application with AWS IoT Analytics

Overview

In an earlier blog post about IoT Analytics, we discussed how AWS IoT Analytics enables you to collect, visualize, process, query and store large amounts of time series data generated from your connected devices. In this blog post, I will show how you can use your custom application to interact with AWS IoT Analytics. Specifically, we will discuss –

  • How to bring your custom code as a Docker image to AWS IoT Analytics
  • How the Docker-based application can process batch data on a schedule

Scenario

We’ll use the example of a smart building in NYC that uses AWS IoT to become energy efficient. The building is equipped with sensor-enabled conference rooms, phone booths, and offices. The telemetry data from the sensors across different floors and rooms is analyzed at regular intervals against the corporate meeting calendar data to identify whether the rooms are in use. If not, the lights and air conditioning in those rooms are turned off.

This is how the flow works –

  1. Telemetry data from the various sensors is published from the building through a device gateway to an AWS IoT Core topic (for example, building/sensors) in the cloud. (A sample telemetry message is sketched after this list.)
  2. AWS IoT Core in turn uses the rules engine to route the data to a data store in AWS IoT Analytics.
  3. The data in the data store is analyzed by a containerized application integrated with AWS IoT Analytics. In this blog, we will use it to determine how many rooms have the lights and AC turned on.
  4. The result set is then validated against the corporate calendar to determine whether any of the rooms are reserved for this time period or whether energy can be conserved by turning off the lights or AC. The output is stored in a landing zone for further actions.
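
To make the data shape concrete, here is a minimal sketch of publishing one telemetry sample with boto3. The topic follows the flow above, and the field names (floor_id, room_id, day, time, light, hvac) are inferred from the SQL query used later in this post; your lab environment may use a different topic (for example, workshop/telemetry) and payload shape.

import json
import boto3

# Assumes AWS credentials and a default region are configured.
iot_data = boto3.client("iot-data")

sample_reading = {
    "floor_id": 2,
    "room_id": "conf-201",   # hypothetical room identifier
    "day": "2019-07-15",
    "time": "14:30",
    "light": 1,              # assumed encoding: 1 = on, 0 = off
    "hvac": 1,
}

iot_data.publish(
    topic="building/sensors",
    qos=1,
    payload=json.dumps(sample_reading),
)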

Pre-Requisites

This blog assumes that the reader has completed the following steps prior to moving to the Solution section:

  • Create an SSH key pair to be able to log in to the EC2 instance
    An SSH key pair can be generated or imported in the AWS console under EC2 -> Key Pairs
  • Launch the CloudFormation template to provision the environment for this lab –
  • Check that telemetry data is being published once the CloudFormation stack has been created successfully
    • Go to the IoT Core console and click Test in the left pane –
      • Subscription topic: workshop/telemetry
      • Click Subscribe to topic

Solution

Create SQL Data Set

A SQL data set is similar to a materialized view from a SQL database. We will create a SQL data set below that will store the telemetry data generated in the section above.

On the IoT Analytics console home page, in the left navigation pane, choose Analyze.

  • Click on Data sets → Create
  • Choose SQL Data Sets → Create SQL
  • Choose an ID for the SQL data set and select the data store:
    • ID → mydataset, Data Store Source → mydatastore, click Next
  • Paste the SQL below into the Query window, click Next
SELECT floor_id, room_id, day, time FROM mydatastore WHERE light = 1 AND hvac = 1
  • Keep the delta selection window as None (default), click Next
  • Schedule the data set to run every hour
    • Choose Minute of hour – 0, click Next
  • Keep the default retention for the data set (Indefinitely) and click Create Data set. (A boto3 equivalent of these steps is sketched below.)
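
If you prefer to script this step, the console clicks above map to a single CreateDataset call. A minimal boto3 sketch, assuming the data store mydatastore from the CloudFormation stack already exists:

import boto3

iota = boto3.client("iotanalytics")

iota.create_dataset(
    datasetName="mydataset",
    actions=[{
        "actionName": "sql_action",
        "queryAction": {
            "sqlQuery": (
                "SELECT floor_id, room_id, day, time "
                "FROM mydatastore WHERE light = 1 AND hvac = 1"
            ),
        },
    }],
    # Hourly at minute 0, matching the schedule chosen in the console.
    triggers=[{"schedule": {"expression": "cron(0 * * * ? *)"}}],
    retentionPeriod={"unlimited": True},  # keep results indefinitely
)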

Query Data

Click on the Data Set you just created.

  • On the data set page, in the upper-right corner, choose Actions, and then choose Run now
  • It can take a few minutes for the data set to show results. Check for SUCCEEDED under the name of the data set in the upper left corner. The Content section (left pane) contains the query results. (A scripted version of this run is sketched below.)
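
The same run-and-check loop can be scripted. A short boto3 sketch that triggers the data set (the equivalent of Actions → Run now) and polls until the content is ready:

import time
import boto3

iota = boto3.client("iotanalytics")

run = iota.create_dataset_content(datasetName="mydataset")
version = run["versionId"]

# Poll until the run finishes; states are CREATING, SUCCEEDED, or FAILED.
while True:
    content = iota.get_dataset_content(datasetName="mydataset", versionId=version)
    state = content["status"]["state"]
    if state in ("SUCCEEDED", "FAILED"):
        break
    time.sleep(15)

print(state)
for entry in content.get("entries", []):
    print(entry["dataURI"])  # short-lived presigned URL to the CSV results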

Create Custom Container for Analysis

Please follow the instructions below to create the Docker image with your custom application code –

1. SSH to the EC2 instance (copy the command from the CloudFormation output tab), navigate to the Docker directory, and upload the calendar data to your S3 bucket:

cd ~/docker-setup
aws s3 cp ./calendar.csv s3://<paste-s3-bucket-name-from-cloudformation-output> 

2. Build the Docker image:

docker build -t container-app-ia .

3. You should see a new image in your local Docker repository. Verify it by running:

docker image ls | grep container-app-ia

4. Create a new repository in ECR:

aws ecr create-repository --repository-name container-app-ia

Please copy the repositoryUri from the output for use in steps 7 and 8.

5. Get the login command for your Docker environment (shown for AWS CLI v1; on AWS CLI v2, use aws ecr get-login-password with docker login instead):

aws ecr get-login --no-include-email

6. Copy the output and run it. The output should look something like:

docker login -u AWS -p <password> https://<your-aws-account-id>.dkr.ecr.<region>.amazonaws.com

7. Tag the image you created with the ECR repository URI:

docker tag container-app-ia:latest <<paste repositoryUri copied earlier>>:latest

8. Push the image to ECR:

docker push <<paste repositoryUri copied earlier>>

Create the Container Data Set

A container data set allows you to automatically run your analysis tools and generate results. It brings together a SQL data set as input, a Docker container with your analysis tools and required library files, input and output variables, and an optional schedule trigger. The input and output variables tell the executable image where to get the data and store the results. An illustrative sketch of such a container’s entrypoint follows.
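
For orientation, here is a purely illustrative sketch of what such a container’s entrypoint could look like. It assumes the params file location documented for IoT Analytics container data sets (/opt/ml/input/data/iotanalytics/params), the variable names configured below (datasetv, resulturi, inputDataS3BucketName, inputDataS3Key), and hypothetical column names; verify all of these against the AWS IoT Analytics documentation and your own data before relying on it.

import json
import boto3
import pandas as pd

# IoT Analytics mounts the configured variables into the container as a
# JSON params file; the exact path and key names below are assumptions
# to verify against the documentation.
with open("/opt/ml/input/data/iotanalytics/params") as f:
    params = json.load(f)
variables = params["Variables"]

iota = boto3.client("iotanalytics")
s3 = boto3.client("s3")

# Fetch the content version of the upstream SQL data set ("datasetv").
content = iota.get_dataset_content(
    datasetName="mydataset", versionId=variables["datasetv"]
)
telemetry = pd.read_csv(content["entries"][0]["dataURI"])

# Fetch the corporate calendar uploaded to S3 earlier.
s3.download_file(
    variables["inputDataS3BucketName"], variables["inputDataS3Key"],
    "/tmp/calendar.csv",
)
calendar = pd.read_csv("/tmp/calendar.csv")

# Rooms with lights/AC on but no matching reservation are candidates for
# energy savings. The join keys are hypothetical.
merged = telemetry.merge(
    calendar, on=["floor_id", "room_id", "day", "time"],
    how="left", indicator=True,
)
merged["can_conserve"] = (merged["_merge"] == "left_only").map(
    {True: "yes", False: "no"}
)

# Write the enriched result locally; how it is uploaded (via the
# "resulturi" output-file variable) is defined by the container contract,
# so check the docs for the exact mechanism.
merged.drop(columns="_merge").to_csv("output.csv", index=False)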

On the IoT Analytics console home page, in the left navigation pane, choose Analyze.

  • Click on Data Sets → Create
  • Choose Container Data Sets → Create Container
  • Choose a unique ID for the Container Data Set → container_dataset, click Next
  • Choose the option → Link an existing data set’s query → Link
  • Select a trigger for your analysis → Choose mydataset → Schedule will be automatically populated, click Next
  • Select from your ECR Repository → Choose the repository container-app-ia
  • Select your image → Choose the image with latest tag


  • Configure the input variables (as below) → Click Next:
Name                   | Type            | Value
datasetv               | Content version | mydataset
resulturi              | Output file     | output.csv
inputDataS3BucketName  | String          | <<paste s3 bucket name from cloudformation output>>
inputDataS3Key         | String          | calendar.csv
  • Select a Role → Choose the IAM Role → search for and select iotAContainerRole
  • Configure the capacity for the container:
    • Compute Resource: 4 vCPUs and 16 GiB Memory
    • Volume size (GB): 1
  • Configure the retention of your results → Keep the default (Indefinitely) and click Create Data set. (A boto3 equivalent of these steps is sketched below.)
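
As with the SQL data set, these console steps map to a CreateDataset call, this time with a containerAction. A hedged boto3 sketch; the image URI, role ARN, and bucket name are placeholders to replace with your own values:

import boto3

iota = boto3.client("iotanalytics")

iota.create_dataset(
    datasetName="container_dataset",
    actions=[{
        "actionName": "container_action",
        "containerAction": {
            "image": "<your-aws-account-id>.dkr.ecr.<region>.amazonaws.com/container-app-ia:latest",
            "executionRoleArn": "arn:aws:iam::<your-aws-account-id>:role/iotAContainerRole",
            # ACU_1 corresponds to 4 vCPUs / 16 GiB, as chosen in the console.
            "resourceConfiguration": {"computeType": "ACU_1", "volumeSizeInGB": 1},
            "variables": [
                {"name": "datasetv",
                 "datasetContentVersionValue": {"datasetName": "mydataset"}},
                {"name": "resulturi",
                 "outputFileUriValue": {"fileName": "output.csv"}},
                {"name": "inputDataS3BucketName",
                 "stringValue": "<your-s3-bucket-name>"},
                {"name": "inputDataS3Key", "stringValue": "calendar.csv"},
            ],
        },
    }],
    # Run whenever the upstream SQL data set produces new content.
    triggers=[{"dataset": {"name": "mydataset"}}],
)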

Query & Validate Container Data Set

On the IoT Analytics console home page, in the left navigation pane, choose Analyze.

  • Select Data Set → container_dataset
  • On the data set page, in the upper-right corner, choose Actions, and then choose Run now
  • It can take a few minutes for the data set to show results. Check for SUCCEEDED under the name of the data set in the upper left corner. Check that the output file exists under the Content tab (left pane), or find it in your S3 bucket, and download it.

Once downloaded, the output data should be similar to the sample below –
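
Because the exact columns depend on your container code, the rows below are a purely hypothetical illustration, consistent with the sketches earlier in this post:

floor_id,room_id,day,time,can_conserve
2,conf-201,2019-07-15,14:30,yes
3,office-310,2019-07-15,14:30,no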

You have now completed the workflow to ingest telemetry data from your smart building, enrich the data using your custom Docker code to gain insight into the availability of rooms, and store the processed data in a landing zone for further actions, such as turning off the lights and AC in the free rooms.

Clean Up

Please follow the instructions below to clean up the resources created as part of this blog –

  • SSH to the EC2 instance and navigate to the clean-up directory:
cd ~/clean-up
./clean-up.sh
Enter name of the device > smart-building
Enter device type > sensors
Enter S3 bucket > <<paste your s3 bucket name>>
  • Navigate to the AWS Console -> Choose CloudFormation -> Select and delete the stack created earlier
  • Navigate to the AWS Console -> Choose ECS -> Select Repositories (left pane) -> Delete the ECR repository “container-app-ia”

Troubleshooting

  • If the SQL data set execution fails, run the command below to check the error for the respective version –
    • SSH to the EC2 instance and run:
aws iotanalytics list-dataset-contents --dataset-name mydataset
  • If the container data set execution fails, you can check the logs in –
    • CloudWatch -> Log Groups → /aws/sagemaker/TrainingJobs
  • For any other issues, please refer to here – Link

Conclusion

AWS IoT Analytics enabled you to use your custom code in a container to analyze, process, and enrich sensor data with calendar data. You now have the output data files in Amazon S3, where you can perform analytics, visualization, or device shadow updates. A common pattern is to trigger an AWS Lambda function once a file is uploaded to Amazon S3. The Lambda function can analyze the file to identify the free rooms and update the device shadows for the respective devices registered in AWS IoT Core, which can then turn off the lights and AC for those rooms in the building (a sketch of this pattern follows). I hope you found the information in this post helpful. Please feel free to leave questions or other feedback in the forum.
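
As a closing illustration, here is a hedged sketch of that Lambda pattern: triggered by the S3 upload of output.csv, it updates the device shadow of each free room so the devices can turn off the lights and AC. The thing-naming convention, CSV columns, and shadow document shape are all assumptions for illustration only.

import csv
import io
import json
import boto3

s3 = boto3.client("s3")
iot_data = boto3.client("iot-data")

def handler(event, context):
    # S3 put event for the container data set's output file.
    record = event["Records"][0]["s3"]
    obj = s3.get_object(Bucket=record["bucket"]["name"],
                        Key=record["object"]["key"])
    rows = csv.DictReader(io.StringIO(obj["Body"].read().decode("utf-8")))

    for row in rows:
        if row.get("can_conserve") != "yes":  # hypothetical column
            continue
        # Hypothetical convention: one IoT thing registered per room.
        thing_name = f"room-{row['floor_id']}-{row['room_id']}"
        iot_data.update_thing_shadow(
            thingName=thing_name,
            payload=json.dumps({"state": {"desired": {"light": 0, "hvac": 0}}}),
        )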