AWS Architecture Blog

Field Notes: Ingest and Visualize Your Flat-file IoT Data with AWS IoT Services

Customers who maintain manufacturing facilities often find it challenging to ingest, centralize, and visualize IoT data that is emitted in flat-file format from their factory equipment. While modern IoT-enabled industrial devices can communicate over standard protocols like MQTT, there are still some legacy devices that generate useful data but are only capable of writing it locally to a flat file. This results in siloed data that is either analyzed in a vacuum without the broader context, or it is not available to business users to be analyzed at all.

AWS provides a suite of IoT and Edge services that can be used to solve this problem. In this blog, I walk you through one method of leveraging these services to ingest hard-to-reach data into the AWS cloud and extract business value from it.

Overview of solution

This solution provides a working example of an edge device running AWS IoT Greengrass with an AWS Lambda function that watches a Samba file share for new .csv files (presumably containing device or assembly line data). When it finds a new file, it will transform it to JSON format and write it to AWS IoT Core. The data is then sent to AWS IoT Analytics for processing and storage, and Amazon QuickSight is used to visualize and gain insights from the data.

Samba file share solution diagram

Since we don’t have an actual on-premises environment to use for this walkthrough, we’ll simulate pieces of it:

  • In place of the legacy factory equipment, an EC2 instance running Windows Server 2019 will generate data in .csv format and write it to the Samba file share.
    • We’re using a Windows Server for this function to demonstrate that the solution is platform-agnostic. As long as the flat file is written to a file share, AWS IoT Greengrass can ingest it.
  • An EC2 instance running Amazon Linux will act as the edge device and will host AWS IoT Greengrass Core and the Samba share.
    • In the real world, these could be two separate devices, and the device running AWS IoT Greengrass could be as small as a Raspberry Pi.

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • An AWS Account
  • Access to provision and delete AWS resources
  • Basic knowledge of Windows and Linux server administration
  • If you’re unfamiliar with AWS IoT Greengrass concepts like Subscriptions and Cores, review the AWS IoT Greengrass documentation for a detailed description.

Walkthrough

First, we’ll show you the steps to launch the AWS IoT Greengrass resources using AWS CloudFormation. The AWS CloudFormation template is derived from the template provided in this blog post. Review the post for a detailed description of the template and its various options.

  1. Create a key pair. This will be used to access the EC2 instances created by the CloudFormation template in the next step.
  2. Launch a new AWS CloudFormation stack in the N. Virginia (us-east-1) Region using iot-cfn.yml, which represents the simulated environment described in the preceding bullet.

    •  Parameters:
      • Name the stack IoTGreengrass.
      • For EC2KeyPairName, select the EC2 key pair you just created from the drop-down menu.
      • For SecurityAccessCIDR, use your public IP with a /32 CIDR (i.e. 1.1.1.1/32).
      • You can also accept the default of 0.0.0.0/0 if you can have SSH and RDP open to all sources on the EC2 instances in this demo environment.
      • Accept the defaults for the remaining parameters.
  •  View the Resources tab after stack creation completes. The stack creates the following resources:
    • A VPC with two subnets, two route tables with routes, an internet gateway, and a security group.
    • Two EC2 instances, one running Amazon Linux and the other running Windows Server 2019.
    • An IAM role, policy, and instance profile for the Amazon Linux instance.
    • A Lambda function called GGSampleFunction, which we’ll update with code to parse our flat-files with AWS IoT Greengrass in a later step.
    • An AWS IoT Greengrass Group, Subscription, and Core.
    • Other supporting objects and custom resource types.
  • View the Outputs tab and copy the IPs somewhere easy to retrieve. You’ll need them for multiple provisioning steps below.

3. Review the AWS IoT Greengrass resources created on your behalf by CloudFormation:

    • Search for IoT Greengrass in the Services drop-down menu and select it.
    • Click Manage your Groups.
    • Click file_ingestion.
    • Navigate through the SubscriptionsCores, and other tabs to review the configurations.

Leveraging a device running AWS IoT Greengrass at the edge, we can now interact with flat-file data that was previously difficult to collect, centralize, aggregate, and analyze.

Set up the Samba file share

Now, we set up the Samba file share where we will write our flat-file data. In our demo environment, we’re creating the file share on the same server that runs the Greengrass software. In the real world, this file share could be hosted elsewhere as long as the device that runs Greengrass can access it via the network.

  • Follow the instructions in setup_file_share.md to set up the Samba file share on the AWS IoT Greengrass EC2 instance.
  • Keep your terminal window open. You’ll need it again for a later step.

Configure Lambda Function for AWS IoT Greengrass

AWS IoT Greengrass provides a Lambda runtime environment for user-defined code that you author in AWS Lambda. Lambda functions that are deployed to an AWS IoT Greengrass Core run in the Core’s local Lambda runtime. In this example, we update the Lambda function created by CloudFormation with code that watches for new files on the Samba share, parses them, and writes the data to an MQTT topic.

  1. Update the Lambda function:
    • Search for Lambda in the Services drop-down menu and select it.
    • Select the file_ingestion_lambda function.
    • From the Function code pane, click Actions then Upload a .zip file.
    • Upload the provided zip file containing the Lambda code.
    • Select Actions > Publish new version > Publish.

2. Update the Lambda Alias to point to the new version.

    • Select the Version: X drop-down (“X” being the latest version number).
    • Choose the Aliases tab and select gg_file_ingestion.
    • Scroll down to Alias configuration and select Edit.
    • Choose the newest version number and click Save.
    • Do NOT use $LATEST as it is not supported by AWS IoT Greengrass.

3. Associate the Lambda function with AWS IoT Greengrass.

    • Search for IoT Greengrass in the Services drop-down menu and select it.
    • Select Groups and choose file_ingestion.
    • Select Lambdas > Add Lambda.
    • Click Use existing Lambda.
    • Select file_ingestion_lambda > Next.
    • Select Alias: gg_file_ingestion > Finish.
    • You should now see your Lambda associated with the AWS IoT Greengrass group.
    • Still on the Lambda function tab, click the ellipsis and choose Edit configuration.
    • Change the following Lambda settings then click Update:
      • Set Containerization to No container (always).
      • Set Timeout to 25 seconds (or longer if you have large files to process).
      • Set Lambda lifecycle to Make this function long-lived and keep it running indefinitely.

Deploy AWS IoT Greengrass Group

  1. Restart the AWS IoT Greengrass daemon:
    • A daemon restart is required after changing containerization settings. Run the following commands on the Greengrass instance to restart the AWS IoT Greengrass daemon:
 cd /greengrass/ggc/core/

 sudo ./greengrassd stop

 sudo ./greengrassd start

2. Deploy the AWS IoT Greengrass Group to the Core device.

    • Return to the file_ingestion AWS IoT Greengrass Group in the console.
    • Select Actions Deploy.
    • Select Automatic detection.
    • After a few minutes, you should see a Status of Successfully completed. If the deployment fails, check the logs, fix the issues, and deploy again.

Generate test data

You can now generate test data that is ingested by AWS IoT Greengrass, written to AWS IoT Core, and then sent to AWS IoT Analytics and visualized by Amazon QuickSight.

  1. Follow the instructions in generate_test_data.md to generate the test data.
  2. Verify that the data is being written to AWS IoT Core following these instructions (Use iot/data for the MQTT Subscription Topic instead of hello/world).

screenshot

Setup AWS IoT Analytics

Now that our data is in IoT Cloud, it only takes a few clicks to configure AWS IoT Analytics to process, store, and analyze our data.

  1. Search for IoT Analytics in the Services drop-down menu and select it.
  2. Set Resources prefix to file_ingestion and Topic to iot/data. Click Quick Create.
  3. Populate the data set by selecting Data sets > file_ingestion_dataset >Actions > Run now. If you don’t get data on the first run, you may need to wait a couple of minutes and run it again.

Visualize the Data from AWS IoT Analytics in Amazon QuickSight

We can now use Amazon QuickSight to visualize the IoT data in our AWS IoT Analytics data set.

  1. Search for QuickSight in the Services drop-down menu and select it.
  2. If your account is not signed up for QuickSight yet, follow these instructions to sign up (use Standard Edition for this demo)
  3. Build a new report:
    • Click New analysis > New dataset.
    • Select AWS IoT Analytics.
    • Set Data source name to iot-file-ingestion and select file_ingestion_dataset. Click Create data source.
    • Click Visualize. Wait a moment while your rows are imported into SPICE.
    • You can now drag and drop data fields onto field wells. Review the QuickSight documentation for detailed instructions on creating visuals.
    • Following is an example of a QuickSight dashboard you can build using the demo data we generated in this walkthrough.

Cleaning up

Be sure to clean up the objects you created to avoid ongoing charges to your account.

  • In Amazon QuickSight, Cancel your subscription.
  • In AWS IoT Analytics, delete the datastore, channel, pipeline, data set, role, and topic rule you created.
  • In CloudFormation, delete the IoTGreengrass stack.
  • In Amazon CloudWatch, delete the log files associated with this solution.

Conclusion

Gaining valuable insights from device data that was once out of reach is now possible thanks to AWS’s suite of IoT services. In this walkthrough, we collected and transformed flat-file data at the edge and sent it to IoT Cloud using AWS IoT Greengrass. We then used AWS IoT Analytics to process, store, and analyze that data, and we built an intelligent dashboard to visualize and gain insights from the data using Amazon QuickSight. You can use this data to discover operational anomalies, enable better compliance reporting, monitor product quality, and many other use cases.

For more information on AWS IoT services, check out the overviews, use cases, and case studies on our product page. If you’re new to IoT concepts, I’d highly encourage you to take our free Internet of Things Foundation Series training.

Field Notes provides hands-on technical guidance from AWS Solutions Architects, consultants, and technical account managers, based on their experiences in the field solving real-world business problems for customers.
TAGS:
Paul Ramsey

Paul Ramsey

Paul Ramsey is a Senior AWS Solutions Architect for Enterprise Accounts based out of Dallas, Texas. His interests and experience include Databases, Serverless Technology, and Containers. Outside of work, you can find Paul playing his guitar or piano, binge-watching a new show on Amazon Prime, or playing chess.

Ronnie Eichler

Ronnie Eichler

Ronnie Eichler is a Senior Global Solutions Architect at AWS in Dallas, Texas.