AWS Architecture Blog

Creating scalable architectures with AWS IoT Greengrass stream manager

Designing a scalable, global, real-time, distributed system to process millions of messages from a variety of critical devices can complicate architectures. Collecting large data streams or performing image recognition at the edge also requires scalable solutions.

AWS IoT Core is designed to handle large numbers of Internet of things (IoT) devices sending a few messages per second. However, when IoT devices send large numbers of messages per second and processing occurs at the edge, managing large data streams is challenging. Further, data that is buffered or processed at the edge can increase latency.

This post describes how to create a scalable IoT architecture with AWS IoT Greengrass stream manager that can handle thousands of critical messages per second from a variety of IoT devices.

Relaying critical messages from high-throughput IoT devices

Let’s explore an example. Consider the architecture in Figure 1, where you have two types of IoT devices sending messages to AWS IoT Core. One set of IoT devices sends thousands of messages per second while the other set of devices sends tens of messages per second.

The IoT devices that are sending thousands of messages per second contain critical data that cannot be lost and must be processed at the edge. The IoT devices sending tens of messages per second are of less importance.

Figure 1. IoT devices sending messages to AWS IoT Core

In considering this architecture, the critical IoT devices sending thousands of messages per second take the following path:

  1. IoT devices send data to an AWS IoT Greengrass component for processing with a Quality of Service (QoS) of 0.
  2. The AWS IoT Greengrass component processes the data and sends it to the AWS IoT Greengrass message broker.
  3. AWS IoT Greengrass message broker relays the data to AWS IoT Core with a QoS of 1.
  4. AWS IoT Core sends the data to Amazon Kinesis Data Streams for further processing.

In contrast, the less critical IoT devices that send tens of messages per second take the following path:

  1. IoT devices send data to the AWS IoT Greengrass message broker.
  2. The AWS IoT Greengrass message broker relays the data to AWS IoT Core.
  3. AWS IoT Core sends the data to Kinesis Data Streams for further processing.

Due to the critical nature of the messages being sent, the AWS IoT Greengrass message broker is configured to send messages to AWS IoT Core with a QoS of 1. However, when you set QoS to 1, the AWS IoT Greengrass message broker has to wait for an acknowledgement (ACK) before sending more data.

As they add more IoT devices, many teams choose to batch the messages before sending them to AWS IoT Core. This can be a good strategy when you are dealing with many IoT devices that send a small number of messages per second. But when you are adding IoT devices that send thousands of messages per second, the time spent waiting for an ACK can add latency and cause inconsistencies when reporting the data downstream.

This is because, when QoS is set to 1, the AWS IoT Greengrass message broker can send only 100 messages to AWS IoT Core before it must wait for an ACK. As a result, scaling this architecture to handle additional IoT devices can become challenging.
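To see why this limit constrains scaling, here is a rough back-of-the-envelope sketch. It assumes a fixed window of unacknowledged messages (the 100 mentioned above), that each ACK arrives one round trip after its message is sent, and that bandwidth and processing time are negligible. The function name and the round-trip times are illustrative assumptions, not measurements.

```python
def max_qos1_throughput(window: int, round_trip_s: float) -> float:
    """Upper bound on messages per second when at most `window`
    messages may be unacknowledged at any time.

    Simplified model (an assumption): every ACK arrives exactly one
    round trip after its message was sent, so at most `window`
    messages complete per round trip.
    """
    if window <= 0 or round_trip_s <= 0:
        raise ValueError("window and round_trip_s must be positive")
    return window / round_trip_s


# Illustrative round-trip times between the edge and the cloud.
for rtt_ms in (20, 50, 100, 200):
    ceiling = max_qos1_throughput(100, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms: at most {ceiling:,.0f} messages per second")
```

Under this model the ceiling is shared by every device behind the same broker connection, so adding high-throughput devices pushes the aggregate rate toward the limit regardless of how much local capacity is available.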

For more information about the AWS IoT Greengrass message broker’s limits, refer to the AWS IoT Core message broker and protocol limits and quotas section of the AWS General Reference Guide.

AWS IoT Greengrass stream manager for speed and reliability

To scale this type of architecture, you can use a pre-built AWS IoT Greengrass component called AWS IoT Greengrass stream manager to bypass the Greengrass message broker and AWS IoT Core to send your data directly to AWS IoT Analytics, AWS IoT SiteWise, Amazon Kinesis, or Amazon Simple Storage Service (Amazon S3).

For example, consider the earlier scenario where one set of IoT devices is sending thousands of critical messages per second and another set is sending data of less importance.

Instead of relaying that data through the AWS IoT Greengrass message broker and AWS IoT Core, you can use AWS IoT Greengrass stream manager to create an architecture that can easily and reliably send large amounts of data from the edge directly to Kinesis, as in Figure 2.

Figure 2. AWS IoT Greengrass stream manager sending data directly to Kinesis

As opposed to the Figure 1 configuration, the critical IoT devices that send thousands of messages per second can now take the following path:

  1. Critical IoT devices send data to an AWS IoT Greengrass component for processing.
  2. The AWS IoT Greengrass component processes the data and sends it to AWS IoT Greengrass stream manager.
  3. AWS IoT Greengrass stream manager sends the data directly to Amazon Kinesis Data Streams.
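The path above can be pictured with a small, self-contained model: a component appends messages to a local stream, and an exporter drains that stream in batches, removing a batch only after the destination acknowledges it. This is a toy simulation for illustration, not the stream manager SDK. `LocalStream`, `fake_destination`, and the batch size are hypothetical names and values.

```python
from collections import deque
from itertools import islice
from typing import Callable, Deque, List


class LocalStream:
    """Toy model of an edge stream: append locally, export in batches."""

    def __init__(self, batch_size: int) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._pending: Deque[bytes] = deque()
        self._next_sequence = 0

    def append(self, data: bytes) -> int:
        """Store a message locally and return its sequence number."""
        sequence = self._next_sequence
        self._pending.append(data)
        self._next_sequence += 1
        return sequence

    def pending(self) -> int:
        """Number of messages not yet acknowledged by the destination."""
        return len(self._pending)

    def export(self, send_batch: Callable[[List[bytes]], bool]) -> int:
        """Send pending messages in batches and return how many were exported.

        A batch is removed only after `send_batch` returns True (the ACK),
        so a failed send leaves the data in place for a later retry.
        """
        exported = 0
        while self._pending:
            batch = list(islice(self._pending, self.batch_size))
            if not send_batch(batch):
                break  # no ACK: keep the batch and stop for now
            for _ in batch:
                self._pending.popleft()
            exported += len(batch)
        return exported


# Hypothetical destination standing in for a Kinesis data stream.
received: List[List[bytes]] = []


def fake_destination(batch: List[bytes]) -> bool:
    received.append(batch)
    return True  # simulate a successful ACK


stream = LocalStream(batch_size=3)
for i in range(7):
    stream.append(f"reading-{i}".encode())

print(stream.export(fake_destination))        # 7
print([len(batch) for batch in received])     # [3, 3, 1]
```

Because the producer only appends to a local buffer, it never waits on a cloud round trip; only the exporter does, and it amortizes that wait across a whole batch.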

Note that data from the IoT devices sending a few messages per second can also be sent to AWS IoT Greengrass stream manager at a lower priority. You are still using AWS IoT Core, but it is no longer the main data path. By continuing to use AWS IoT Core, you can benefit from its control plane features such as managing updates, certificates, and policies. However, AWS IoT Core’s data plane features, like Rules Engine, are no longer used in this architecture, which can help reduce costs. If you choose to bypass the AWS IoT Greengrass message broker and use AWS IoT Greengrass stream manager, any components that you have built must be moved so that processing occurs at the edge.

In the architecture in Figure 2, AWS IoT Greengrass stream manager is used to route the main data path away from the AWS IoT Greengrass message broker and AWS IoT Core. Bypassing these services removes the latency in Figure 1 that is caused by the AWS IoT Greengrass message broker waiting for an ACK from AWS IoT Core.

AWS IoT Greengrass stream manager can handle thousands of messages per second so you can:

  • Reliably scale your architecture
  • Create multiple data paths to send both critical and non-critical data to AWS IoT Greengrass stream manager while still leveraging AWS IoT Core control plane features
  • Prioritize your critical data paths to have specific IoT devices take higher priority
  • Create configurations to handle situations where you have IoT devices with limited or intermittent connectivity. (For example, you can create configurations to use local storage or memory to cache your data when internet connectivity is lost, and then flush data when an ACK is received from the destination.)
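As a rough illustration of the last two points, the sketch below models per-stream priorities and bounded local buffering during a connectivity outage. It is a simplified simulation, not stream manager's implementation. `EdgeBuffer`, the stream names, the "lower number means higher priority" ordering, and the "drop the oldest message when full" policy are all assumptions chosen for the example.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


class EdgeBuffer:
    """Toy model: bounded per-stream buffers flushed in priority order."""

    def __init__(self, max_messages: int) -> None:
        self._max_messages = max_messages
        # stream name -> (priority, buffered messages)
        self._streams: Dict[str, Tuple[int, Deque[str]]] = {}

    def add_stream(self, name: str, priority: int) -> None:
        # Assumption: a lower number means a higher priority.
        # deque(maxlen=...) silently discards the oldest entry when full.
        self._streams[name] = (priority, deque(maxlen=self._max_messages))

    def append(self, name: str, message: str) -> None:
        self._streams[name][1].append(message)

    def flush(self, online: bool) -> List[str]:
        """Return messages in delivery order, or nothing while offline."""
        if not online:
            return []  # keep everything buffered until connectivity returns
        delivered: List[str] = []
        for _, queue in sorted(self._streams.values(), key=lambda s: s[0]):
            while queue:
                delivered.append(queue.popleft())
        return delivered


buffer = EdgeBuffer(max_messages=3)
buffer.add_stream("critical", priority=1)
buffer.add_stream("telemetry", priority=5)

buffer.append("telemetry", "t0")
for i in range(4):  # one more than the buffer holds, so "c0" is dropped
    buffer.append("critical", f"c{i}")

print(buffer.flush(online=False))  # []
print(buffer.flush(online=True))   # ['c1', 'c2', 'c3', 't0']
```

The example deliberately overflows the critical buffer to show the trade-off: with a drop-oldest policy, data that must not be lost needs a buffer sized for the longest outage you expect, or a policy that rejects new writes instead.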

All of these features can help you reduce latency, costs, data inconsistencies, and the potential loss of critical data. They also provide a mechanism to scale the number of devices that your architecture can reliably handle.

Get started with stream manager by using the AWS IoT Greengrass Core SDK or the AWS IoT console.

Conclusion

In this blog post, we discussed how to create a scalable IoT architecture that can handle thousands of critical messages per second from a variety of IoT devices. Incorporating AWS IoT Greengrass stream manager into your architecture can help reduce latency, data inconsistencies, and the potential loss of critical data by providing a way to bypass AWS IoT Core and send large amounts of data efficiently and reliably.

Neil Mehta

Neil Mehta is an AWS Solutions Architect on the SMB Greenfield team. He has a passion for helping customers build scalable solutions that are tailored to their specific needs. In his spare time, Neil enjoys spending time with his family and rooting for his local Washington, DC sports teams.