When data is all you need: An overview of IoT communication with the cloud

Technical and marketing teams working on Internet of Things (IoT) programs sooner or later manage a project that requires data flow between a fleet of devices and the cloud. This data is critical because marketing wants to provide more features to users, business teams require data-driven decisions, and technical teams work to optimize connectivity to an existing device fleet. All these reasons align around improving the customer experience. This blog post discusses the initial stages of an IoT project and presents the options available for communication between the device and the cloud, from well-known solutions to less standard approaches. It also provides concrete guidance for selecting the appropriate communication service(s) based on your requirements and project constraints, and for avoiding some common mistakes that compromise cost, scope, and duration.

IoT device and device data

Before I started working on IoT projects, I had a device-centric view of IoT. The connected device is the key IoT component that interacts with the real world through sensors and actuators. However, it’s only one part of the solution; another part is the data. In some projects, the device data is all you need. In most IoT projects, the first technical discussion focuses on how data will flow between the device and the cloud, and which communication protocols are needed. As usual, it depends. Through my experience of working on different projects, prototypes, and sectors, I’ve learned that you don’t have to use only one protocol. Selecting the appropriate communication protocols for each project can be a discovery journey. The key to identifying the protocol(s) is to break the discussion into the following four system constraints:

Device: What are the physical device constraints, such as memory, available communication interfaces, computational capacity, and power consumption?
Data: What are the different types of data collected on the device? How much data is collected (volume) for each type of data? Will the data flow be bidirectional or unidirectional?
Cost: What’s the data transmission cost for each type of data? Is it worth the cost to have the data in the cloud as soon as possible?
Security: It’s not enough to send data from and to the device. Communication needs to be managed through a secure method that supports authentication, authorization, validation, and privacy policies. The security capabilities must be considered as foundational requirements during analysis and when selecting the communication protocol.
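The data and cost questions above can be made concrete with some simple arithmetic. The following is a minimal sketch, not a pricing tool: the DataType class, the example message sizes and rates, and the 730-hour month are all illustrative assumptions, and the result is a raw payload volume that ignores protocol overhead.

```python
from dataclasses import dataclass


@dataclass
class DataType:
    """One kind of data produced by a device (hypothetical model for this sketch)."""
    name: str
    payload_bytes: int        # assumed size of one message payload
    messages_per_hour: float  # assumed sending rate per device


def monthly_megabytes(data_type: DataType, devices: int, hours_per_month: int = 730) -> float:
    """Estimate the raw payload volume per month for a fleet, in megabytes (10^6 bytes)."""
    total_bytes = (
        data_type.payload_bytes
        * data_type.messages_per_hour
        * hours_per_month
        * devices
    )
    return total_bytes / 1_000_000


# Illustrative numbers only: frequent telemetry versus an hourly status message.
telemetry = DataType("telemetry", payload_bytes=200, messages_per_hour=60)
status = DataType("status", payload_bytes=50, messages_per_hour=1)

for data_type in (telemetry, status):
    print(data_type.name, monthly_megabytes(data_type, devices=1000), "MB/month")
```

Comparing the volume per data type in this way helps decide which data is worth sending immediately and which can be batched or sent on demand.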

Note: Each communication protocol discussed in this post can implement different authentication mechanisms, such as X.509 certificates, custom authorizers, and federation.

The MQTT protocol

MQTT is a standard messaging protocol for IoT projects. MQTT is a bidirectional, lightweight, and scalable protocol. It’s also a high-level, application layer protocol (similar to HTTP, but with different characteristics) and is extensively supported in many libraries and programming languages.

TCP/IP protocol stack diagram showing four layers (Application, Transport, Internet, Network Interface) with HTTP and MQTT protocols using TLS and TCP for secure communication.

HTTP and MQTT protocols in the TCP/IP model

MQTT follows the publish-subscribe communication model, where the broker coordinates the communication with the clients. A basic MQTT message contains two main components: the topic, which is the hierarchical identification of what the message contains, and the payload, which can be provided in different formats, including JSON, binary, or text.
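As a minimal sketch of the two message components described above, the following builds a hierarchical topic and a JSON payload using only the Python standard library. The site/device/kind topic layout and the build_message helper are assumptions made for illustration; an MQTT client library would take the resulting topic and payload in its publish call.

```python
import json


def build_message(site: str, device_id: str, kind: str, body: dict) -> tuple[str, bytes]:
    """Return (topic, payload) for one message.

    The topic hierarchy site/device/kind is a hypothetical layout chosen
    for this sketch, not a required convention.
    """
    topic = "/".join([site, device_id, kind])
    # Compact JSON keeps the payload small; binary or text would also be valid.
    payload = json.dumps(body, separators=(",", ":")).encode("utf-8")
    return topic, payload


topic, payload = build_message("plant1", "pump-42", "telemetry", {"temp": 21.5})
print(topic, payload)
```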

If the project requires a communication channel to send and receive messages between the device and the cloud, MQTT is well suited. With MQTT, you can send data or device status to the cloud and receive requests and messages from the cloud. While maintaining a simple and flexible design, MQTT offers native functionality that can simplify the software application. For example, a well-designed topic level structure enables efficient control of the messages that a device can publish or receive. For more information, see Designing MQTT Topics for AWS IoT Core.
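To illustrate why the topic structure matters, here is a simplified sketch of how MQTT topic filters with wildcards select messages: + matches exactly one topic level, and # matches all remaining levels. The topic_matches function and the example topics are assumptions for illustration. In practice the broker performs this matching, and this sketch omits special cases such as topics that start with $.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT topic filter matching.

    '+' matches exactly one level; '#' matches any remaining levels and
    is only valid as the last level of the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only in the last position.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    return len(filter_levels) == len(topic_levels)


# Hypothetical topics following a site/device/kind hierarchy.
print(topic_matches("plant1/+/telemetry", "plant1/pump-42/telemetry"))
print(topic_matches("plant1/pump-42/#", "plant1/pump-42/cmd/reboot"))
```

With a hierarchy like this, a device can subscribe only to its own command topics, while a cloud application subscribes to telemetry from every device at a site.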

The AWS IoT Core service supports MQTT, MQTT5, and MQTT over WebSocket protocols. AWS IoT Core also acts as an MQTT broker and treats the devices as clients. AWS IoT Core offers a wide range of additional key features and services. For example, it offers mechanisms to automate device provisioning and to control static or dynamic groups of devices (jobs) based on their type, properties, and tags. AWS IoT Core also supports transitioning from single device operations to organizing and managing a device fleet.

Architecture diagram showing IoT devices connecting to AWS IoT Core via MQTT protocol with X.509 certificates, routing messages through topics and rules to trigger actions across multiple AWS services.

MQTT communication with AWS IoT Core

Data streams and MQTT

MQTT messages from the device typically contain device measurements, status, events, control data, or configuration data. The protocol is flexible enough to include one or multiple data payloads within the same message. For example, a message may include a single event. Alternatively, the payload may be a JSON object that contains heterogeneous device measurements and device status at a specific time. There are other occasions where stream-based communication may be preferable to managing multiple messages. One common use case is related to data stored or cached locally on the device’s non-volatile memory. The device may send this data at regular intervals, or on demand based on a request. Streams are also commonly used to send high volumes of near real-time data, for example, raw measurement data sent from different devices for processing and analysis in the cloud.
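When locally cached data is sent in bulk, it usually has to be split into batches that fit a maximum payload size. The following sketch shows one way to do that with the standard library. The batch_records function, the record format, and the 64-byte limit are illustrative assumptions; use the actual limit of the service you send to.

```python
import json
from typing import Iterable, Iterator


def batch_records(records: Iterable[dict], max_bytes: int) -> Iterator[list[dict]]:
    """Group records into batches whose compact JSON array fits in max_bytes.

    A single record larger than max_bytes is still emitted alone, so the
    caller has to handle oversized records separately.
    """
    batch: list[dict] = []
    size = 2  # the enclosing "[" and "]"
    for record in records:
        encoded = json.dumps(record, separators=(",", ":")).encode("utf-8")
        record_size = len(encoded) + 1  # +1 for the separating comma
        if batch and size + record_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(record)
        size += record_size
    if batch:
        yield batch


# Hypothetical cached measurements: a timestamp index and a value.
cached = [{"t": i, "v": i * 0.5} for i in range(10)]
batches = list(batch_records(cached, max_bytes=64))
print([len(b) for b in batches])
```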

AWS architecture diagram showing data flow from device applications through Amazon Kinesis services to cloud storage, analysis, and visualization layers.

Data or video streams

Amazon Kinesis services support data or video stream ingestion, processing, and analysis. A frequent use case is streaming data from the device to Amazon Kinesis Data Streams. For more information, see Best practices for ingesting data from devices using AWS IoT Core and/or Amazon Kinesis. The two communication channels, MQTT messages and data streams, are often used on the same device to cover different cloud communication requirements.

The message sending only pattern

Some projects require a lightweight, one-directional communication layer from the device to the cloud. It isn’t always feasible to establish bidirectional communication between the device and the cloud due to application, device, or project constraints. The communication layer could also be implemented this way because the system was developed by a third party and it may not be possible to add new functionality.

This one-directional pattern is commonly used when the device sends status updates or measurements, and the cloud responds with an acknowledgement. You can use different services to support this pattern in IoT, such as AWS IoT Core, Amazon API Gateway, or AWS AppSync. Because the device only publishes, it must poll for cloud data updates. This means features like device disconnection detection require extra implementation work, unlike in other protocols where these features are built in.
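As an example of the extra implementation work mentioned above, a cloud application can infer disconnection from the time of each device's last message. This is a minimal sketch under stated assumptions: the offline_devices function, the last_seen dictionary, and the 60-second timeout are hypothetical, and a real system would persist the timestamps and tune the timeout to the reporting interval.

```python
def offline_devices(last_seen: dict[str, float], now: float, timeout_seconds: float) -> list[str]:
    """Return the IDs of devices whose last message is older than the timeout.

    last_seen maps a device ID to the time (in seconds) of its latest message.
    """
    return sorted(
        device_id
        for device_id, timestamp in last_seen.items()
        if now - timestamp > timeout_seconds
    )


# Hypothetical timestamps in seconds: device "b" has been silent for 90 seconds.
last_seen = {"a": 100.0, "b": 40.0}
print(offline_devices(last_seen, now=130.0, timeout_seconds=60))
```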

Architecture diagram showing how a device application communicates with AWS Cloud services through Amazon API Gateway, which routes HTTP requests to AWS Lambda, DynamoDB, Amazon SQS, and other HTTP endpoints, with bidirectional request and response flows.

Request-only using HTTP

When MQTT is not a feasible option, you can use the HTTPS protocol and leverage the response message to receive data from the cloud. Once the data is in AWS, you can use more than 200 AWS managed services to process, analyze, and infuse intelligence into the data.
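The idea of using the HTTPS response to carry data back to the device can be sketched with the Python standard library. Everything specific here is an assumption for illustration: the endpoint URL, the deviceId and measurement fields, and a response body that contains an optional config object. Authentication is omitted for brevity.

```python
import json
import urllib.request


def build_request(endpoint: str, device_id: str, measurement: dict) -> urllib.request.Request:
    """Prepare an HTTPS POST that carries one measurement (hypothetical schema)."""
    body = json.dumps({"deviceId": device_id, "measurement": measurement}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_config(response_body: bytes) -> dict:
    """Read configuration returned in the response, if any (hypothetical schema)."""
    document = json.loads(response_body.decode("utf-8"))
    return document.get("config", {})


# The URL below is a placeholder, not a real endpoint.
request = build_request("https://example.com/telemetry", "pump-42", {"temp": 21.5})
print(request.get_method(), request.full_url)
print(extract_config(b'{"ack": true, "config": {"interval": 30}}'))
```

To send the request, the device would pass it to urllib.request.urlopen with a timeout and hand the response body to extract_config.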

Receiving static data on the device

The device or the device fleet may need to read static, or semi-static, data from the cloud, such as configuration settings or a software update. If the application already implements the MQTT protocol, the AWS IoT Device Shadow service is an efficient way to read relatively small static data, such as the configuration. For more information, see AWS IoT Core message broker and protocol limits and quotas.
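A device shadow keeps a desired state and a reported state, and the difference between them tells the device what to change. The following is a simplified local illustration of that comparison, not the service's implementation: the shadow_delta function and the configuration keys are assumptions, and in practice the Device Shadow service computes the difference and delivers it to the device.

```python
def shadow_delta(desired: dict, reported: dict) -> dict:
    """Return the entries of desired that differ from reported (simplified sketch)."""
    delta = {}
    for key, value in desired.items():
        if isinstance(value, dict) and isinstance(reported.get(key), dict):
            # Compare nested objects recursively and keep only real differences.
            nested = shadow_delta(value, reported[key])
            if nested:
                delta[key] = nested
        elif key not in reported or reported[key] != value:
            delta[key] = value
    return delta


# Hypothetical configuration: the cloud wants a 30-second interval and the LED on.
desired = {"interval": 30, "led": {"color": "green", "on": True}}
reported = {"interval": 60, "led": {"color": "green", "on": False}}
print(shadow_delta(desired, reported))
```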

Architecture diagram showing a device application reading data from Amazon S3 in AWS Cloud, with local data storage and identification components.

Reading from Amazon S3 bucket

For larger files, which might include a version number or status to indicate firmware updates, you can download the data directly from Amazon Simple Storage Service (Amazon S3).
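Before and after downloading a larger file, the device typically decides whether the update is needed and whether the download is intact. The sketch below assumes dotted numeric version strings and a SHA-256 checksum published alongside the file. Both are illustrative assumptions, as are the function names; the download itself is left out.

```python
import hashlib


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '1.10.0' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def needs_update(installed: str, available: str) -> bool:
    """True when the available version is newer than the installed one."""
    return parse_version(available) > parse_version(installed)


def verify_download(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against a checksum (assumed to be published with the file)."""
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()


print(needs_update("1.9.0", "1.10.0"))
```

Comparing versions as integer tuples avoids the string comparison pitfall where "1.9.0" would sort after "1.10.0".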

Two sequence diagrams showing push-based and pull-based data synchronization patterns between IoT devices and AWS cloud using Amazon S3.

Actively receiving data from S3: bidirectional vs unidirectional protocols

IoT projects without devices (a rare use case)

Working directly on IoT devices isn’t always feasible. Even though your goal may be to build an IoT cloud application that manages multiple devices, some constraints can render the situation more complex. For example, when:

Existing devices in the field can’t be updated or updating them requires too much development effort.
The current device communication features should not be changed as existing systems depend on them.
Third-party devices may be involved. This could include devices with proprietary control systems, proprietary communication protocols, or closed systems that your team can’t modify.

If your goal is to evaluate feasibility and get an overview of the system, you should develop an IoT cloud infrastructure and application prototype. This can leverage existing device telemetry data and control functionality. You might consider two different strategies for this approach:

Implement a cloud-to-cloud communication solution.
Develop a wrapper on the existing device APIs.

AWS Device Management Cloud architecture diagram showing three-tier integration between on-premise devices, Device Management Cloud with Control Application and Data storage, and AWS Cloud services connected through VPC peering

No device development: cloud-to-cloud communication

Using cloud-to-cloud communication has the benefit of isolating the existing solution from the new development. It also allows you to use a different application protocol to transfer device telemetry data and to control the data. You might leverage an Amazon Virtual Private Cloud (Amazon VPC) to establish a virtual network between existing and new applications. This communication method can be very efficient, for example, when receiving measurements and states for a group of devices. The drawback is that an Amazon VPC requires additional effort to manage the devices. If the devices are third-party, it requires co-development effort, which can be a blocker.

No device development: leverage existing communications

A second option is to develop a wrapper and leverage the APIs already available from the external system by using Amazon API Gateway. A typical use case is communicating with a REST or WebSocket API. For third-party APIs, you should consider security protections that limit the number of requests per second, minute, or day. Be aware of these constraints because they can limit your scalability.
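One way for a wrapper to stay within a third-party request limit is a client-side token bucket. This is a minimal sketch and only one of several possible techniques: the TokenBucket class, the rate of 2 requests per second, and the burst capacity of 2 are illustrative assumptions, and the real limits come from the API provider.

```python
import time
from typing import Callable


class TokenBucket:
    """Client-side limiter: allows short bursts up to capacity, refilled at a fixed rate."""

    def __init__(self, rate_per_second: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return False when the caller should wait."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limit: 2 requests per second with a burst of 2.
bucket = TokenBucket(rate_per_second=2, capacity=2)
if bucket.try_acquire():
    print("request allowed")
```

The wrapper would call try_acquire before each request to the external API and queue or delay the request when it returns False.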

Conclusion

The strengths of IoT include communication, data storage, and the ability to make decisions at the edge. One approach to IoT projects is to start from the device, the thing, and then design based on the device capabilities. In this blog post, we explored a different approach that is based on a data-centric model. Focusing on data first helps you design more cost-effective solutions. You can obtain this data using different communication protocols and provide a solution that aligns with your project objectives and constraints.

References

[1] https://aws.amazon.com/what-is/mqtt/
[2] https://docs.aws.amazon.com/pdfs/whitepapers/latest/designing-mqtt-topics-aws-iot-core/designing-mqtt-topics-aws-iot-core.pdf
[3] https://aws.amazon.com/blogs/iot/best-practices-for-ingesting-data-from-devices-using-aws-iot-core-and-or-amazon-kinesis/
[4] https://docs.aws.amazon.com/iot/latest/developerguide/iot-device-shadows.html
[5] https://docs.aws.amazon.com/general/latest/gr/iot-core.html#message-broker-limits

About the authors

Alfonso Torres Soto is an Industrial Engineer (MS) and Project Manager (PMP). He works as a Solutions Architect at AWS, helping customers bring their ideas to reality. He is passionate about both technology and philosophy.

The Internet of Things on AWS – Official Blog
