The Internet of Things on AWS – Official Blog

How SysAid manages agents behind restricted firewall rules with AWS IoT Core

This blog post will outline how SysAid uses AWS IoT Core and the MQTT over WebSocket Secure communication protocol at scale for managing remote software agents and overcoming restricted firewall rules securely.

SysAid is a global software as a service (SaaS) automation company that provides IT Service Management (ITSM) and Asset Management solutions to thousands of customers. SysAid provides software for IT teams to control all aspects of service management.

Introduction

SysAid software agents are installed at the customer’s site. Agents collect telemetry and status from IT resources such as computers and printers and relay it to the SysAid SaaS service running on AWS. Occasionally, the SaaS backend needs to reach the agents, instructing them to take specific actions, such as configuration changes.

To enable communication, the simplest solution is for the agents to initiate a connection with the cloud on specific allowed IP addresses and ports and periodically poll for the latest instructions. However, this approach generates a lot of network traffic.

As an example, if an agent polls once per second, that’s 86,400 requests a day. Over thousands of customers, that can easily come to billions of requests a month or more. Additionally, since over 95% of the time the server has no message waiting for the agent, most of this traffic is redundant and unnecessary.
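The arithmetic above can be checked with a short calculation. This is only an illustrative sketch: the fleet size of 1,000 agents and the 30-day month are assumptions made for the example, not SysAid figures, and 95% is the lower bound quoted above for polls that find no message.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # one poll per second means one request per second


def monthly_poll_requests(agents: int, polls_per_second: int = 1, days: int = 30) -> int:
    """Total polling requests a fleet sends over the given number of days."""
    return agents * polls_per_second * SECONDS_PER_DAY * days


def empty_poll_requests(total: int, empty_percent: int = 95) -> int:
    """Requests that come back with no message waiting for the agent."""
    return total * empty_percent // 100


per_agent_per_day = monthly_poll_requests(agents=1, days=1)
fleet_per_month = monthly_poll_requests(agents=1_000)  # hypothetical fleet size

print(per_agent_per_day)                     # 86400
print(fleet_per_month)                       # 2592000000
print(empty_poll_requests(fleet_per_month))  # 2462400000
```

Even at this modest assumed fleet size, polling produces about 2.6 billion requests a month, of which roughly 2.5 billion carry no useful payload.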

Furthermore, corporate firewalls often restrict inbound and outbound traffic to a small range of TCP ports. This is a security measure that limits the attack surface for possible cyber-attacks. Standard ports, such as 443 for HTTPS, are left open, but ports used by less common protocols, such as 8883 for MQTT, may be intentionally blocked. If you build IoT devices that will ultimately run in IT environments that you do not control, this can cause serious headaches: you have to negotiate separately with each customer's IT department to open port 8883 in their firewall so that your devices can connect to your IoT application running on AWS IoT Core.

Although AWS IoT Core supports MQTT with TLS client authentication on port 443, some SysAid customers only allow outgoing traffic to specific IP addresses. Because AWS IoT Core endpoints resolve to IP addresses that change over time, agents behind such firewalls would not be able to connect, so SysAid needed a different solution.

The following principles were critical for the solution:

  • Reduce agents’ traffic by avoiding empty poll responses.
  • Support a large scale of tens of thousands of agents and billions of messages.
  • Encrypt traffic in transit.
  • Ability to recover from disconnects.
  • Ability to authenticate and authorize agents, allowing them to receive only the messages intended specifically for them.
  • Be fully managed. Avoid undifferentiated heavy lifting of managing infrastructure.

Solution overview

SysAid chose AWS IoT Core for its solution as it allows secured connectivity with any number of devices to the cloud and to other devices without requiring the provisioning or management of servers.

With AWS IoT Core, SysAid can manage authorization of devices and provision unique identities at scale. Furthermore, its Message Broker feature enables reliable and fast MQTT communication across SysAid’s agent fleets.

Using MQTT’s publisher/subscriber model allows SysAid to avoid redundant polling. Instead, SysAid’s servers send messages to an agent only when needed, drastically reducing traffic.

By using a topic structure like:

customer-id/device-id/message-subject

messages can be sent directly to the agent on computer-a in account customer-b. So, notifying the agent of a configuration change can be done by sending a message to the topic:

customer-b/computer-a/configuration-changes

The agent on computer-a can receive all messages directed to it by subscribing to a topic filter like:

customer-b/computer-a/#

The topic filter wildcard lets the agent subscribe to multiple configuration topics at once. This should be handled with care to avoid overloading the agent if it cannot process all incoming messages.
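The topic structure and wildcard filter described above can be sketched with a small matcher that follows standard MQTT filter semantics (`+` for a single level, `#` for all remaining levels). The function names are hypothetical and only illustrate how the broker decides which subscribers receive a message; this is not SysAid's code, and it ignores special cases such as topics beginning with `$`.

```python
def agent_topic_filter(customer_id: str, device_id: str) -> str:
    """Topic filter an agent would subscribe to in order to receive all of its messages."""
    return f"{customer_id}/{device_id}/#"


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic matches a topic filter.

    '+' matches exactly one level; '#' matches all remaining levels
    (including none) and is only valid as the last level of the filter.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")

    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: must be the final level of the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level differs
    # No '#' seen: the topic must have exactly as many levels as the filter.
    return len(filter_levels) == len(topic_levels)


own_filter = agent_topic_filter("customer-b", "computer-a")
print(topic_matches(own_filter, "customer-b/computer-a/configuration-changes"))  # True
print(topic_matches(own_filter, "customer-b/computer-z/configuration-changes"))  # False
```

Because the device ID is a fixed level in the topic, a message published for computer-a never matches another agent's filter.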

However, devices are not guaranteed to be connected at all times. Sometimes backend services will send configuration changes to the device topic while the device is offline and unable to receive them.

AWS IoT Core has two features which help overcome device connectivity issues:

  • MQTT retained messages for AWS IoT Core – This feature allows you to store a single message per MQTT topic for delivery to any current and future topic subscribers.
  • AWS IoT Device Shadow service – Shadows provide a reliable data store for devices, apps, and other cloud services to share data. They enable devices, apps, and other cloud services to connect and disconnect without losing a device’s state.

Using retained messages, SysAid agents are able to receive their initial configuration message when re-subscribing to a topic after disconnection.
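As a rough sketch of how a backend service might publish such a retained configuration message, the example below uses the `publish` operation of the boto3 `iot-data` client with `retain=True`. The helper names and the `poll_interval_s` setting are hypothetical, and the post does not say which SDK SysAid uses. Note that the publishing identity typically also needs permission to publish retained messages.

```python
import json


def retained_publish_args(customer_id: str, device_id: str, config: dict) -> dict:
    """Build keyword arguments for a retained publish to the agent's configuration topic."""
    return {
        "topic": f"{customer_id}/{device_id}/configuration-changes",
        "qos": 1,
        "retain": True,  # the broker keeps this as the single retained message for the topic
        "payload": json.dumps(config).encode("utf-8"),
    }


def publish_retained_config(customer_id: str, device_id: str, config: dict) -> None:
    """Send the message through the AWS IoT data plane (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without the SDK installed

    client = boto3.client("iot-data")
    client.publish(**retained_publish_args(customer_id, device_id, config))


# Hypothetical configuration payload, used only for illustration.
args = retained_publish_args("customer-b", "computer-a", {"poll_interval_s": 300})
print(args["topic"])  # customer-b/computer-a/configuration-changes
```

An agent that subscribes to `customer-b/computer-a/#` after reconnecting would then receive the stored message even though it was published while the agent was offline.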

And how does this improve security?

The security model is simple. AWS IoT provides a thing registry that helps you manage things. A thing is a representation of a specific device or logical entity. Every device connected to AWS IoT has a thing representation on the thing registry.

For a device to be able to authenticate using an X.509 certificate, the certificate must be registered and associated with an IoT policy.

The IoT policy sets the authorizations granted to the device. We can, for example, limit the device to specific actions such as: connect, publish, and subscribe on specific topics.

[Figure: relationship between thing, certificate, and policy]

Below is an example policy allowing a device to publish and subscribe only to its own topics by using AWS IoT Core policy variables:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "iot:Connect",
      "Effect": "Allow",
      "Resource": "arn:aws:iot:<region>:<account>:client/${iot:Connection.Thing.ThingName}"
    },
    {
      "Action": [
        "iot:Receive",
        "iot:Publish"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:iot:<region>:<account>:topic/customer-a/${iot:Connection.Thing.ThingName}/*"
      ]
    },
    {
      "Action": [
        "iot:Subscribe"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:iot:<region>:<account>:topicfilter/customer-a/${iot:Connection.Thing.ThingName}/*"
      ]
    }
  ]
}

Notice how this policy uses a thing policy variable to simplify authorization. Instead of having to generate a policy for each thing, we can have a single policy which takes the thing name as a variable and restricts that thing to its own topics.
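To make the effect of the policy variable concrete, the sketch below builds the same policy document in Python and then substitutes a thing name locally to show what a single agent is effectively allowed to do. The function names, region, and account ID are placeholders for illustration. In practice AWS IoT Core resolves the variable itself when it evaluates the policy, so the local substitution is purely explanatory.

```python
import json

# Policy variable that AWS IoT Core replaces with the connecting thing's name.
THING_NAME_VAR = "${iot:Connection.Thing.ThingName}"


def build_agent_policy(region: str, account: str, customer_id: str) -> dict:
    """Build one policy document that can be shared by every agent of a customer."""
    prefix = f"arn:aws:iot:{region}:{account}"
    agent_topics = f"{customer_id}/{THING_NAME_VAR}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "iot:Connect",
             "Resource": f"{prefix}:client/{THING_NAME_VAR}"},
            {"Effect": "Allow", "Action": ["iot:Receive", "iot:Publish"],
             "Resource": [f"{prefix}:topic/{agent_topics}"]},
            {"Effect": "Allow", "Action": ["iot:Subscribe"],
             "Resource": [f"{prefix}:topicfilter/{agent_topics}"]},
        ],
    }


def resolve_for_thing(policy: dict, thing_name: str) -> dict:
    """Substitute the variable locally to illustrate what one thing is granted."""
    return json.loads(json.dumps(policy).replace(THING_NAME_VAR, thing_name))


policy = build_agent_policy("us-east-1", "123456789012", "customer-a")  # placeholder values
resolved = resolve_for_thing(policy, "computer-a")
print(resolved["Statement"][2]["Resource"][0])
# arn:aws:iot:us-east-1:123456789012:topicfilter/customer-a/computer-a/*
```

The shared document stays the same for every agent; only the resolved resources differ per thing.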

With security and scale concerns addressed, SysAid still had to overcome the challenge of firewalls that restrict outbound traffic to specific IP addresses and ports.

This is where the breadth and depth of AWS IoT Core come in handy. AWS IoT Core supports a number of protocols and authentication methods, allowing flexibility when connecting edge devices to AWS.

Using the MQTT over WebSocket protocol, the agent relays messages to web proxy servers running in SysAid’s VPC. The proxies are attached to static Elastic IP addresses and listen on port 443.

In turn, the web proxy forwards the traffic to AWS IoT Core endpoints.

The MQTT over WebSocket protocol supports two authentication methods: AWS Signature Version 4 (SigV4) and custom authentication.

Using SigV4 requires the agent to connect to AWS IoT Core using SigV4 credentials rather than the device certificate. To acquire SigV4 credentials, the agent uses the AWS IoT Core credential provider, which allows using the built-in X.509 certificate as the unique device identity to authenticate AWS requests. This eliminates the need to store an access key ID and a secret access key on your device.
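A minimal sketch of this credential exchange is shown below, using only the Python standard library. It assumes the documented credential provider request shape: an HTTPS request authenticated with the device certificate, sent to `/role-aliases/<role-alias>/credentials` with an `x-amzn-iot-thingname` header. The function names, file paths, endpoint, and role alias are hypothetical; the post does not show SysAid's agent code.

```python
import json
import ssl
import urllib.request


def credentials_url(credential_endpoint: str, role_alias: str) -> str:
    """URL of the credential provider for a given role alias."""
    return f"https://{credential_endpoint}/role-aliases/{role_alias}/credentials"


def parse_credentials(body: bytes) -> dict:
    """Extract the temporary SigV4 credentials from the provider's JSON response."""
    creds = json.loads(body)["credentials"]
    return {
        "access_key_id": creds["accessKeyId"],
        "secret_access_key": creds["secretAccessKey"],
        "session_token": creds["sessionToken"],
        "expiration": creds["expiration"],
    }


def fetch_credentials(credential_endpoint: str, role_alias: str, thing_name: str,
                      cert_file: str, key_file: str, ca_file: str) -> dict:
    """Request temporary credentials, authenticating with the device's X.509 certificate."""
    context = ssl.create_default_context(cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)  # mutual TLS
    request = urllib.request.Request(
        credentials_url(credential_endpoint, role_alias),
        headers={"x-amzn-iot-thingname": thing_name},
    )
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return parse_credentials(response.read())


# Offline illustration with a made-up response body; no network call is made here.
sample = b'{"credentials": {"accessKeyId": "AKIDEXAMPLE", "secretAccessKey": "secret", "sessionToken": "token", "expiration": "2030-01-01T00:00:00Z"}}'
print(parse_credentials(sample)["access_key_id"])  # AKIDEXAMPLE
```

The returned credentials are temporary, so an agent would need to repeat this call before the `expiration` time is reached.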

[Figure: architecture diagram]

Request flow:

  1. The agent resolves the web proxy DNS name to a healthy static IP address
  2. The agent acquires SigV4 credentials
  3. The agent signs a request and sends it to the web proxy
  4. The web proxy forwards the request to an AWS IoT Core endpoint
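The signing step of the flow above can be sketched as follows. The example builds a SigV4 presigned `wss://` URL for the `/mqtt` path, modeled on the presigned-URL approach described in AWS documentation for MQTT over WebSocket. The function name and all input values are placeholders, and how the connection is routed through the web proxy is not shown because the post does not describe it.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote

SERVICE = "iotdevicegateway"  # signing name for AWS IoT MQTT over WebSocket
ALGORITHM = "AWS4-HMAC-SHA256"


def _sign(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presigned_mqtt_url(host, region, access_key, secret_key, session_token=None, now=None):
    """Return a SigV4 presigned WebSocket URL for the given AWS IoT endpoint host."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    scope = f"{date_stamp}/{region}/{SERVICE}/aws4_request"
    credential = quote(f"{access_key}/{scope}", safe="")

    query = (f"X-Amz-Algorithm={ALGORITHM}&X-Amz-Credential={credential}"
             f"&X-Amz-Date={amz_date}&X-Amz-SignedHeaders=host")
    empty_payload_hash = hashlib.sha256(b"").hexdigest()
    canonical_request = "\n".join(
        ["GET", "/mqtt", query, f"host:{host}\n", "host", empty_payload_hash])
    string_to_sign = "\n".join(
        [ALGORITHM, amz_date, scope,
         hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, SERVICE, "aws4_request"):
        key = _sign(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    url = f"wss://{host}/mqtt?{query}&X-Amz-Signature={signature}"
    if session_token:
        # The session token is appended after signing and is not part of the signature.
        url += "&X-Amz-Security-Token=" + quote(session_token, safe="")
    return url


fixed_time = datetime.datetime(2024, 1, 1, tzinfo=datetime.timezone.utc)
url = presigned_mqtt_url("example-ats.iot.us-east-1.amazonaws.com", "us-east-1",
                         "AKIDEXAMPLE", "secret", session_token="token/with+chars",
                         now=fixed_time)
print(url.split("?")[0])  # wss://example-ats.iot.us-east-1.amazonaws.com/mqtt
```

The temporary credentials returned by the credential provider include a session token, which is why the optional token parameter matters for this flow.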

DNS for the web proxies is managed using Amazon Route 53 DNS failover. In simple configurations, you create a group of records that all have the same name and type, such as a group of weighted records with a type of A for example.com. You then configure Route 53 to check the health of the corresponding resources. Route 53 responds to DNS queries based on the health of your resources, so agents are directed only to healthy proxies.

Conclusion

In this post, we gave an overview of how SysAid uses AWS IoT MQTT over WebSocket Secure to manage its large fleet of software agents behind restricted firewall rules. We showed that an AWS IoT thing can be thought of as much more than a physical device.

About the authors

Doron Bleiberg is a Senior Startup Solutions Architect with Amazon Web Services. He works with AWS customers to help them architect secure, resilient, scalable and high-performance applications in the cloud.

Jonathan Yom-Tov is a Senior Architect at SysAid Technologies Ltd, specializing in big data, data mining and cloud.