The Internet of Things on AWS – Official Blog
How to manage IoT device certificate rotation using AWS IoT
Introduction
The Internet of Things (IoT) is transforming business operations and customer experiences across a variety of industries. This unlimited opportunity enables business transformation, but if not implemented correctly, it also brings security, risk, and privacy concerns, compromising your data and brand. In industrial facilities, OT (Operational Technology) environments are leveraging more IT solutions to improve production output and efficiencies. As digital transformation initiatives continue to accelerate IT/OT convergence, they also blend risks between the IT and OT environments. With the growing number of connected devices in consumer, enterprise and industrial applications, and the data generated, the potential for security events raises questions about how to address security risks posed by IoT devices and device communication to and from the cloud.
To protect customers, devices, and companies, every IoT solution should start and end with security. The best IoT security solution offers multi-layered protection from the edge to the cloud, securing your IoT devices, connectivity, and data. Provisioning IoT devices and systems with unique identities and credentials is one of the core building blocks of any IoT solution. The AWS IoT security model requires that each connected device have a credential to interact with AWS IoT and that all traffic to and from AWS IoT be sent securely over Transport Layer Security (TLS).
Customers are responsible for managing device credentials (X.509 certificates, AWS credentials, Amazon Cognito identities, federated identities, or custom authentication tokens) and policies in AWS IoT. X.509 certificates provide AWS IoT with the ability to authenticate client and device connections. AWS provides several different ways to provision a device and install unique client certificates on it. In addition to strong identity for IoT devices, AWS recommends the use of hardware protected modules such as Trusted Platform Modules (TPMs) or hardware security modules (HSMs) for storing credentials and performing authentication operations. X.509 certificates provide a strong, standardized mechanism with renewable, password-less authentication. These certificates must be provisioned from a trusted public key infrastructure (PKI) and have a renewal lifetime appropriate for the security posture of their business use. Their renewal must be automatic (often based on device health) to minimize any potential access disruption due to manual rotation.
If a TPM or HSM is not available on the device, consider rotating credentials more often, depending on the business use case. Any access granted to a device should be based on its strong identity. Credential audit and monitoring, rotation, and revocation must be supported to enable immediate removal of device access (for example, in response to a compromise). This post provides prescriptive guidance on addressing security concerns related to the audit and rotation of device credentials on IoT devices and edge gateways that connect to AWS IoT.
Solution overview
In this post, we describe the certificate rotation process based on the AWS managed Certificate Authority. We illustrate the overall solution in the following diagram.
The sequence diagram presents all steps involved in the certificate rotation process. Subsequent diagrams will use the step numbers from this sequence to illustrate parts of this process.
Solution walkthrough
1. Identify devices with certificates that are about to expire
The proposed solution for IoT Thing certificate rotation leverages the AWS IoT Device Defender Scheduled Audit functionality and best practices of serverless system design.
AWS IoT Device Defender Audit audits your device-related resources (such as X.509 certificates, IoT policies, and client IDs) against AWS IoT security best practices (for example, the principle of least privilege or a unique identity per device). We use one of the predefined audit checks, DEVICE_CERTIFICATE_EXPIRING_CHECK. This check verifies whether a device certificate is expiring within 30 days or has already expired.
You can enable automation by configuring the Audit notifications to send Amazon SNS alerts and trigger automated actions.
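As a minimal sketch of this configuration step, the function below enables the expiry check, points audit notifications at an SNS topic, and schedules a daily audit. It expects a boto3 `iot` client to be passed in. The function name, the audit name, and all ARNs are illustrative assumptions, not values from this post.

```python
CHECK_NAME = "DEVICE_CERTIFICATE_EXPIRING_CHECK"


def enable_expiry_audit(iot, audit_role_arn, sns_topic_arn, sns_role_arn,
                        audit_name="DailyCertificateExpiryAudit"):
    """Enable the expiry check, SNS notifications, and a daily scheduled audit.

    `iot` is expected to behave like boto3.client("iot").
    `audit_name` and every ARN are placeholders chosen for this sketch.
    """
    iot.update_account_audit_configuration(
        roleArn=audit_role_arn,
        auditNotificationTargetConfigurations={
            "SNS": {
                "targetArn": sns_topic_arn,
                "roleArn": sns_role_arn,
                "enabled": True,
            }
        },
        auditCheckConfigurations={CHECK_NAME: {"enabled": True}},
    )
    iot.create_scheduled_audit(
        frequency="DAILY",
        targetCheckNames=[CHECK_NAME],
        scheduledAuditName=audit_name,
    )
```

With real credentials, you would call it as `enable_expiry_audit(boto3.client("iot"), ...)`. A daily frequency matches the "next day" retry behavior described later in this post, but other frequencies may suit your fleet better.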
An Amazon SNS subscription with a Lambda endpoint automatically triggers a Lambda function when a new message arrives.
The triggered Lambda function receives an event that includes the following attributes:
{
…
"taskId": "e843de58c4f7536021030936fb83d04a",
"nonCompliantChecksCount": 1,
"checkName": "DEVICE_CERTIFICATE_EXPIRING_CHECK"
…
}
Using taskId, the Lambda function queries AWS IoT Core to list the identifiers of the expiring certificates, then finds the client IDs associated with the obtained principals (certificates).
At this stage, the Lambda function has gathered all the information required to start the certificate rotation process.
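The lookup described above could be sketched roughly as follows. The code reads `taskId` from the SNS-wrapped message, pages through the audit findings for the expiry check, and resolves each certificate to its attached Things. It assumes that Thing names double as MQTT client IDs, which this post does not state explicitly. The function names are illustrative, and `iot` stands for a boto3 `iot` client.

```python
import json

CHECK_NAME = "DEVICE_CERTIFICATE_EXPIRING_CHECK"


def extract_task_id(sns_event):
    """Read taskId from the audit notification wrapped in an SNS Lambda event."""
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    return message["taskId"]


def expiring_certificate_ids(iot, task_id):
    """Collect certificate IDs from all pages of findings for the expiry check."""
    cert_ids, token = [], None
    while True:
        kwargs = {"taskId": task_id, "checkName": CHECK_NAME}
        if token:
            kwargs["nextToken"] = token
        page = iot.list_audit_findings(**kwargs)
        for finding in page.get("findings", []):
            identifier = finding["nonCompliantResource"]["resourceIdentifier"]
            if "deviceCertificateId" in identifier:
                cert_ids.append(identifier["deviceCertificateId"])
        token = page.get("nextToken")
        if not token:
            return cert_ids


def devices_to_rotate(iot, sns_event):
    """Map each expiring certificate ID to the Things it is attached to.

    Assumption: Thing names are used as MQTT client IDs.
    """
    result = {}
    for cert_id in expiring_certificate_ids(iot, extract_task_id(sns_event)):
        description = iot.describe_certificate(certificateId=cert_id)
        arn = description["certificateDescription"]["certificateArn"]
        result[cert_id] = iot.list_principal_things(principal=arn).get("things", [])
    return result
```

Only `taskId` is read from the notification, so the sketch does not depend on the attributes elided from the event shown above.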
2. Device generates a new Certificate Signing Request (CSR)
Lambda sends an MQTT message to the device's management topic:
Topic: management/topic/{clientId}/csr_req
Message: {}
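A minimal sketch of this publish step, assuming a boto3 `iot-data` client is passed in. The helper names and the choice of QoS 1 are assumptions; the topic layout and empty payload follow the message shown above.

```python
import json


def csr_request_topic(client_id):
    """Build the per-device management topic for a CSR request."""
    return f"management/topic/{client_id}/csr_req"


def request_csr(iot_data, client_id):
    """Ask one device to start rotation by publishing an empty JSON object.

    `iot_data` is expected to behave like boto3.client("iot-data").
    QoS 1 is an assumption made for this sketch.
    """
    iot_data.publish(
        topic=csr_request_topic(client_id),
        qos=1,
        payload=json.dumps({}),
    )
```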
The device receives the CSR_REQ message and starts the certificate rotation routine.
Rotating the private key used by the IoT device is optional but recommended, and should be applied when appropriate for the business use case.
The device generates a new Certificate Signing Request (CSR) and sends it as the payload of an MQTT message:
Topic: management/topic/{clientId}/csr_res
Message: {'CSR': CSR}
AWS IoT Core uses Rules to forward MQTT messages to the appropriate Lambda function.
The built-in clientid() function returns the ID of the MQTT client that sent the message.
This is important because we use the single-level wildcard ('+') to match any client ID in our Rule. This way, a single IoT Rule can manage certificate rotation for every connected device in our fleet.
Rule query statement:
SELECT clientid() as clientid, * FROM 'management/topic/+/csr_res'
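To illustrate how the single-level wildcard in the FROM clause behaves, here is a small standalone sketch of '+' matching. It is a simplified model for illustration only, not the AWS IoT rules engine. It does not handle the multi-level '#' wildcard, and the function name is hypothetical.

```python
def matches_single_level(topic_filter, topic):
    """Return True if `topic` matches `topic_filter`, where '+' stands for
    exactly one topic level. The '#' wildcard is not handled in this sketch."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    if len(filter_levels) != len(topic_levels):
        return False
    return all(f == "+" or f == t for f, t in zip(filter_levels, topic_levels))


RULE_FILTER = "management/topic/+/csr_res"

# One filter covers every device's response topic ...
print(matches_single_level(RULE_FILTER, "management/topic/device-1/csr_res"))
# ... but '+' never spans more than one level.
print(matches_single_level(RULE_FILTER, "management/topic/a/b/csr_res"))
```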
3. New device certificate is created and sent to the device
Lambda calls AWS IoT Core to create a new certificate based on the received CSR and attaches the same IoT policy that was used by the expiring certificate. The process described here relies on the Amazon Root certificate authority (CA) to sign the certificates used by devices. You can also use a customer-owned certificate authority to sign certificates; in that case, the certificate rotation process must be implemented on the customer’s side, and Device Defender Audit can be used to trigger it.
Finally, Lambda returns the certificate to the device as the payload of an MQTT message.
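One way this step might look in a Lambda function is sketched below, with boto3 `iot` and `iot-data` clients passed in. Several details are assumptions rather than part of this post: the Thing name is taken to equal the MQTT client ID, the reply topic suffix `crt` and the payload key `certificatePem` are hypothetical, and QoS 1 is an arbitrary choice.

```python
import json


def issue_rotated_certificate(iot, iot_data, rule_event, reply_suffix="crt"):
    """Create a certificate from the device's CSR and send it back over MQTT.

    `rule_event` is the Rule output: the csr_res payload plus `clientid`.
    `reply_suffix` and the reply payload key are hypothetical names.
    """
    client_id = rule_event["clientid"]  # added by clientid() in the Rule query
    csr_pem = rule_event["CSR"]         # key used in the csr_res message

    # Assumption: the Thing name equals the MQTT client ID.
    old_arns = iot.list_thing_principals(thingName=client_id)["principals"]

    new_cert = iot.create_certificate_from_csr(
        certificateSigningRequest=csr_pem, setAsActive=True
    )
    new_arn = new_cert["certificateArn"]

    # Copy every policy attached to the old certificate(s) to the new one.
    policy_names = set()
    for old_arn in old_arns:
        attached = iot.list_attached_policies(target=old_arn)
        for policy in attached.get("policies", []):
            policy_names.add(policy["policyName"])
    for name in sorted(policy_names):
        iot.attach_policy(policyName=name, target=new_arn)
    iot.attach_thing_principal(thingName=client_id, principal=new_arn)

    iot_data.publish(
        topic=f"management/topic/{client_id}/{reply_suffix}",
        qos=1,
        payload=json.dumps({"certificatePem": new_cert["certificatePem"]}),
    )
    return new_cert["certificateId"]
```

The old certificate stays active at this point, so the device can still fall back to it if the new one fails.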
The device stores the new certificate and establishes a new MQTT session using the rotated credentials.
If the connection using the new certificate succeeds, the device ends the rotation process by sending the following MQTT message:
Topic: management/topic/{clientId}/crt_ack
Message: {}
If any issue occurs, the device connects to AWS IoT Core using its existing credentials and reports the error:
Topic: management/topic/{clientId}/crt_err
Message: {'error': error_msg}
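On the device side, the choice between the acknowledgement and error messages could be sketched as below. The `connect_with_new_certificate` callable is a hypothetical stand-in for whatever MQTT client the device uses. The function only decides which topic and payload to publish; it does not publish anything itself.

```python
def rotation_result_message(client_id, connect_with_new_certificate):
    """Try the new certificate and return the (topic, payload) to publish.

    `connect_with_new_certificate` is a hypothetical callable that raises
    an exception if the MQTT session cannot be established.
    """
    try:
        connect_with_new_certificate()
    except Exception as exc:
        # Report the failure over a session that uses the existing credentials.
        return f"management/topic/{client_id}/crt_err", {"error": str(exc)}
    return f"management/topic/{client_id}/crt_ack", {}
```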
An IoT Rule executes a Lambda function for automated error resolution and sends a notification to the support team using Amazon Simple Notification Service (SNS).
As the final step of a successful certificate rotation, Lambda deactivates and deletes the old certificate previously used by the IoT device.
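The cleanup might be sketched as follows, given a boto3 `iot` client. A certificate must be inactive and detached from its policies and Things before it can be deleted, so the calls run in that order. The function name is illustrative, and the Thing name is again assumed to be known (for example, equal to the client ID).

```python
def retire_certificate(iot, thing_name, cert_id):
    """Deactivate, detach, and delete the old certificate.

    `iot` is expected to behave like boto3.client("iot").
    """
    description = iot.describe_certificate(certificateId=cert_id)
    arn = description["certificateDescription"]["certificateArn"]

    # 1. Stop the old certificate from authenticating new connections.
    iot.update_certificate(certificateId=cert_id, newStatus="INACTIVE")

    # 2. Detach policies and the Thing so the certificate can be deleted.
    for policy in iot.list_attached_policies(target=arn).get("policies", []):
        iot.detach_policy(policyName=policy["policyName"], target=arn)
    iot.detach_thing_principal(thingName=thing_name, principal=arn)

    # 3. Delete it. Detachment can take a few seconds to propagate, so a
    #    production version may need to retry this call.
    iot.delete_certificate(certificateId=cert_id)
```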
This solution is resilient to disturbances in the certificate rotation process. If the IoT device crashes while receiving the new certificate (or encounters any other issue), it will appear in the Device Defender Audit results the next day, and the process will start from scratch.
The audit check includes devices with certificates expiring within 30 days, so a device can keep operating without any impact on production workloads while support staff investigate a potential rotation issue.
Conclusion
AWS recommends a multilayered security approach to secure IoT solutions, including the use of strong identities, least-privilege access, continuous monitoring of device health and anomalies, secure connections to devices to fix issues, and applying updates to keep devices up to date and healthy. When you use X.509 certificates for digital identity and authentication, you may need to renew the certificate during the lifetime of the device. The length of the certificate validity depends on the device health and business context, and you’ll need a strategy for certificate renewal. Although shorter certificate validity periods require more involvement, AWS IoT makes rotation of device certificates easier to execute and helps you improve your IoT system’s security posture.
To learn more about IoT security best practices, visit The Internet of Things on AWS – Official Blog.
About the authors
Ryan Dsouza is a Principal Solutions Architect for IoT at AWS. Based in New York City, Ryan helps customers design, develop, and operate more secure, scalable, and innovative solutions using the breadth and depth of AWS capabilities to deliver measurable business outcomes. Ryan has over 25 years of experience in digital platforms, smart manufacturing, energy management, building and industrial automation, and OT/IIoT security across a diverse range of industries. Before AWS, Ryan worked for Accenture, SIEMENS, General Electric, IBM, and AECOM, serving customers for their digital transformation initiatives.
Lukasz Malinowski is an IoT Consultant at AWS Professional Services based in Poland. Lukasz has over 15 years of experience in designing, building, and securing distributed systems, ranging from on-premises devices and secure edge gateway servers to native cloud components.