This Guidance shows how machine learning (ML) models can be applied to Internet of Things (IoT) sensor data to predict component or system failures before they happen and recommend appropriate maintenance steps. Aerospace manufacturing, aircraft operations, and other manufacturing and industrial domains use IoT devices to identify patterns in sensor output data and predict the maintenance needed to prevent system failures and downtime. This Guidance helps you use that data to reduce unplanned downtime of manufacturing lines, aircraft, and other systems.
Architecture Diagram

Step 1
Source data is generated by multiple sources. The aircraft generates flight logs, which are transmitted wirelessly through the Aircraft Communications Addressing and Reporting System (ACARS) or recorded through a Quick Access Recorder (QAR). Maintenance, Repair, and Overhaul (MRO) facilities generate maintenance records. Airlines broadcast delay and cancellation notices as flight ops events.
Well-Architected Pillars

The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
-
Operational Excellence
Amazon CloudWatch maintains telemetry on the running system to alert on conditions such as failed AWS Glue extract, transform, and load (ETL) jobs (indicating formatting errors in aircraft data) or error codes returned by API Gateway (indicating configuration problems with the maintenance application or website). Aurora is configured to generate automatic backups of aircraft and prediction data and can rapidly restore those backups.
Automated telemetry and alarms help identify when the system is not meeting desired business outcomes and can help to quickly identify underlying issues before the customer detects or reports them. Errors can be detected and reported both in external customer systems (such as the ACARS system or QAR processing) and in services in the AWS Cloud. Automated database backup and restoration allows for quicker recovery to normal status in the event of a failure or disruption.
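As an illustration of the alarm described above, the following minimal sketch builds the parameters for a CloudWatch alarm on API Gateway server errors. The API name, SNS topic ARN, threshold, and period are hypothetical values chosen for the example, not settings taken from this Guidance.

```python
def api_error_alarm_params(api_name: str, sns_topic_arn: str) -> dict:
    """Build keyword arguments for the CloudWatch PutMetricAlarm operation."""
    return {
        "AlarmName": f"{api_name}-5xx-errors",
        "Namespace": "AWS/ApiGateway",        # namespace used by API Gateway metrics
        "MetricName": "5XXError",             # count of server-side errors
        "Dimensions": [{"Name": "ApiName", "Value": api_name}],
        "Statistic": "Sum",
        "Period": 300,                        # assumed 5-minute evaluation window
        "EvaluationPeriods": 1,
        "Threshold": 1.0,                     # assumed: alarm on any server error
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",   # no traffic is not treated as a failure
        "AlarmActions": [sns_topic_arn],      # notify operators through SNS
    }


# Hypothetical names used only for this example.
params = api_error_alarm_params(
    "maintenance-api",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.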
-
Security
Amazon S3, AWS Glue, and Kinesis Data Streams use Transport Layer Security (TLS) to encrypt all customer data (such as aircraft, flight ops, and maintenance data) ingested to the cloud. Amazon S3 and Aurora, where all customer data is stored, enforce encryption of all data in storage. Customer data is therefore encrypted at all times, whether in transit or at rest. This helps ensure that sensitive flight operations and aircraft repair data is visible only to authorized users.
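One common way to require encrypted transport for an S3 bucket is a bucket policy that denies any request not made over TLS. The sketch below builds such a policy document. The bucket name is hypothetical, and this is an assumed approach rather than the policy shipped with this Guidance.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Return a bucket policy that rejects requests sent without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                # aws:SecureTransport is "false" when the request is not over TLS.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# Hypothetical bucket name used only for this example.
policy = deny_insecure_transport_policy("example-flight-data")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON string could be applied with the S3 `PutBucketPolicy` operation.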
AWS Glue is configured to remove privacy-regulated data from the dataset upon ingestion. API Gateway enforces user access control by requiring an authentication token provided by AWS IAM Identity Center, which manages user credentials and roles. User authentication functions help ensure that user credentials are securely managed and rotated, with users allocated to groups with specific access rights according to job role (such as mechanic, supervisor, data scientist, or admin), following the least privilege principle. Group- and role-based access management helps ensure that user access rights are securely and consistently managed at scale across all organizations.
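The removal of privacy-regulated data at ingestion can be pictured as a simple field filter. The sketch below is plain Python rather than an AWS Glue job, and the field names are hypothetical. In practice, the list of regulated fields depends on your data and the regulations that apply to it.

```python
from typing import Any, Mapping

# Hypothetical field names; substitute the regulated fields in your own data.
PRIVACY_REGULATED_FIELDS = frozenset({"passenger_name", "crew_id", "mechanic_email"})


def drop_regulated_fields(
    record: Mapping[str, Any],
    blocked: frozenset = PRIVACY_REGULATED_FIELDS,
) -> dict:
    """Return a copy of the record without any privacy-regulated fields."""
    return {key: value for key, value in record.items() if key not in blocked}


raw = {"tail_number": "N12345", "engine_egt_c": 612.5, "crew_id": "C-0042"}
clean = drop_regulated_fields(raw)
```

Filtering by an explicit block list keeps the rule easy to audit. An allow list of permitted fields is a stricter alternative when new fields may appear in the source data.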
-
Reliability
Amazon S3 and Aurora provide a high degree of data durability through multi-Availability Zone data replication and the automated creation and restoration of data backups. Data durability helps ensure that all data required to make maintenance predictions is available and can be restored in the event of a failure.
Lambda, AWS Glue, SageMaker, and API Gateway are fully managed services with automated scaling of resources. Loss of an Availability Zone or database replica will not take down the predictive maintenance system; these services will automatically divert requests from failed resources to healthy ones. The managed services provide automated failover without user intervention and without additional cost.
Kinesis Data Streams automatically scales data ingestion and throttles throughput to match downstream processing rates. The autoscaling of compute resources and auto-throttling of data streams help ensure that the system can adapt reliably to traffic bursts related to events such as higher flight volume or uploads of large maintenance record batches.
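When a stream throttles writes, a producer typically retries with increasing delays rather than resending immediately. The following minimal sketch shows exponential backoff with jitter. `ThrottledError` and the `send` callable are hypothetical stand-ins for whatever client and error type your producer uses.

```python
import random
import time
from typing import Callable


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by a stream client."""


def send_with_backoff(
    send: Callable[[], None],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds and return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except ThrottledError:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # The delay ceiling doubles on each attempt; a random delay below
            # that ceiling keeps many producers from retrying in lockstep.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so that the retry logic can be exercised in tests without real delays.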
-
Performance Efficiency
SageMaker and Aurora report utilization metrics to CloudWatch, allowing you to monitor historical utilization of computing resources. CloudWatch alarms can be configured to invoke scale-in or scale-out operations in Aurora and SageMaker to match changing demand. For example, if an alarm signals low utilization of database instances, it could trigger the automatic removal of a database replica, or the operator could select a smaller database instance type.
CloudWatch instrumentation provides real-time visibility into changes in system utilization, allowing deeper insight into whether computing resources are right-sized for the predictive maintenance application. Based on this information, you can adapt computing resources, such as allocating larger or smaller instance types for the SageMaker prediction inference endpoint or an Amazon Redshift data warehouse for maintenance analytics.
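The scale-in and scale-out decision described above can be reduced to a threshold check on recent utilization. The sketch below is a simplified illustration. The 20% and 70% thresholds are assumptions, and a real deployment would normally rely on CloudWatch alarms and the scaling features of the service rather than custom code.

```python
from typing import Sequence


def replica_adjustment(
    cpu_samples: Sequence[float],
    low: float = 20.0,
    high: float = 70.0,
) -> int:
    """Return -1 to remove a replica, +1 to add one, or 0 to hold steady."""
    if not cpu_samples:
        return 0  # no data: make no change
    average = sum(cpu_samples) / len(cpu_samples)
    if average < low:
        return -1  # sustained low utilization: scale in
    if average > high:
        return 1   # sustained high utilization: scale out
    return 0
```

For example, `replica_adjustment([10, 12, 8])` returns `-1`, signaling that a replica could be removed.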
-
Cost Optimization
Amazon S3 provides automated lifecycle management of data, moving infrequently accessed data to lower-cost Amazon S3 Glacier storage tiers. This can significantly reduce the cost of retaining legacy flight and component records that may be outdated but are still relevant for infrequent reports or model training. The automated tiering or retiring of older data reduces storage costs while maintaining a long service history for making accurate maintenance predictions.
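A lifecycle rule of this kind can be expressed as a small configuration document. The sketch below builds one in Python. The prefix and the day counts are hypothetical and should reflect your own retention requirements.

```python
def flight_record_lifecycle(
    prefix: str,
    archive_after_days: int,
    expire_after_days: int,
) -> dict:
    """Build an S3 lifecycle configuration that archives, then expires, records."""
    if expire_after_days <= archive_after_days:
        raise ValueError("expire_after_days must be greater than archive_after_days")
    return {
        "Rules": [
            {
                "ID": "archive-legacy-records",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to a Glacier storage class after the given age.
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                # Delete objects once they pass the retention period.
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical values: archive after 90 days, retire after roughly 10 years.
lifecycle = flight_record_lifecycle("flight-logs/", 90, 3650)
```

With boto3, this dictionary could be supplied as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`.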
Additionally, Lambda and AWS Glue provide serverless computing and data transformation that automatically scale resources up or down to match real-time demand signals; you only pay for the actual computing time used for maintenance predictions. The fully managed, serverless computing resources help to avoid cost waste by automatically scaling resources based on real-time demand. This is important because system utilization will be inherently cyclic: data from flight ops, ACARS, and QAR systems will peak during the daytime or peak travel seasons and wane at night or during off-peak seasons.
-
Sustainability
Aurora and Athena both support compression of underlying data sources. Compression of system data (such as maintenance logs or flight records) significantly reduces the data storage requirements of the predictive maintenance system, reducing the system’s environmental impact.
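To show why compression matters for log-like data, the following sketch compresses a batch of synthetic maintenance records with gzip from the Python standard library. The records are made up for the example, and gzip stands in for whichever compression format your data store uses. The actual savings depend on the data.

```python
import gzip
import json

# Synthetic, repetitive records similar in shape to maintenance log entries.
records = [
    {"tail_number": "N12345", "component": "hydraulic pump",
     "action": "inspected", "seq": i}
    for i in range(500)
]

raw = "\n".join(json.dumps(record) for record in records).encode("utf-8")
compressed = gzip.compress(raw)

# Fraction of the original size that remains after compression.
ratio = len(compressed) / len(raw)

# Compression is lossless: decompressing restores the original bytes.
restored = gzip.decompress(compressed)
```

Repetitive, structured records such as these tend to compress well because the same keys and values recur in every entry.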
Implementation Resources

A detailed guide is provided for you to experiment with and use within your AWS account. It walks through each stage of building the Guidance, including deployment, usage, and cleanup.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.