Industrial Data Fabric (IDF) solutions on AWS help you create the data management architecture that enables scalable, unified, and integrated mechanisms to harness data as an asset. This Guidance helps Manufacturing & Industrial companies mobilize large numbers of industrial data sets by integrating AWS Services with AWS Partner Solutions.
An IDF helps you define and understand the value of transforming manufacturing and industrial operations by applying a proven, governed, data-driven approach. In this Guidance, IDF is established using an integrated stack of AWS Services and AWS Partner Solutions. Element Unify allows IT and OT teams to build rich data context at scale with automated data pipelines. The result is a single federated source of data from which users can establish a single version of the truth, with a governance engine ensuring data integrity. HighByte Intelligence Hub is an edge-native DataOps solution built for industrial data. HighByte Intelligence Hub bridges the OT and IT gap by integrating industrial information across many systems and maintaining these integrations throughout the life of the factory.
1: Delivering Industrial DataOps (IDO) on IDF
This is a high-level architecture for IDF on AWS. It shows all the AWS services available for delivering IDF use cases.
This architecture helps you create an enterprise governed model, ingest near real-time and historical data at scale from edge data sources into the IDF, and interface with applications using REST APIs.
A Governed Data Model is defined in a central location. This could be in an AWS Partner Application or a corporate repository. The centralized model can be populated with metadata from corporate systems such as enterprise resource planning (ERP), piping and instrumentation diagram (P&ID), and computerized maintenance management system (CMMS).
The governed model is then brought into an edge application where additional metadata from systems, such as manufacturing execution system (MES), can be added. Data streams can then be mapped to the imported model. Models can also be created at the plant edge and moved to the centralized model repository.
Data ingestion service selection depends on the source. You can stream data feeds from a variety of sources using AWS DataSync for a file share, Amazon Kinesis, Amazon Managed Streaming for Apache Kafka (Amazon MSK), AWS IoT Core, AWS IoT SiteWise for near real-time data ingestion, or AWS Transfer for SFTP.
Data storage is optimized for the workload, which can include Amazon DynamoDB for key value and document data structures, Amazon Simple Storage Service (Amazon S3) for object storage, Amazon Neptune for graph use cases, Amazon Redshift for data warehousing, Amazon Timestream for time series data, and AWS IoT SiteWise to organize industrial equipment data. Data from AWS IoT SiteWise can be integrated with AWS IoT TwinMaker.
Data egress from the Industrial Data Fabric (IDF) is accomplished with connectors directly to AWS services, or through an API Gateway to supported AWS services. For services that are not supported by API Gateway, an OAuth 2.0 pattern with Amazon Cognito is used to generate AWS temporary tokens.
2: Deployment Architecture
This architecture prescribes the deployment pattern for Industrial DataOps (IDO) using HighByte Intelligence Hub and Element Unify.
This architecture helps you create an enterprise governed model, ingest real-time and historical data at scale from edge data sources into IDF on AWS, and interface with applications using REST APIs.
A Governed Data Model is defined in Element Unify. The centralized model can be populated with metadata from corporate systems such as ERP and CMMS.
The governed model is then brought into HighByte Intelligence Hub, where additional metadata from systems, such as MES, can be added. Data streams can then be mapped to the imported model.
Modeled data is sent to AWS IoT SiteWise. AWS IoT SiteWise models and hierarchies are created to match the source data. Batch data (more than 7 days old) is sent to Amazon S3, where it is merged with AWS IoT SiteWise.
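The split between streaming and batch data described above can be sketched as follows. This is a minimal illustration, not the Guidance's implementation: the function names, the property alias, and the use of "GOOD" quality for every sample are assumptions. The entry dictionary follows the shape accepted by the AWS IoT SiteWise BatchPutAssetPropertyValue API.

```python
from datetime import datetime, timedelta, timezone

# Cutoff taken from the text above: data older than 7 days goes through Amazon S3.
BATCH_CUTOFF = timedelta(days=7)


def route(sample_time, now):
    """Return 's3' for samples older than 7 days, otherwise 'sitewise'."""
    return "s3" if now - sample_time > BATCH_CUTOFF else "sitewise"


def to_sitewise_entry(entry_id, property_alias, value, sample_time):
    """Build one entry in the shape used by BatchPutAssetPropertyValue.

    sample_time must be a timezone-aware datetime.
    """
    whole_seconds = int(sample_time.replace(microsecond=0).timestamp())
    return {
        "entryId": entry_id,
        "propertyAlias": property_alias,  # hypothetical alias, for illustration
        "propertyValues": [
            {
                "value": {"doubleValue": float(value)},
                "timestamp": {
                    "timeInSeconds": whole_seconds,
                    "offsetInNanos": sample_time.microsecond * 1000,
                },
                "quality": "GOOD",  # assumption: every sample is trusted
            }
        ],
    }
```

With a boto3 client, entries built this way would be passed as the `entries` argument of `batch_put_asset_property_value`; samples routed to "s3" would instead be written as files for import.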
Modeled data can also be sent directly from HighByte Intelligence Hub into Amazon S3 on a time-based interval for analytics applications. Data from AWS IoT SiteWise can be integrated with AWS IoT TwinMaker. Amazon Redshift is used for data warehousing.
You can use AWS AI/ML services such as SageMaker to build, train, and deploy ML models.
You can use AWS analytics services such as AWS Glue and Athena for data processing.
AWS WAF provides protection from web exploits, while API Gateway with Amazon Cognito provides REST method support for Amazon S3, AWS IoT SiteWise, and Amazon Redshift.
Your applications can access the AWS IDF data through API Gateway using REST calls. Applications can authenticate with AWS Signature Version 4 credentials, or with OAuth 2.0 when AWS credentials aren’t available. Amazon Managed Grafana can be used to visualize AWS IoT SiteWise and AWS IoT TwinMaker data.
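As one possible illustration of the OAuth 2.0 path, the sketch below builds (but does not send) a client-credentials token request against an Amazon Cognito user pool domain. The Guidance does not say which grant type is used, so the client-credentials flow is an assumption, as are the domain prefix, client ID, secret, and scope shown in the usage note.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(domain_prefix, region, client_id, client_secret, scope):
    """Build a POST request for the Cognito user pool /oauth2/token endpoint."""
    url = f"https://{domain_prefix}.auth.{region}.amazoncognito.com/oauth2/token"
    # Client ID and secret are sent as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

For example, `build_token_request("example-idf", "us-east-1", "my-client-id", "my-secret", "idf/read")` returns a request that could be sent with `urllib.request.urlopen`; the JSON response would carry an `access_token` for the application to present to API Gateway.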
3: HighByte Intelligence Hub Industrial DataOps on AWS
This architecture shows how customers can start by developing asset models at the plant edge, map data sources to the asset model, and move the contextualized data streams to the AWS cloud.
HighByte Intelligence Hub (Intelligence Hub) integrates operational technology (OT) with IT by providing a solution to easily and quickly integrate industrial information across multiple systems and enable OT teams to model, transform, and share plant floor data with IT systems.
HighByte Intelligence Hub consumes both real-time and asset model data from a myriad of edge data sources, including relational databases and AWS IoT Greengrass through standard industrial protocol input connectors. This includes data ingestion from industrial historians, such as Inductive Automation’s Ignition Server and AVEVA’s PI System.
HighByte Intelligence Hub enables you to standardize, organize, and merge your industrial data into a single equipment model. Then, using flows, you can route the asset models to multiple output connectors, each with a different frequency.
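The idea of mapping raw tags onto a single equipment model, and then routing it to outputs at different frequencies, can be sketched in plain Python. This does not use the HighByte Intelligence Hub API or configuration format; the model, tag names, unit conversion, and flow intervals are all hypothetical and only illustrate the concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PumpModel:
    """Hypothetical standardized equipment model (not a HighByte schema)."""
    asset_id: str
    flow_rate_lpm: float
    motor_temp_c: float


# Hypothetical mapping from model attributes to source tag names.
TAG_MAP = {
    "flow_rate_lpm": "PLC1.Pump7.FlowRate",
    "motor_temp_c": "PLC1.Pump7.MotorTempF",
}

# Hypothetical flows: output name -> publish interval in seconds.
FLOWS = {"sitewise": 1, "s3": 60}


def fahrenheit_to_celsius(value_f):
    return (value_f - 32.0) * 5.0 / 9.0


def build_instance(asset_id, raw_tags):
    """Standardize raw tag values (mixed types and units) into one model instance."""
    return PumpModel(
        asset_id=asset_id,
        flow_rate_lpm=float(raw_tags[TAG_MAP["flow_rate_lpm"]]),
        motor_temp_c=round(fahrenheit_to_celsius(float(raw_tags[TAG_MAP["motor_temp_c"]])), 2),
    )


def due_outputs(elapsed_seconds):
    """Return the outputs whose interval divides the elapsed time evenly."""
    return [name for name, interval in FLOWS.items() if elapsed_seconds % interval == 0]
```

Keeping the tag mapping separate from the model means a second pump with different source tags can reuse the same `PumpModel` with its own mapping.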
HighByte Intelligence Hub can bring asset models and time series sensor data into Amazon S3. AWS Lake Formation helps users collect and catalog data from databases and object storage, move the data into Amazon S3, and clean and classify data using ML algorithms. Data is accessed through a centralized AWS Glue Data Catalog.
HighByte Intelligence Hub enables users to build asset models within the HighByte Intelligence Hub editor and deploy the model directly to AWS IoT SiteWise along with the streaming data. This enables users to calculate and visualize metrics from telemetry data using AWS IoT SiteWise Monitor.
HighByte Intelligence Hub can connect directly to AWS IoT Core through its native Message Queuing Telemetry Transport (MQTT) service, or use AWS IoT Greengrass locally. HighByte Intelligence Hub also enables bi-directional communication with AWS IoT Core and AWS IoT Greengrass.
HighByte Intelligence Hub can connect directly to Amazon Kinesis Data Streams for massively scalable and durable real-time data streaming. Streaming data can be transformed and analyzed in real time using Amazon Managed Service for Apache Flink and sent to Kinesis Data Firehose. Time series data can also be sent to Timestream from Managed Service for Apache Flink.
Telemetry data is published in near real time to Kinesis Data Firehose by an AWS IoT Core rule, Kinesis Data Streams, or the HighByte Intelligence Hub Kinesis Data Firehose connector. Kinesis Data Firehose then loads the streaming data reliably into an Amazon S3 data lake.
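For a custom producer, publishing telemetry to Kinesis Data Firehose might look like the sketch below. It serializes samples as newline-delimited JSON and splits them into groups of at most 500 records, the per-call limit of PutRecordBatch. The function names and stream name are assumptions; the client argument is expected to behave like a boto3 Firehose client.

```python
import json

MAX_RECORDS_PER_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def to_firehose_records(samples):
    """Serialize each sample as one newline-terminated JSON record."""
    return [
        {"Data": (json.dumps(sample, separators=(",", ":")) + "\n").encode("utf-8")}
        for sample in samples
    ]


def chunk(records, size=MAX_RECORDS_PER_BATCH):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send(firehose_client, stream_name, samples):
    """Send all samples and return the total number of failed records."""
    failed = 0
    for batch in chunk(to_firehose_records(samples)):
        response = firehose_client.put_record_batch(
            DeliveryStreamName=stream_name, Records=batch
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

The trailing newline keeps records separable once Firehose concatenates them into Amazon S3 objects. A production producer would also retry the individual records reported as failed, which this sketch only counts.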
Use Amazon Redshift to store structured data sets and analytics results in a data warehouse. Data can be ingested into Amazon Redshift either through AWS Glue from Amazon S3 or directly through the HighByte Intelligence Hub Redshift connector. You can also use Athena to query Amazon S3 through the HighByte Intelligence Hub JDBC connector using the Amazon Athena JDBC Driver.
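Outside of the JDBC connector, the same kind of Athena query over the Amazon S3 data lake could be started programmatically. The sketch below only builds the arguments for the Athena StartQueryExecution API; the table name `telemetry`, its columns (`asset_id`, `ts` in epoch seconds, `value`), the database name, and the results location are all hypothetical.

```python
# Hypothetical table and columns; adjust to the actual Glue Data Catalog schema.
HOURLY_AVERAGE_SQL = (
    "SELECT asset_id, "
    "date_trunc('hour', from_unixtime(ts)) AS hour, "
    "avg(value) AS avg_value "
    "FROM telemetry "
    "GROUP BY 1, 2 "
    "ORDER BY 1, 2"
)


def build_query_kwargs(database, output_location):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": HOURLY_AVERAGE_SQL,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# With boto3 installed and credentials configured, the query could be started as:
#   boto3.client("athena").start_query_execution(**build_query_kwargs(...))
```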
Create business intelligence reports and visualize data from Amazon Redshift and Amazon S3 with Amazon QuickSight and Athena.
When real-time and historical data is available in Amazon S3, Amazon Lookout for Equipment uses the data to detect abnormal equipment behavior, so that potential machine failures are detected before they occur and unplanned downtime is avoided.
Computed metrics can be written back into Amazon S3 for storage and consumption. Custom machine learning models can be developed with SageMaker.
4: Element Unify on AWS
This architecture shows how Element Unify on AWS helps customers build and maintain a standardized asset model and manage relationships across all operations data domains.
Element Unify on AWS allows integration, contextualization, and governance of industrial IT and OT data using its industrial data operations platform, Element Unify 5.0. This streamlines the process of turning industrial data into useful insights or actionable alerts in the AWS Cloud.
Element Unify retrieves PI Asset Framework (AF) asset models and time series data using Unify PI Time Series (TS) Connector and Unify PI Asset Framework Connector.
SAP®, Enterprise Asset Management (EAM), ERP, Laboratory Information Management System (LIMS), and other structured data are ingested using Element Unify ODBC/JDBC Connectors.
Element Unify S3 Connector Lambda imports and exports asset models and tag lists exported from edge applications.
Element Unify SiteWise Connector Lambda provisions enriched hierarchical asset models into AWS IoT SiteWise.
Element Unify TwinMaker Connector Lambda publishes a graph model to AWS IoT TwinMaker in the form of components, documents, and parameters.
Unify Redshift Connector Lambda can be used to deploy asset model and time series data from PI Server to Amazon Redshift.
AWS IoT SiteWise publishes an MQTT message to AWS IoT Core each time a property value updates. AWS IoT Core publishes the new property values to Kinesis Data Firehose which transforms and delivers the data to Amazon S3 industrial data lake.
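Before property updates land in the data lake, each notification typically needs to be flattened into rows. The sketch below shows one way to do that, wherever the transformation runs. It assumes the notification body follows the AWS IoT SiteWise property value update format (a `type` of `PropertyValueUpdate` and a `payload` with `assetId`, `propertyId`, and `values`); the output column names are hypothetical.

```python
import json


def flatten_notification(message):
    """Turn one property update notification (JSON text) into flat rows."""
    body = json.loads(message)
    if body.get("type") != "PropertyValueUpdate":
        return []  # ignore other message types
    payload = body["payload"]
    rows = []
    for item in payload.get("values", []):
        timestamp = item["timestamp"]
        # The value object holds a single typed field, such as doubleValue.
        value = next(iter(item["value"].values()))
        rows.append(
            {
                "asset_id": payload["assetId"],
                "property_id": payload["propertyId"],
                "time": timestamp["timeInSeconds"] + timestamp.get("offsetInNanos", 0) / 1e9,
                "quality": item.get("quality"),
                "value": value,
            }
        )
    return rows
```

Flat rows like these are easier to partition and query in Amazon S3 than the nested notification documents.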
Lookout for Equipment can use the data to detect abnormal equipment behavior and provide early failure detection and avoid unplanned downtime.
Athena, in conjunction with Amazon QuickSight or Amazon Managed Grafana, can be used to develop interactive business intelligence (BI) dashboards.
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence: The majority of AWS services used by this Guidance, such as Amazon S3 and API Gateway, are serverless, lowering the operational overhead of maintaining the Guidance. This also allows you to evolve the design pattern in a continuous cycle of improvement over time.
This Guidance leverages AWS Security Token Service (AWS STS) and Amazon Cognito. These services allow you to take advantage of cloud technologies to protect data, systems, and assets in a way that can improve your security posture.
Security Best Practices for Manufacturing OT describes how to design, deploy, and secure distributed manufacturing workloads and resources at the industrial edge.
This Guidance uses many of the AWS managed services to allow for a highly available network topology. Availability and reliability are managed on your behalf by AWS service teams (for example, Amazon S3, AWS IoT SiteWise, and Amazon Cognito).
This Guidance uses purpose-built storage services, such as Amazon S3, that can reduce latency and increase throughput. You can use cross-region replication (CRR) to provide lower-latency data access to different geographic Regions. This Guidance provides multiple data-driven approaches to meet your workload requirements of scaling, traffic, and data access patterns.
This Guidance utilizes scalable services, such as Amazon S3, to align the services to your needs. Its functionalities are implemented by using a serverless architecture (including Amazon Cognito and API Gateway). Your resources are available only when needed and do not run constantly.
A detailed guide is provided to experiment and use within your AWS account. Each stage of building the Guidance, including deployment, usage, and cleanup, is examined to prepare it for deployment.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
HighByte Intelligence Hub, the HighByte Intelligence Hub logo, and any other HighByte Intelligence Hub trademark are trademarks of HighByte and are used here with permission. Element Unify, the Element Unify logo, and any other Element Unify trademark are trademarks of Element Unify and are used here with permission.