This Guidance helps downstream energy operators deploy a secure and modernized industrial data environment. It uses independent software vendor (ISV) partner products and AWS services to ingest plant sensor data, then route, store, and analyze the data for visualization and reporting. Data from disparate sources, such as refining and petrochemical industrial data, can be brought into a centralized repository in near real-time to help refineries with predictive equipment maintenance, process planning, and greenhouse gas (GHG) emissions management.
Please note: [Disclaimer]
Architecture Diagram
Step 1
This diagram shows how to deploy an industrial data lake using Amazon Simple Storage Service (Amazon S3), AWS Glue, and Amazon Timestream. Timestream is a purpose-built storage service for time series data.
AWS Glue crawlers and AWS Glue Data Catalog organize data sources and relationships, and notify administrators using Amazon Simple Notification Service (Amazon SNS).
Step 2
A partner Operational Technology (OT) gateway device extracts plant sensor data from the historian fed by the distributed control system (DCS). AWS IoT Core brings messages from the sensors and the gateway device into the AWS Cloud. AWS IoT Core rules then route the messages to Amazon S3 and Timestream.
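The routing in this step can be sketched as an AWS IoT topic rule with two actions. The following is a minimal illustration, not the Guidance's actual configuration: the topic layout `refinery/<unit>/<tag>`, the `raw/` key prefix, and the `tag` dimension are assumptions made for the example.

```python
def build_topic_rule_payload(topic_filter, bucket, database, table, role_arn):
    """Return a topic rule payload that sends each message to S3 and Timestream."""
    return {
        # Select every field from messages published under the topic filter.
        "sql": f"SELECT * FROM '{topic_filter}'",
        "awsIotSqlVersion": "2016-03-23",
        "ruleDisabled": False,
        "actions": [
            {
                "s3": {
                    "roleArn": role_arn,
                    "bucketName": bucket,
                    # Substitution templates are resolved by AWS IoT per message.
                    "key": "raw/${topic()}/${timestamp()}.json",
                }
            },
            {
                "timestream": {
                    "roleArn": role_arn,
                    "databaseName": database,
                    "tableName": table,
                    # Hypothetical layout: the third topic segment is the sensor tag.
                    "dimensions": [{"name": "tag", "value": "${topic(3)}"}],
                }
            },
        ],
    }
```

A payload like this could be passed as `topicRulePayload` to the `create_topic_rule` operation of the AWS IoT API, together with a rule name and an IAM role that is allowed to write to both targets.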
Step 3
Paper and digital documents contain valuable operational data. When a document is uploaded, an Amazon S3 event notification invokes an AWS Lambda function that analyzes and structures the document text using Amazon Textract. Results are stored in the industrial data lake.
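A Lambda function for this step might look like the sketch below. It assumes the function is invoked by S3 object-created events and uses the synchronous Textract `detect_document_text` operation; the function names and the optional `textract` parameter (added so the logic can be exercised without AWS access) are illustrative choices, not part of the Guidance.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) pairs from an S3 event notification."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def extract_lines(textract, bucket, key):
    """Return the text of each LINE block that Textract detects in the document."""
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    return [b["Text"] for b in response["Blocks"] if b["BlockType"] == "LINE"]


def handler(event, context, textract=None):
    """Lambda entry point; a client can be injected for local testing."""
    if textract is None:
        import boto3  # available in the Lambda Python runtime

        textract = boto3.client("textract")
    return {
        f"s3://{bucket}/{key}": extract_lines(textract, bucket, key)
        for bucket, key in s3_objects(event)
    }
```

The synchronous call suits single-page images and documents. Multi-page documents would likely need the asynchronous Textract operations instead, and a real function would write the structured result back to the data lake rather than return it.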
Step 4
Information Technology (IT) enterprise system data from on-premises databases is synchronized with Amazon Aurora. Data contains asset locations, maintenance history, lab samples, and critical contextual information for OT telemetry data patterns.
Step 5
Data analytics capabilities from Amazon Athena provide contextualized datasets of OT data, joined with enterprise systems of record and static documents. Queries and views can be applied, reused, and shared in Athena.
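As a rough illustration of such a contextualized dataset, the helper below builds an Athena SQL statement that joins telemetry with asset master data. All table and column names (`asset_id`, `location`, `tag`, `value`, `event_time`) are hypothetical, and the table names are assumed to be trusted constants rather than user input.

```python
def contextualized_query(telemetry_table, asset_table, hours):
    """Build an Athena SQL string joining OT telemetry with asset records."""
    if not isinstance(hours, int) or hours <= 0:
        raise ValueError("hours must be a positive integer")
    return f"""
SELECT a.asset_id, a.location, t.tag, avg(t.value) AS avg_value
FROM {telemetry_table} AS t
JOIN {asset_table} AS a ON a.asset_id = t.asset_id
WHERE t.event_time > current_timestamp - interval '{hours}' hour
GROUP BY a.asset_id, a.location, t.tag
""".strip()
```

A statement like this could be saved as a named query or a view in Athena so that it can be reused and shared.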
Step 6
Visualization and reporting are achieved with Amazon QuickSight, Amazon Managed Grafana, or partner business intelligence (BI) applications, based on your preference. Amazon Managed Grafana provides real-time monitoring, and QuickSight focuses on business key performance indicators (KPIs).
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon SNS allows refineries to safely operate this Guidance and respond to incidents and events. This service notifies the operator with near real-time insights into technical and functional anomalies. It also notifies them of key milestones during the monitoring processes.
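One way to shape such a notification is sketched below. The message fields (`asset_id`, `metric`, `value`, `threshold`) are assumptions for illustration; the function only builds the arguments for an SNS `publish` call and does not send anything.

```python
import json


def anomaly_notification(topic_arn, asset_id, metric, value, threshold):
    """Build keyword arguments for an SNS publish call describing an anomaly."""
    subject = f"Anomaly on {asset_id}: {metric}"
    return {
        "TopicArn": topic_arn,
        # SNS requires subjects shorter than 100 characters.
        "Subject": subject[:99],
        "Message": json.dumps(
            {
                "asset_id": asset_id,
                "metric": metric,
                "value": value,
                "threshold": threshold,
            },
            sort_keys=True,
        ),
    }
```

The result could be unpacked into the `publish` operation of an SNS client, for example `sns.publish(**anomaly_notification(...))`.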
Security
To protect data in this Guidance, data at rest in Amazon S3, Aurora, and Timestream is encrypted using AWS Key Management Service (AWS KMS), and data in transit is transferred over a secure network connection. We recommend using AWS CloudTrail to access and investigate logs.
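For Amazon S3, encryption at rest with AWS KMS can be expressed as a default bucket encryption configuration. The sketch below assumes a customer managed key whose ARN you supply; enabling S3 Bucket Keys is an optional choice made here for the example.

```python
def kms_default_encryption(kms_key_arn):
    """Return a default-encryption configuration that uses the given KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of requests made to AWS KMS.
                "BucketKeyEnabled": True,
            }
        ]
    }
```

This dictionary matches the shape expected by the `ServerSideEncryptionConfiguration` parameter of the S3 `put_bucket_encryption` operation.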
Reliability
Event-driven triggers in AWS IoT Core and Amazon S3 act on both new and changed data, so that failed steps can be retried during data ingestion, contextualization, and preparation for data science workloads.
Performance Efficiency
AWS managed services such as Lambda, Amazon Textract, AWS Glue, and Athena provide built-in elasticity and monitoring of workloads, so that the services scale for optimal performance and align with the workload demand.
Cost Optimization
This Guidance uses serverless managed services, including Amazon S3, which helps reduce data storage costs. Serverless technologies provide consumption-based pricing, so you pay only for what you use.
Sustainability
Amazon S3 provides multiple storage classes, including the Amazon S3 Intelligent-Tiering storage class, which automates storage cost savings by moving data between access tiers when access patterns change. This minimizes resource usage and supports sustainability goals.
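Objects can be placed in S3 Intelligent-Tiering either at upload time or through a lifecycle rule. The sketch below builds a lifecycle configuration for the second option; the rule ID and the idea of scoping the rule by key prefix are assumptions for illustration.

```python
def intelligent_tiering_lifecycle(prefix):
    """Return a lifecycle configuration that moves objects to Intelligent-Tiering."""
    return {
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",
                "Status": "Enabled",
                # Apply the rule only to objects under this key prefix.
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }
```

This dictionary matches the shape expected by the `LifecycleConfiguration` parameter of the S3 `put_bucket_lifecycle_configuration` operation.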
Implementation Resources
A detailed guide is provided for you to experiment with and use within your AWS account. It examines each stage of building the Guidance, including deployment, usage, and cleanup, to prepare it for deployment.
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.