Guidance for Industrial Data Fabric Using Cognite Data Fusion® on AWS

Unifying and connecting industrial data

Overview

This Guidance shows how to implement an industrial data fabric framework using Cognite Data Fusion with AWS technology. The industrial data fabric approach addresses the challenges industrial organizations face in managing and deriving value from disparate and siloed data sources. Cognite Data Fusion is used to integrate, connect, and unify IT, operational technology (OT), and engineering data into a cohesive and accessible data environment. This consolidation and contextualization of industrial data helps organizations unlock data-driven insights and develop innovative applications that can drive improvements in production efficiency, operational sustainability, and decision-making.

How it works

The architecture diagram illustrates how to use this solution effectively. It shows the key components and their interactions, and walks through the architecture's structure and functionality step by step.

Download the architecture diagram

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

Cognite Data Fusion uses Amazon CloudWatch as the monitoring service to oversee the performance of its components. Application logs and error logs are monitored through AWS CloudTrail and Amazon EventBridge. This monitoring approach allows organizations to trace events, analyze and visualize the performance of the technology stack, and conduct root cause analysis when errors occur.

Read the Operational Excellence whitepaper

Security

This Guidance uses several security measures to mitigate cyber attack risks. Amazon Route 53 with AWS Shield Standard protects against distributed denial of service (DDoS) attacks. AWS Key Management Service (AWS KMS) and AWS Secrets Manager encrypt and secure sensitive secrets and keys. AWS Identity and Access Management (IAM) manages permission policies and scopes appropriate permission levels.
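As a rough illustration of scoping permissions with IAM, the following Python sketch builds a least-privilege policy document that allows reading one secret and decrypting with one AWS KMS key. The account ID, Region, secret name, and key ID are hypothetical placeholders and are not taken from this Guidance; the actual policies used by the solution may differ.

```python
import json

# Hypothetical identifiers, used only for illustration.
ACCOUNT_ID = "111122223333"
REGION = "us-east-1"
SECRET_ARN = (
    f"arn:aws:secretsmanager:{REGION}:{ACCOUNT_ID}:secret:example/extractor-*"
)
KMS_KEY_ARN = f"arn:aws:kms:{REGION}:{ACCOUNT_ID}:key/EXAMPLE-KEY-ID"


def build_least_privilege_policy(secret_arn: str, kms_key_arn: str) -> dict:
    """Return an IAM policy document scoped to one secret and one KMS key."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOneSecret",
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            },
            {
                "Sid": "DecryptWithOneKey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": [kms_key_arn],
            },
        ],
    }


policy = build_least_privilege_policy(SECRET_ARN, KMS_KEY_ARN)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Each statement names specific actions and specific resources rather than using a bare wildcard, which is the general idea behind scoping permission levels.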

Read the Security whitepaper

Reliability

Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple targets for high availability and fault tolerance. Amazon Elastic Compute Cloud (Amazon EC2) provides reliable and scalable compute capacity that can scale up or down automatically based on demand, while Amazon Elastic Container Service (Amazon ECS) simplifies the deployment and management of containerized applications. Amazon Relational Database Service (Amazon RDS) handles database management tasks to support reliability and availability. Amazon Simple Queue Service (Amazon SQS) improves the reliability and fault tolerance of microservices and distributed systems by decoupling them. Finally, AWS Backup centralizes and automates data backups across AWS services and on-premises resources, strengthening the overall data protection strategy.
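To make the fault-tolerance benefit of a queue more concrete, here is a minimal in-memory sketch of the retry and dead-letter pattern that a message queue enables. It does not call Amazon SQS; the `deque`, the `MAX_RECEIVES` limit, and the handler are hypothetical stand-ins chosen for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List

MAX_RECEIVES = 3  # hypothetical retry limit before a message is set aside


def drain(queue: Deque[Dict], handler: Callable[[str], None],
          dead_letters: List[Dict]) -> int:
    """Process messages until the queue is empty; return the success count."""
    processed = 0
    while queue:
        message = queue.popleft()
        message["receives"] += 1
        try:
            handler(message["body"])
            processed += 1
        except Exception:
            if message["receives"] >= MAX_RECEIVES:
                dead_letters.append(message)  # give up and keep for inspection
            else:
                queue.append(message)  # put it back so it is retried later
    return processed


attempts: Dict[str, int] = {}


def flaky_handler(body: str) -> None:
    """Hypothetical consumer: one message always fails, one fails once."""
    attempts[body] = attempts.get(body, 0) + 1
    if body == "bad":
        raise ValueError("cannot parse message")
    if body == "flaky" and attempts[body] < 2:
        raise RuntimeError("transient failure")


queue = deque({"body": b, "receives": 0} for b in ["ok", "flaky", "bad"])
dead_letters: List[Dict] = []
processed = drain(queue, flaky_handler, dead_letters)
print(processed, [m["body"] for m in dead_letters])
```

Because the producer and consumer only share the queue, a transient consumer failure delays a message instead of losing it, and a message that keeps failing is isolated rather than blocking the rest.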

Read the Reliability whitepaper

Performance Efficiency

Amazon ElastiCache provides sub-millisecond response times, improving the performance of data-intensive applications by caching frequently accessed data in memory and reducing the load on databases and backend systems. AWS Lambda and AWS Step Functions work together to support performance efficiency: Lambda runs code in response to events or requests without requiring you to manage servers, while Step Functions orchestrates multiple Lambda functions into serverless workflows. Additionally, Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) is a fully managed service for real-time data ingestion, enabling low-latency capture, transformation, and loading of streaming data into data stores and analytics services. Together, these AWS services deliver performance efficiency through serverless compute, in-memory caching, and real-time data processing capabilities that scale automatically.
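The caching idea described above can be sketched as a cache-aside read path with a time-to-live. This is an illustration only: a plain Python dict stands in for an in-memory store such as ElastiCache, and `fake_db_lookup`, the key names, and the TTL value are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class CacheAside:
    """Cache-aside reads with a time-to-live, backed by a plain dict.

    The dict is a stand-in (an assumption) for an in-memory cache service;
    `load_from_db` stands in for a slower database query.
    """

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}
        self.db_reads = 0  # counts how often the backing store was queried

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database read
        # Cache miss or expired entry: read from the database and cache it.
        self.db_reads += 1
        value = self._load(key)
        self._store[key] = (now + self._ttl, value)
        return value


def fake_db_lookup(asset_id: str) -> str:
    """Hypothetical database query."""
    return f"metadata-for-{asset_id}"


cache = CacheAside(fake_db_lookup, ttl_seconds=60.0)
first = cache.get("pump-101")
second = cache.get("pump-101")  # served from the cache
print(first, second, cache.db_reads)
```

The second read is answered from memory, so the backing database is queried once for two requests; the TTL bounds how stale a cached value can become.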

Read the Performance Efficiency whitepaper

Cost Optimization

Amazon EC2 Auto Scaling is a key service for cost efficiency: it automatically scales compute resources up or down based on demand, so you pay only for the resources you use. AWS Cost Explorer provides detailed visibility into AWS spending, helping you identify cost optimization opportunities and make informed resource utilization decisions. Amazon RDS autoscaling extends these benefits to database infrastructure, automatically scaling storage and compute capacity to handle demand changes without manual intervention or over-provisioning. Together, these services help you right-size infrastructure, eliminate waste, and optimize costs.

Read the Cost Optimization whitepaper

Sustainability

This Guidance uses AWS Lambda functions to stop and start compute resources on predefined schedules. Running resources only when they are needed minimizes unnecessary energy consumption during idle periods. Lambda itself scales automatically with demand, which reduces energy usage compared to traditional server-based models.
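A scheduled stop/start function of the kind described above might look roughly like the following sketch. It assumes the resources are Amazon EC2 instances and that the function is invoked periodically by a scheduler; the schedule hours, the instance ID, and the extra `ec2_client` and `now` parameters (added so the logic can be exercised without AWS access) are hypothetical and not part of this Guidance.

```python
from datetime import datetime, timezone

# Hypothetical schedule and instance IDs, for illustration only.
START_HOUR_UTC = 6   # resources should be running from 06:00 UTC
STOP_HOUR_UTC = 18   # resources should be stopped from 18:00 UTC
INSTANCE_IDS = ["i-0123456789abcdef0"]


def desired_action(now: datetime) -> str:
    """Return "start" inside the working window, otherwise "stop"."""
    return "start" if START_HOUR_UTC <= now.hour < STOP_HOUR_UTC else "stop"


def lambda_handler(event, context, ec2_client=None, now=None):
    """Entry point intended to be invoked on a schedule."""
    if ec2_client is None:
        import boto3  # imported lazily; only needed when running against AWS
        ec2_client = boto3.client("ec2")
    now = now or datetime.now(timezone.utc)
    action = desired_action(now)
    if action == "start":
        ec2_client.start_instances(InstanceIds=INSTANCE_IDS)
    else:
        ec2_client.stop_instances(InstanceIds=INSTANCE_IDS)
    return {"action": action, "instances": INSTANCE_IDS}
```

Keeping the schedule decision in a small pure function makes it easy to check the window boundaries locally before the function is allowed to stop or start anything.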

Read the Sustainability whitepaper

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
