Guidance for Industrial Data Fabric Using Cognite Data Fusion® on AWS
Unifying and connecting industrial data
Overview
How it works
These technical details include an architecture diagram that illustrates how to use this solution effectively. The diagram shows the key components and their interactions, giving a step-by-step overview of the architecture's structure and functionality.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Cognite Data Fusion uses Amazon CloudWatch to monitor the performance of its components, with application logs and error logs monitored through AWS CloudTrail and Amazon EventBridge. This monitoring approach allows organizations to trace events, analyze and visualize the performance of the technology stack, and conduct root cause analysis when errors occur.
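As an illustration of how CloudTrail-recorded events can be routed through EventBridge for error monitoring, the following minimal sketch builds an EventBridge event pattern that matches API calls that returned an error. The function name and the chosen event sources are assumptions made for this example and are not part of this Guidance. The resulting JSON string could be supplied as the event pattern of an EventBridge rule.

```python
import json


def build_failed_api_call_pattern(sources):
    """Return an EventBridge event pattern (as a JSON string) that matches
    CloudTrail-recorded API calls whose record contains an errorCode field."""
    pattern = {
        # Event sources to watch, for example "aws.kms" (assumed for illustration).
        "source": list(sources),
        # Detail type EventBridge assigns to API calls delivered via CloudTrail.
        "detail-type": ["AWS API Call via CloudTrail"],
        # Match only records where an errorCode is present.
        "detail": {"errorCode": [{"exists": True}]},
    }
    return json.dumps(pattern)


pattern_json = build_failed_api_call_pattern(["aws.kms", "aws.secretsmanager"])
print(pattern_json)
```

A rule using this pattern could send matching events to a CloudWatch Logs group or a notification target for root cause analysis.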
Security
This Guidance uses several security measures to mitigate cyber attack risks, including the use of Amazon Route 53 with AWS Shield Standard to protect against Distributed Denial of Service (DDoS) attacks. Additionally, it uses AWS Key Management Service (AWS KMS) and AWS Secrets Manager to encrypt and secure sensitive secrets and keys. Furthermore, this Guidance uses AWS Identity and Access Management (IAM) to manage permission policies and scope appropriate permission levels.
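To make the idea of scoping permission levels concrete, the sketch below builds an IAM policy document that lets a principal read a single secret and decrypt with a single KMS key. The function name and the ARNs are placeholders invented for this example; the actual policies used by this Guidance are not described in the source.

```python
import json


def build_read_secret_policy(secret_arn, kms_key_arn):
    """Return a least-privilege IAM policy document (JSON string) that allows
    reading one Secrets Manager secret and decrypting with one KMS key."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOneSecret",
                "Effect": "Allow",
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": secret_arn,
            },
            {
                "Sid": "DecryptWithOneKey",
                "Effect": "Allow",
                "Action": ["kms:Decrypt"],
                "Resource": kms_key_arn,
            },
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder ARNs for illustration only.
policy_json = build_read_secret_policy(
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-secret",
    "arn:aws:kms:us-east-1:111122223333:key/example-key-id",
)
print(policy_json)
```

Naming explicit resources rather than using a wildcard keeps each role limited to the secrets and keys it actually needs.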
Reliability
Elastic Load Balancing (ELB) automatically distributes incoming application traffic across multiple targets for high availability and fault tolerance. Amazon Elastic Compute Cloud (Amazon EC2) provides reliable and scalable compute capacity, so infrastructure can scale up or down automatically based on demand. Amazon Elastic Container Service (Amazon ECS) simplifies the deployment and management of containerized applications on a highly reliable and scalable platform. Amazon Relational Database Service (Amazon RDS) handles database management tasks to ensure reliability and availability, and Amazon Simple Queue Service (Amazon SQS) improves the reliability and fault tolerance of microservices and distributed systems. Finally, AWS Backup centralizes and automates data backups across AWS services and on-premises resources, strengthening the overall data protection strategy.
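One common way SQS contributes to fault tolerance is a dead-letter queue, which sets aside messages that repeatedly fail processing so they do not block healthy traffic. The sketch below builds the queue attributes for such a setup. The function name, default values, and ARN are assumptions for illustration; the source does not state how queues are configured in this Guidance.

```python
import json


def build_queue_attributes(dead_letter_queue_arn, max_receive_count=5,
                           visibility_timeout_seconds=60):
    """Return SQS queue attributes that move a message to a dead-letter queue
    after it has been received max_receive_count times without being deleted."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    redrive_policy = {
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects attribute values as strings; RedrivePolicy is a JSON string.
    return {
        "RedrivePolicy": json.dumps(redrive_policy),
        "VisibilityTimeout": str(visibility_timeout_seconds),
    }


# Placeholder ARN for illustration only.
attributes = build_queue_attributes(
    "arn:aws:sqs:us-east-1:111122223333:example-dead-letter-queue"
)
print(attributes)
```

The resulting dictionary has the shape expected for the attributes of an SQS queue, for example when creating the queue with an AWS SDK.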
Performance Efficiency
Amazon ElastiCache provides sub-millisecond response times, improving the performance of data-intensive applications by caching frequently accessed data in memory and reducing the load on databases and backend systems. AWS Lambda and AWS Step Functions work together to support performance efficiency: Lambda runs code in response to events or requests without requiring you to manage servers, while Step Functions orchestrates multiple Lambda functions into serverless workflows. Additionally, Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) is a fully managed service for real-time data ingestion, enabling low-latency capture, transformation, and loading of streaming data into data stores and analytics services. Together, these AWS services deliver performance efficiency through serverless compute, in-memory caching, and real-time data processing that scale automatically.
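The caching behavior described above is often implemented with the cache-aside pattern: read from the cache first, and fall back to the slower backend only on a miss. The following sketch uses an in-process dictionary as a stand-in for ElastiCache so that it runs without any AWS resources. The class name, the time-to-live value, and the `slow_lookup` function are hypothetical and serve only to illustrate the pattern.

```python
import time


class CacheAside:
    """Minimal cache-aside helper with a per-entry time-to-live (TTL).

    An in-memory dict stands in for a real cache such as ElastiCache.
    """

    def __init__(self, load_from_source, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_source
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the backend is not touched
        value = self._load(key)  # cache miss or expired entry: query the backend
        self._entries[key] = (now + self._ttl, value)
        return value


backend_calls = []


def slow_lookup(key):
    """Hypothetical backend query; records each call so misses can be counted."""
    backend_calls.append(key)
    return key.upper()


cache = CacheAside(slow_lookup, ttl_seconds=30.0)
first = cache.get("pump-01")
second = cache.get("pump-01")  # served from the cache
print(first, second, len(backend_calls))
```

Only the first read reaches the backend; repeated reads within the TTL are answered from memory, which is how a cache reduces database load.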
Cost Optimization
Amazon EC2 Auto Scaling is a key service for cost efficiency: it automatically scales compute resources up or down based on demand, so you pay only for the resources you use. AWS Cost Explorer provides detailed visibility into AWS spending, helping you identify cost optimization opportunities and make informed resource utilization decisions. Amazon RDS storage autoscaling extends these benefits to database infrastructure by automatically increasing storage capacity as demand grows, without manual intervention or over-provisioning. Together, these services help you right-size infrastructure, eliminate waste, and optimize costs.
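Demand-based scaling is commonly expressed as a target tracking policy, which adds or removes instances to hold a metric near a chosen value. The sketch below assembles the parameters for such a policy on average CPU utilization. The function name, group name, policy name, and the 50 percent target are assumptions for illustration; the source does not specify the scaling policies used in this Guidance.

```python
def build_target_tracking_policy(group_name, target_cpu_percent=50.0,
                                 policy_name="example-cpu-target-tracking"):
    """Return keyword arguments describing a target tracking scaling policy
    that keeps the group's average CPU utilization near target_cpu_percent."""
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in the range (0, 100]")
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
        },
    }


# Hypothetical Auto Scaling group name for illustration only.
policy_kwargs = build_target_tracking_policy("example-asg")
print(policy_kwargs)
```

These keyword arguments match the shape accepted by the `put_scaling_policy` operation of the Auto Scaling API in the AWS SDK for Python (boto3).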
Sustainability
This Guidance uses Lambda functions to stop and start compute resources on predefined schedules, which minimizes unnecessary energy consumption during idle periods. Because Lambda itself scales automatically with demand, it also uses less energy than traditional, always-on server-based models.
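A scheduled stop/start function of the kind described above could look like the following sketch. It assumes the resources are EC2 instances, that the instance IDs arrive in the event under a hypothetical `instance_ids` key, and that working hours are 06:00 to 20:00 UTC; none of these details come from the source. The EC2 client is injectable so the logic can be exercised without AWS access.

```python
from datetime import datetime, timezone


def desired_action(now, start_hour=6, stop_hour=20):
    """Return "start" during [start_hour, stop_hour) UTC, otherwise "stop"."""
    return "start" if start_hour <= now.hour < stop_hour else "stop"


def lambda_handler(event, context, ec2_client=None, now=None):
    """Start or stop the EC2 instances listed in the event, based on the hour."""
    instance_ids = event.get("instance_ids", [])
    if not instance_ids:
        return {"action": "none", "instance_ids": []}
    if ec2_client is None:
        import boto3  # imported lazily; included in the Lambda Python runtime
        ec2_client = boto3.client("ec2")
    now = now or datetime.now(timezone.utc)
    action = desired_action(now)
    if action == "start":
        ec2_client.start_instances(InstanceIds=instance_ids)
    else:
        ec2_client.stop_instances(InstanceIds=instance_ids)
    return {"action": action, "instance_ids": instance_ids}
```

In practice, a schedule (for example, an EventBridge rule) would invoke the function, and the function's role would need permission to start and stop only the targeted instances.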
Disclaimer