Guidance for Core Banking Data Lake on AWS
Overview
How it works
The architecture diagram for this Guidance illustrates how to use the solution effectively. It shows the key components and their interactions, walking through the architecture's structure and functionality step by step.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
Amazon CloudWatch provides visibility into system performance and health, and CloudWatch alarms can invoke automated actions so that issues are resolved proactively. AWS CloudTrail maintains a detailed audit trail of API calls and configuration changes, supporting compliance and security efforts. AWS Glue automates data management with automated schema discovery, classification, and catalog management, even for large datasets.
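As a minimal sketch of the alarm-driven automation described above, the following Python builds the parameters for a CloudWatch metric alarm. The key names follow the CloudWatch `PutMetricAlarm` API. The function name, alarm name, threshold, evaluation window, and SNS topic ARN are hypothetical values chosen for illustration and do not come from this Guidance. The sketch only constructs the parameters, so it runs without AWS credentials.

```python
def build_alarm_params(alarm_name, namespace, metric_name, dimensions,
                       threshold, sns_topic_arn):
    """Return keyword arguments shaped like a CloudWatch PutMetricAlarm request.

    `dimensions` is a plain dict such as {"SomeDimension": "value"}; it is
    converted to the list-of-pairs form the API expects.
    """
    return {
        "AlarmName": alarm_name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": name, "Value": value} for name, value in dimensions.items()
        ],
        "Statistic": "Average",
        "Period": 60,               # seconds per datapoint (assumed)
        "EvaluationPeriods": 5,     # 5 consecutive datapoints (assumed)
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "breaching",   # assumed: treat silence as a problem
        "AlarmActions": [sns_topic_arn],   # action invoked when the alarm fires
    }


if __name__ == "__main__":
    # Hypothetical example: alarm on replication latency for a migration task.
    # Verify the namespace, metric, and dimension names for your own workload.
    params = build_alarm_params(
        alarm_name="core-banking-replication-latency",
        namespace="AWS/DMS",
        metric_name="CDCLatencyTarget",
        dimensions={"ReplicationTaskIdentifier": "example-task-id"},
        threshold=300,
        sns_topic_arn="arn:aws:sns:us-east-1:111122223333:example-ops-topic",
    )
    print(params)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `put_metric_alarm(**params)` on a CloudWatch client; that call is omitted here to keep the sketch self-contained.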
Security
AWS Database Migration Service (AWS DMS) protects data in transit through its encryption capabilities, and Amazon S3 simplifies security by automatically encrypting all new objects at rest. AWS Identity and Access Management (IAM) policies follow the principle of least privilege, scoping permissions to the minimum required. Additionally, AWS Lake Formation defines granular security policies that restrict access at the database, table, column, row, and cell levels. AWS Key Management Service (AWS KMS) centrally manages the encryption keys used across AWS services.
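To make the least-privilege idea concrete, here is a minimal sketch that generates an IAM policy document granting read-only access to a single prefix in one S3 bucket. The bucket name, prefix, and function name are hypothetical; the policy grammar (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`) and the `s3:prefix` condition key are standard IAM and Amazon S3 elements. Treat it as an illustration of scoping, not as the policy this Guidance deploys.

```python
import json


def read_only_prefix_policy(bucket, prefix):
    """Build a policy that allows listing and reading objects under one prefix only."""
    prefix = prefix.strip("/")  # normalize so the ARNs below are well formed
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is restricted to keys under the prefix.
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadObjectsUnderPrefix",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix names.
    policy = read_only_prefix_policy("example-core-banking-lake", "curated/accounts")
    print(json.dumps(policy, indent=2))
```

No write, delete, or wildcard-bucket permissions appear in the document, so a principal holding only this policy cannot modify data or read outside the named prefix.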
Reliability
Amazon S3 provides highly durable and redundant storage, replicating data across multiple Availability Zones. Amazon S3 versioning preserves previous object versions so that you can retrieve and restore them. Amazon Redshift enhances data warehouse resilience through automatic backups, failure remediation, and Multi-AZ deployment options. Amazon EMR provides configuration options that let you terminate clusters automatically once their steps complete, and terminate clusters when errors or other issues occur.
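The Amazon EMR termination behavior described above can be sketched as a fragment of a cluster request. The keys `KeepJobFlowAliveWhenNoSteps` and `ActionOnFailure` are part of the EMR `RunJobFlow` API; the function name, step name, and script location are hypothetical. The fragment is deliberately incomplete: a real request also needs a cluster name, release label, instance configuration, and roles, none of which are specified in this Guidance.

```python
def build_transient_cluster_fragment(step_name, script_s3_uri):
    """Return the termination-related parts of an EMR RunJobFlow request."""
    return {
        "Instances": {
            # Shut the cluster down once all steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": step_name,
                # Terminate the cluster if this step fails instead of
                # leaving it running idle.
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster",
                             script_s3_uri],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical step name and script path.
    fragment = build_transient_cluster_fragment(
        "transform-transactions",
        "s3://example-core-banking-lake/scripts/transform.py",
    )
    print(fragment)
```

Merging this fragment into a full request gives a transient cluster: it exists only for the duration of the job, which limits both the blast radius of a failed run and the time idle capacity stays provisioned.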
Performance Efficiency
Amazon EMR optimizes data processing through right-sized clusters, dynamic scaling, and preconfigured environments. Amazon Redshift improves query performance through features like partitioning, columnar compression, and query tuning, which also reduce storage and I/O requirements. Lake Formation streamlines data lake management by simplifying the process of identifying data and moving it into a centralized repository, whether that data is structured or unstructured.
Cost Optimization
Amazon S3 Intelligent-Tiering and lifecycle policies automate cost savings by moving data to the most cost-effective storage tiers. Amazon EMR optimizes costs through automatic scaling and the use of Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances. Additionally, Amazon Redshift offers reserved nodes for steady-state workloads and Amazon Redshift Serverless for cost-effective scaling of unpredictable workloads.
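As an illustration of the lifecycle automation mentioned above, the sketch below builds an S3 lifecycle configuration that moves objects under a prefix into S3 Intelligent-Tiering and expires old noncurrent versions. The structure follows the Amazon S3 `PutBucketLifecycleConfiguration` API; the function name, rule ID, prefix, and the 90-day retention period are assumptions made for the example, not settings taken from this Guidance.

```python
def build_lifecycle_configuration(prefix, noncurrent_expire_days=90):
    """Return a lifecycle configuration for objects under `prefix`."""
    if noncurrent_expire_days <= 0:
        raise ValueError("noncurrent_expire_days must be positive")
    return {
        "Rules": [
            {
                "ID": "tier-and-trim-versions",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                # Hand objects to Intelligent-Tiering right after creation.
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"},
                ],
                # Remove superseded versions after the retention window
                # (only relevant when bucket versioning is enabled).
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_expire_days,
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical prefix for raw ingested data.
    print(build_lifecycle_configuration("raw/"))
```

With boto3, a dictionary of this shape could be supplied as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; the call is left out so the example runs offline.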
Sustainability
The energy-efficient infrastructure of Amazon S3 and the resource optimization capabilities of managed services like Amazon Redshift, Amazon EMR, AWS DMS, Amazon QuickSight, and Lake Formation reduce environmental impact, lowering the overall IT footprint and carbon emissions compared to running physical servers and hardware on premises.