Guidance for SAP Sustainability Data Lake on AWS
Overview
How it works
The architecture diagram illustrates how to use this Guidance effectively. It shows the key components and how they interact, walking through the structure and functionality of the architecture step by step.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
To help you respond to incidents and events while operating this Guidance, Amazon CloudWatch collects and visualizes logs, metrics, and event data. Customers can then create alarms that alert them to operational anomalies.
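As a rough illustration of such an alarm, the sketch below builds the parameters for CloudWatch's `put_metric_alarm` call. It assumes the pipeline is orchestrated by a Step Functions state machine and that failures should notify an Amazon SNS topic. The alarm name, threshold, and both ARNs are hypothetical placeholders, not values taken from this Guidance.

```python
def failed_execution_alarm(state_machine_arn, sns_topic_arn):
    """Build put_metric_alarm parameters that fire on any failed execution."""
    return {
        "AlarmName": "sustainability-pipeline-failed-executions",  # hypothetical name
        "Namespace": "AWS/States",            # Step Functions metric namespace
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 300,                        # evaluate in 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1,                       # assumed: alert on the first failure
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",   # no data means no failures
        "AlarmActions": [sns_topic_arn],
    }

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **failed_execution_alarm(state_machine_arn, sns_topic_arn))
```

Keeping the parameters in a plain function makes them easy to review and unit test before anything is created in an account.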
Security
We recommend data be encrypted at rest using AWS Key Management Service (AWS KMS) with customer-managed AWS KMS keys. The keys should be rotated on a regular schedule. Services like Kinesis Data Streams, AWS Glue, and Amazon S3 all integrate with AWS KMS for easy encryption. For data in transit, customers should ensure any application connections require SSL/TLS.
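For the Amazon S3 part of these recommendations, a minimal sketch might look like the following. It builds a default-encryption configuration that points at a customer-managed key, plus a bucket policy that denies any request not made over TLS. The function names, bucket name, and key ARN are hypothetical, and the snippet only constructs the request payloads.

```python
import json

def default_encryption_config(kms_key_arn):
    """Payload for s3.put_bucket_encryption using a customer-managed KMS key."""
    return {
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": kms_key_arn,
            },
            "BucketKeyEnabled": True,  # fewer KMS requests per object
        }]
    }

def tls_only_policy(bucket_name):
    """Bucket policy (JSON string) that denies requests made without TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }],
    })

# With boto3, these payloads could be applied roughly as:
#   s3 = boto3.client("s3")
#   s3.put_bucket_encryption(
#       Bucket=name, ServerSideEncryptionConfiguration=default_encryption_config(key_arn))
#   s3.put_bucket_policy(Bucket=name, Policy=tls_only_policy(name))
#   boto3.client("kms").enable_key_rotation(KeyId=key_arn)  # automatic rotation
```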
Reliability
This Guidance is designed with services that have initial service limits that accommodate a large majority of customer workloads. If necessary, service quotas can be expanded. For example, a customer can increase the number of concurrent executions of AWS Glue jobs or concurrent active data manipulation language (DML) queries in Athena.
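One way to script such a request is through the Service Quotas API. The sketch below assumes the quota list has already been fetched with `list_service_quotas`; it then looks a quota up by name and builds the parameters for `request_service_quota_increase`. The helper names are hypothetical, and real quota names and codes should be read from your own account rather than assumed.

```python
def find_quota(quotas, name_fragment):
    """Return the first quota whose QuotaName contains name_fragment, else None."""
    fragment = name_fragment.lower()
    for quota in quotas:
        if fragment in quota["QuotaName"].lower():
            return quota
    return None

def increase_request(service_code, quota, desired_value):
    """Build request_service_quota_increase parameters for a higher limit."""
    if desired_value <= quota["Value"]:
        raise ValueError("desired value must exceed the current quota")
    return {
        "ServiceCode": service_code,
        "QuotaCode": quota["QuotaCode"],
        "DesiredValue": float(desired_value),
    }

# With boto3, the surrounding calls could look roughly like:
#   sq = boto3.client("service-quotas")
#   quotas = sq.list_service_quotas(ServiceCode="athena")["Quotas"]
#   quota = find_quota(quotas, "dml")
#   sq.request_service_quota_increase(**increase_request("athena", quota, 50))
```

Note that `list_service_quotas` is paginated, so a real script would need to follow `NextToken` to see every quota.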
Performance Efficiency
This Guidance uses serverless managed services that automatically scale up and down in response to changing demand, reducing resource overhead.
Storing data in Amazon S3 allows consumers to bring various tools or services to their data, dependent on their needs. For example, customers can query data directly in Amazon S3 using Athena, or they can use QuickSight for a business intelligence (BI) dashboard.
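To make the Athena option concrete, here is a small sketch that builds the parameters for Athena's `start_query_execution` call. The database, table, and column names (`emissions`, `plant_id`, `co2e_kg`) and the results location are hypothetical and stand in for whatever schema the data lake actually exposes.

```python
# Hypothetical query: total emissions per plant from an assumed "emissions" table.
EMISSIONS_BY_PLANT = """
SELECT plant_id, SUM(co2e_kg) AS total_co2e_kg
FROM emissions
GROUP BY plant_id
ORDER BY total_co2e_kg DESC
"""

def athena_query_request(sql, database, output_location, workgroup="primary"):
    """Build start_query_execution parameters for a query over data in S3."""
    return {
        "QueryString": sql.strip(),
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
        "WorkGroup": workgroup,
    }

# With boto3, the query could be submitted roughly as:
#   athena = boto3.client("athena")
#   response = athena.start_query_execution(**athena_query_request(
#       EMISSIONS_BY_PLANT, "sustainability", "s3://example-results-bucket/athena/"))
#   query_id = response["QueryExecutionId"]
```

The call is asynchronous, so a caller would poll `get_query_execution` with the returned ID before reading results.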
Cost Optimization
This Guidance relies on serverless AWS services like AWS Glue, Step Functions, and Athena that are fully managed and automatically scale according to workload demand. As a result, customers only pay for what they use.
Sustainability
Data in Amazon S3 can be stored in more efficient file formats (such as Parquet) to prevent unnecessary processing and reduce the overall storage required.
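One possible way to convert existing data to Parquet is an Athena CREATE TABLE AS SELECT (CTAS) statement. This is an assumption for illustration, since the Guidance does not say which tool performs the conversion, and an AWS Glue job would be an equally reasonable choice. The sketch below only builds the SQL text; the table names and S3 location are hypothetical.

```python
def parquet_ctas(source_table, target_table, external_location):
    """Return an Athena CTAS statement that rewrites a table as Parquet.

    Identifiers are interpolated as-is, so they must come from trusted input.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (format = 'PARQUET', external_location = '{external_location}')\n"
        f"AS SELECT * FROM {source_table}"
    )

# Example with hypothetical names:
#   parquet_ctas("emissions_raw", "emissions_parquet",
#                "s3://example-data-lake/curated/emissions/")
```

Because Parquet is columnar, later queries that select only a few columns scan less data than they would against row-oriented formats such as CSV.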
Amazon S3 lifecycle policies can automatically move less volatile data to more energy-efficient storage classes (such as Amazon S3 Glacier) that use magnetic storage rather than solid state memory. Deletion timelines can also be enforced to minimize overall storage requirements.
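A lifecycle rule of this kind could be sketched as follows. The function builds the configuration accepted by S3's `put_bucket_lifecycle_configuration` call, with one transition to the `GLACIER` storage class and one expiration. The rule ID, prefix, and day counts are hypothetical and should follow your own retention requirements.

```python
def lifecycle_configuration(prefix, glacier_after_days, expire_after_days):
    """Build a lifecycle configuration: archive to Glacier, then delete."""
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "Rules": [{
            "ID": "archive-then-expire",       # hypothetical rule name
            "Filter": {"Prefix": prefix},      # only objects under this prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": glacier_after_days, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": expire_after_days},
        }]
    }

# With boto3, the configuration could be applied roughly as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-data-lake",
#       LifecycleConfiguration=lifecycle_configuration("raw/", 90, 2555))
```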
Disclaimer