AWS Solutions Library

Guidance for SAP Sustainability Data Lake on AWS

Overview

This Guidance demonstrates how to combine and consolidate greenhouse gas emissions data from SAP and non-SAP sources using AWS services. Customers who use Enterprise Resource Planning (ERP) solutions to manage and optimize their business processes can build a data lake that facilitates the generation of carbon footprint insights.

How it works

The architecture diagram for this Guidance shows the key components and how they interact, step by step.

Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

To respond to incidents and events while operating this Guidance, integrate Amazon CloudWatch to collect and visualize logs, metrics, and event data. Customers can then create alarms that alert them to operational anomalies.
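As a minimal sketch of the alarm idea above, the Python below builds the parameters for a CloudWatch alarm that fires when a Step Functions state machine reports failed executions. The alarm name, state machine ARN, and SNS topic ARN are hypothetical placeholders, not values from this Guidance. The dictionary is shaped for the boto3 CloudWatch client's `put_metric_alarm` call, which appears only in a comment so the block runs without AWS credentials.

```python
def failed_execution_alarm(state_machine_arn: str, sns_topic_arn: str) -> dict:
    """Return keyword arguments for the CloudWatch put_metric_alarm API."""
    return {
        "AlarmName": "sustainability-pipeline-failed-executions",  # hypothetical name
        "Namespace": "AWS/States",            # Step Functions metric namespace
        "MetricName": "ExecutionsFailed",
        "Dimensions": [{"Name": "StateMachineArn", "Value": state_machine_arn}],
        "Statistic": "Sum",
        "Period": 300,                        # evaluate in 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": 1.0,                     # alarm on the first failure
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",   # no data means no failures
        "AlarmActions": [sns_topic_arn],      # notify this SNS topic
    }


# Placeholder ARNs for illustration only.
alarm_params = failed_execution_alarm(
    "arn:aws:states:us-east-1:111122223333:stateMachine:example-pipeline",
    "arn:aws:sns:us-east-1:111122223333:example-ops-alerts",
)

# With credentials configured, the alarm could be created like this:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm_params)
```

The threshold and period here are illustrative; tune them to how often the pipeline runs and how quickly operators need to be notified.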

Read the Operational Excellence whitepaper 

We recommend data be encrypted at rest using AWS Key Management Service (AWS KMS) with customer-managed AWS KMS keys. The keys should be rotated on a regular schedule. Services like Kinesis Data Streams, AWS Glue, and Amazon S3 all integrate with AWS KMS for easy encryption. For data in transit, customers should ensure any application connections require SSL/TLS.
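The two recommendations above can be sketched for an Amazon S3 bucket as follows. This is one possible approach under stated assumptions, not the configuration shipped with this Guidance: the bucket name and key ARN are hypothetical, and the dictionaries are shaped for the boto3 S3 client's `put_bucket_encryption` and `put_bucket_policy` calls, which are shown only in comments.

```python
import json


def default_encryption_config(kms_key_arn: str) -> dict:
    """Default bucket encryption using a customer-managed AWS KMS key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of requests made to AWS KMS.
                "BucketKeyEnabled": True,
            }
        ]
    }


def tls_only_bucket_policy(bucket_name: str) -> dict:
    """Bucket policy that denies any request not sent over SSL/TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


# Placeholder identifiers for illustration only.
BUCKET = "example-sustainability-data-lake"
KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"

encryption = default_encryption_config(KEY_ARN)
policy_json = json.dumps(tls_only_bucket_policy(BUCKET))

# With credentials configured, these could be applied like this:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_encryption(Bucket=BUCKET, ServerSideEncryptionConfiguration=encryption)
# s3.put_bucket_policy(Bucket=BUCKET, Policy=policy_json)
```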

Read the Security whitepaper 

This Guidance is designed with services that have initial service limits that accommodate a large majority of customer workloads. If necessary, service quotas can be expanded. For example, a customer can increase the number of concurrent executions of AWS Glue jobs or concurrent active data manipulation language (DML) queries in Athena.
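A quota increase like the ones mentioned above could be prepared along these lines. The sketch assumes quota records shaped like the entries returned by the Service Quotas `list_service_quotas` API (`QuotaName`, `QuotaCode`, `Value`) and builds the arguments for `request_service_quota_increase`. The sample quota names and codes below are made up for illustration; look up the real ones in your own account before submitting a request.

```python
from typing import Iterable, Optional


def find_quota(quotas: Iterable[dict], name_fragment: str) -> Optional[dict]:
    """Return the first quota whose name contains the fragment (case-insensitive)."""
    fragment = name_fragment.lower()
    for quota in quotas:
        if fragment in quota["QuotaName"].lower():
            return quota
    return None


def increase_request(service_code: str, quota: dict, desired_value: float) -> dict:
    """Build keyword arguments for request_service_quota_increase."""
    if desired_value <= quota["Value"]:
        raise ValueError("desired value must exceed the current quota value")
    return {
        "ServiceCode": service_code,
        "QuotaCode": quota["QuotaCode"],
        "DesiredValue": float(desired_value),
    }


# Illustrative records only; these names and codes are NOT real quota identifiers.
sample_quotas = [
    {"QuotaName": "Example concurrent job runs", "QuotaCode": "L-EXAMPLE1", "Value": 50.0},
    {"QuotaName": "Example active DML queries", "QuotaCode": "L-EXAMPLE2", "Value": 25.0},
]

job_quota = find_quota(sample_quotas, "concurrent job runs")
request = increase_request("glue", job_quota, 100)

# With credentials configured, the request could be submitted like this:
# import boto3
# boto3.client("service-quotas").request_service_quota_increase(**request)
```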

Read the Reliability whitepaper 

This Guidance uses serverless managed services that automatically scale up and down in response to changing demand, reducing resource overhead.

Storing data in Amazon S3 lets consumers bring the tools or services that fit their needs to the data. For example, customers can query data directly in Amazon S3 using Athena, or they can use QuickSight for a business intelligence (BI) dashboard.
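To make the Athena option concrete, here is a minimal sketch that builds the arguments for the boto3 Athena client's `start_query_execution` call. The database, table, column names (`source_system`, `co2e_kg`, `reporting_year`), and results location are all assumptions for illustration; the actual schema depends on how the data lake is modeled.

```python
def emissions_by_source_query(table: str, year: int) -> str:
    """Build a SQL query that totals emissions per source system for one year."""
    if not table.isidentifier():
        # Guard against unexpected characters being spliced into the SQL text.
        raise ValueError(f"invalid table name: {table!r}")
    return (
        "SELECT source_system, SUM(co2e_kg) AS total_co2e_kg\n"
        f"FROM {table}\n"
        f"WHERE reporting_year = {int(year)}\n"
        "GROUP BY source_system\n"
        "ORDER BY total_co2e_kg DESC"
    )


def athena_query_params(sql: str, database: str, output_location: str) -> dict:
    """Return keyword arguments for the Athena start_query_execution API."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


# Hypothetical names for illustration only.
sql = emissions_by_source_query("emissions", 2023)
query_params = athena_query_params(
    sql,
    database="sustainability_lake",
    output_location="s3://example-athena-results/queries/",
)

# With credentials configured, the query could be started like this:
# import boto3
# boto3.client("athena").start_query_execution(**query_params)
```

The same table could back a QuickSight dataset, so the dashboard and ad hoc queries read from one copy of the data.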

Read the Performance Efficiency whitepaper 

This Guidance relies on serverless AWS services like AWS Glue, Step Functions, and Athena that are fully managed and automatically scale according to workload demand. As a result, customers only pay for what they use.

Read the Cost Optimization whitepaper 

Data in Amazon S3 can be stored in more efficient file formats (such as Parquet) to prevent unnecessary processing and reduce the overall storage required.

Amazon S3 lifecycle policies can automatically move less volatile data to more energy-efficient storage classes (such as Amazon S3 Glacier) that use magnetic storage rather than solid state memory. Deletion timelines can also be enforced to minimize overall storage requirements.
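A lifecycle policy of the kind described above might look like the following sketch. The prefix, rule ID, and day counts are illustrative assumptions rather than recommended values, and the dictionary is shaped for the boto3 S3 client's `put_bucket_lifecycle_configuration` call, shown only in a comment.

```python
def archive_lifecycle_config(prefix: str, archive_after_days: int,
                             delete_after_days: int) -> dict:
    """Lifecycle rule: move objects to S3 Glacier, then delete them later."""
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they expire")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",          # hypothetical rule name
                "Filter": {"Prefix": prefix},         # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": delete_after_days},
            }
        ]
    }


# Illustrative values: archive raw extracts after 90 days, delete after 7 years.
lifecycle = archive_lifecycle_config("raw/", archive_after_days=90,
                                     delete_after_days=2555)

# With credentials configured, the policy could be applied like this:
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-sustainability-data-lake",
#     LifecycleConfiguration=lifecycle,
# )
```

Retention periods for emissions data are often set by reporting or audit requirements, so confirm those before enforcing a deletion timeline.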

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
