This Guidance provides the essential data foundation for empowering customers to build data and analytics solutions. It shows how to integrate data from SAP ERP source systems with AWS in real-time or batch mode, with change data capture, using AWS services, SAP products, and AWS Partner Solutions. This Guidance includes an overview reference architecture showing how to ingest data from SAP systems to AWS, in addition to five detailed architecture patterns that complement SAP-supported mechanisms (such as OData, ODP, SLT, and BTP).
Please note: See the Disclaimer section at the end of this Guidance.
Architecture Diagram
Overview of Architecture Patterns
This architecture diagram shows the pattern options for ingesting SAP systems to AWS. For detailed architecture patterns, open the other tabs.
Step 1
SAP data hosted on SAP RISE, SAP HANA Cloud, AWS, or on-premises systems can be extracted in real-time or batch mode, and in full or incremental mode, from SAP NetWeaver systems such as SAP ERP Central Component (ECC), SAP S/4HANA, or SAP BW.
Data can also be extracted from the SAP HANA database using options such as:
A. AWS Managed Services
B. SAP and other AWS Partner Solutions with dedicated instances
C. AWS Partner Solutions embedded in SAP NetWeaver
Data Integration Options
Step A
AWS Glue, a serverless data integration service, and the Amazon AppFlow SAP OData connector offer application-level data extraction.
Step B1
AWS Partner Solutions such as BryteFlow SAP Data Lake Builder and Qlik Replicate offer instance-based solutions for comprehensive data ingestion scenarios.
Step B2
Using SAP native integration, SAP Datasphere or SAP Data Services sends data to Amazon Simple Storage Service (Amazon S3) or Amazon Redshift.
Step B3
The SAP SLT replication engine supports replicating data to Amazon Relational Database Service (Amazon RDS) using a database connection. AWS Partner Solutions such as Syntax CxLink support streaming data to Amazon S3 and Amazon Kinesis using an ABAP add-on for SAP SLT.
Step C
AWS Partner Solutions embedded in SAP NetWeaver, such as SNP Glue, offer point-to-point data replication from SAP NetWeaver-based source systems to the AWS Cloud.
Step 2
Data extracted from SAP can be landed in AWS services, such as Amazon S3, Amazon Redshift, Amazon Kinesis, or Amazon RDS, combined with non-SAP data, further processed, and analyzed using AWS analytics and generative AI services.
A. AWS Managed Services
This architecture diagram shows how to ingest SAP data to AWS using AWS Glue. For the other architecture patterns, open the other tabs.
Step 1
Use the following AWS Managed Services options to extract data from SAP:
A. AWS Glue SAP PyRFC Library for application-level extraction (requires custom design for change data capture)
B. Amazon AppFlow SAP OData connector (built-in SAP ODP change data capture)
Design change data capture for AWS Glue-based methods using appropriate fields that indicate changed records, such as a data change timestamp, as in the sketch below.
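The following is a minimal sketch of timestamp-based change data capture in an AWS Glue job. The Data Catalog table (sap_raw.mara), the change-timestamp column (LAST_CHANGED_AT), and the watermark handling are illustrative assumptions, not part of this Guidance; adapt them to your source system.

```python
# Minimal sketch: timestamp-based change data capture in an AWS Glue job.
# Table and column names are illustrative placeholders.
from awsglue.context import GlueContext
from pyspark.context import SparkContext
from pyspark.sql import functions as F

glue_context = GlueContext(SparkContext.getOrCreate())

# Watermark from the previous successful run; in practice, persist it in
# SSM Parameter Store or DynamoDB rather than hardcoding it.
last_watermark = "2024-01-01 00:00:00"

df = glue_context.create_dynamic_frame.from_catalog(
    database="sap_raw", table_name="mara"
).toDF()

# Keep only rows changed since the last extraction.
changed = df.filter(F.col("LAST_CHANGED_AT") > F.lit(last_watermark))
```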
Step 1a
Invoke SAP RFC to extract SAP data using the SAP PyRFC library and AWS Glue Python modules.
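As an illustration, here is a minimal sketch of an RFC-based table read with the PyRFC library. The connection parameters and the MARA table are assumptions, and the SAP NetWeaver RFC SDK must be available to the job for PyRFC to work.

```python
# Minimal sketch: read an SAP table over RFC with PyRFC.
# Connection values are placeholders for your SAP system.
from pyrfc import Connection

conn = Connection(
    ashost="sap-host.example.com",  # placeholder application server
    sysnr="00",
    client="100",
    user="EXTRACT_USER",
    passwd="********",
)

result = conn.call(
    "RFC_READ_TABLE",
    QUERY_TABLE="MARA",
    DELIMITER="|",
    FIELDS=[{"FIELDNAME": "MATNR"}, {"FIELDNAME": "MTART"}],
    ROWCOUNT=1000,
)

# Each returned row is a delimited string in the WA field.
rows = [line["WA"].split("|") for line in result["DATA"]]
```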
Step 1b
Configure the OData service in the source SAP system, configure the SAP connection using the SAP OData connector for Amazon AppFlow, create a flow, and schedule the flow or run it on demand to extract SAP data.
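A minimal sketch of starting a pre-configured flow on demand with boto3 follows; the flow name sap-odata-to-s3 is a placeholder for the flow you created with the SAP OData connector.

```python
# Minimal sketch: run an existing Amazon AppFlow flow on demand.
import boto3

appflow = boto3.client("appflow")

# Start the flow and capture the execution ID for tracking.
response = appflow.start_flow(flowName="sap-odata-to-s3")
print(response["flowArn"], response["executionId"])

# Inspect recent runs of the flow, including their status.
records = appflow.describe_flow_execution_records(flowName="sap-odata-to-s3")
for execution in records["flowExecutions"]:
    print(execution["executionId"], execution["executionStatus"])
```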
Step 4
AWS Glue performs data transformations such as join, union, aggregate, filter, renaming fields, dropping fields, adding timestamps, or custom transforms.
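The sketch below illustrates two of these transformations (field renaming and filtering) with AWS Glue's built-in transforms, continuing the GlueContext from the earlier sketch; the catalog table, field names, and the FERT filter value are assumptions.

```python
# Minimal sketch: rename fields and filter records with AWS Glue transforms.
from awsglue.transforms import ApplyMapping, Filter

materials = glue_context.create_dynamic_frame.from_catalog(
    database="sap_raw", table_name="mara"
)

# Rename SAP field names to friendlier target names.
mapped = ApplyMapping.apply(
    frame=materials,
    mappings=[
        ("MATNR", "string", "material_number", "string"),
        ("MTART", "string", "material_type", "string"),
    ],
)

# Keep only finished goods (illustrative filter).
finished_goods = Filter.apply(
    frame=mapped, f=lambda r: r["material_type"] == "FERT"
)
```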
Step 5
AWS Secrets Manager stores credentials. AWS Identity and Access Management (IAM) is used for access management and role configurations.
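For example, a job can retrieve SAP credentials from Secrets Manager at run time rather than hardcoding them; the secret name and JSON keys below are assumptions.

```python
# Minimal sketch: fetch SAP connection credentials from Secrets Manager.
import json

import boto3

secrets = boto3.client("secretsmanager")
secret = json.loads(
    secrets.get_secret_value(SecretId="sap/extract-user")["SecretString"]
)
# Assumed secret layout: {"ashost": ..., "user": ..., "passwd": ...}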
Step 6
Choose destination AWS services, such as Amazon S3, Amazon Redshift, or Amazon RDS, as the data target. Data extracted from SAP can be combined with non-SAP data, further processed, and analyzed using AWS analytics and generative AI services.
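Continuing the earlier sketches, writing the transformed data to Amazon S3 as partitioned Parquet might look like the following; the bucket and prefix are placeholders.

```python
# Minimal sketch: write the transformed DynamicFrame to Amazon S3 as
# partitioned Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=finished_goods,
    connection_type="s3",
    connection_options={
        "path": "s3://my-sap-data-lake/material/",  # placeholder bucket/prefix
        "partitionKeys": ["material_type"],
    },
    format="parquet",
)
```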
B1. AWS Partner Solution by BryteFlow
This architecture diagram shows how to ingest SAP data to AWS using the Partner Solution: BryteFlow SAP Data Lake Builder. For the other architecture patterns, open the other tabs.
Step 1a
For application-level data extraction, configure SAP OData services based on CDS views, BW extractors, BW InfoProviders, or HANA information views.
Step 1b
Database-level data extraction (which requires an SAP license that allows database access) uses a trigger-based mechanism (SAP HANA database) or a log-based mechanism (Oracle, SQL Server, DB2) to replicate data.
Step 2
The AWS Partner Solution, BryteFlow SAP Data Lake Builder, provides application- and database-level SAP data extraction with change data capture to the AWS Cloud.
BryteFlow SAP Data Lake Builder is available as a pre-configured Amazon Machine Image (AMI) on AWS Marketplace. Follow the instructions to configure the AMI on an Amazon Elastic Compute Cloud (Amazon EC2) instance.
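A minimal sketch of launching such a Marketplace AMI on an EC2 instance with boto3 follows. The AMI ID, instance type, and subnet are placeholders; take the actual AMI ID from your AWS Marketplace subscription and size the instance per the vendor's guidance.

```python
# Minimal sketch: launch a Marketplace AMI on an EC2 instance.
import boto3

ec2 = boto3.client("ec2")
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",       # placeholder AMI ID
    InstanceType="m5.xlarge",              # placeholder instance type
    SubnetId="subnet-0123456789abcdef0",   # placeholder subnet
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```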
Step 3
The captured initial and changed data is ingested by the BryteFlow SAP Data Lake Builder software running on an EC2 instance to AWS analytics services.
Append and update/insert (“upsert”) operations to Amazon S3, Amazon Redshift, and Amazon RDS are supported in this Guidance. Amazon S3 upsert operations require additional services, such as Amazon EMR and Amazon Elastic Block Store (Amazon EBS). Data cataloging and partitioning of the schema are configured.
Step 4
BryteFlow SAP Data Lake Builder uses IAM, AWS Key Management Service (AWS KMS), Amazon CloudWatch, and Amazon Simple Notification Service (Amazon SNS) for security, monitoring, and alerts.
B2: SAP Datasphere and Data Services
This architecture diagram shows how to ingest SAP data to AWS using SAP Datasphere or SAP Data Services. For the other architecture patterns, open the other tabs.
Step 1
Data from SAP ERP hosted on RISE, AWS, or on-premises can be extracted using:
A. SAP Datasphere
B. SAP Data Services
SAP Datasphere
Step 2a
SAP Datasphere offers various connection types, such as SAP ABAP Connections, SAP ECC Connections, and SAP S/4HANA Cloud Connections, which support RFC and ODP protocols. Refer to the SAP Datasphere documentation to choose the most appropriate connectivity to extract SAP data.
Step 2b
Using premium outbound integration for Amazon Simple Storage Service (Amazon S3) connections, configure the SAP Datasphere replication flow to ingest data to Amazon S3.
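As a simple validation sketch, you can confirm that the replication flow is landing objects in Amazon S3 with boto3; the bucket and prefix are placeholders for your Datasphere target.

```python
# Minimal sketch: list objects written by the replication flow.
import boto3

s3 = boto3.client("s3")
response = s3.list_objects_v2(
    Bucket="sap-datasphere-landing",   # placeholder bucket
    Prefix="replication/sales/",       # placeholder prefix
)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```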
SAP Data Services
Step 3a
Install SAP Data Services on an Amazon EC2 instance or on-premises.
Step 3b
SAP Data Services offers various connections to extract data from SAP ECC. Refer to the SAP Data Services documentation to choose the most appropriate connectivity.
Step 3c
SAP Data Services offers an Amazon Redshift datastore and an Amazon S3 datastore to ingest data to AWS.
Step 3d
SAP Data Services offers options for the Amazon S3 file location protocol, such as encryption type, compression type, batch size, number of threads, Amazon S3 storage class, and more.
B3: SAP SLT
This architecture diagram shows how to ingest SAP data to AWS using SAP SLT. For the other architecture patterns, open the other tabs.
Step 1
Configure an RFC destination in SAP SLT to the source SAP ERP system.
Step 2
Configure the SAP SLT database connection to the destination Amazon RDS server using a host name, username, and password. Configure an SAP SLT mass transfer ID to replicate tables (initial and incremental data) in real-time or at a scheduled frequency to Amazon RDS.
Step 3
Insert, update, and delete operations are supported to Amazon RDS, which can be used as a landing zone for subsequent data loads to Amazon S3 or Amazon Redshift, as in the sketch below.
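A minimal sketch of such a follow-on load with AWS Glue, reading from the Amazon RDS landing zone over JDBC and writing Parquet to Amazon S3, follows. The endpoint, database, table, credentials, and bucket are placeholders; in practice, reference a Glue connection or Secrets Manager for credentials.

```python
# Minimal sketch: AWS Glue load from an RDS landing zone to Amazon S3.
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

landing = glue_context.create_dynamic_frame.from_options(
    connection_type="postgresql",  # assumption: PostgreSQL RDS landing zone
    connection_options={
        "url": "jdbc:postgresql://my-rds-endpoint:5432/sapslt",  # placeholder
        "dbtable": "public.mara",                                # placeholder
        "user": "glue_reader",                                   # placeholder
        "password": "********",                                  # placeholder
    },
)

glue_context.write_dynamic_frame.from_options(
    frame=landing,
    connection_type="s3",
    connection_options={"path": "s3://my-sap-data-lake/slt/mara/"},
    format="parquet",
)
```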
Step 4
For data replication to Amazon S3 or Amazon Kinesis, install an AWS Partner Solution ABAP add-on, such as Syntax CxLink Data Lakes, on the SAP SLT server.
Step 5
Syntax CxLink Data Lakes replicates data in real-time or at scheduled frequencies to Amazon S3 or Amazon Kinesis. Incremental data is appended to existing data.
C: SAP NetWeaver Add-On Solution by SNP
This architecture diagram shows how to use SAP NetWeaver add-on solution SNP Glue to extract data from SAP to AWS. For the other architecture patterns, open the other tabs.
Step 1
Install and configure the SNP Glue ABAP add-on on the SAP ABAP-based source system (such as S/4HANA, ECC, CRM, or BW) running NetWeaver 7.1 SP14 or higher.
Step 2
The SNP Glue configuration workbench allows selection of tables, modification of source and destination structures, data filtering, and addition of transformation rules.
Step 3
The SNP Glue scheduler allows creating flexible schedules and throttling SAP resources by limiting the maximum number of background work processes.
Step 4
Initial and incremental data, in addition to deletions, are captured by SNP Glue and replicated to AWS services such as Amazon S3 and Amazon Redshift.
Get Started
Well-Architected Pillars
The AWS Well-Architected Framework helps you understand the pros and cons of the decisions you make when building systems in the cloud. The six pillars of the Framework allow you to learn architectural best practices for designing and operating reliable, secure, efficient, cost-effective, and sustainable systems. Using the AWS Well-Architected Tool, available at no charge in the AWS Management Console, you can review your workloads against these best practices by answering a set of questions for each pillar.
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
AWS CloudFormation automates the deployment process, while CloudWatch provides observability, tracking, and tracing capabilities. The entire solution can be deployed using CloudFormation, which helps automate deployments across development, quality assurance, and production accounts. This automation can be integrated into your development pipeline, enabling iterative development and consistent deployments across your SAP landscape.
Security
IAM secures AWS Glue and Amazon AppFlow through permission controls and authentication. These managed services access only specified data. Amazon AppFlow facilitates access to SAP workloads. Data is encrypted in transit and at rest. AWS CloudTrail logs API calls for auditing. Data can be stored in S3 buckets with cross-Region replication. For enhanced security, run Amazon AppFlow over AWS PrivateLink with Elastic Load Balancing and SSL termination using AWS Certificate Manager.
Reliability
Amazon AppFlow and AWS Glue can reliably move large volumes of data without breaking it down into batches. Amazon S3 provides industry-leading scalability, data availability, security, and performance for SAP data export and import. PrivateLink is a Regional service; as part of the Amazon AppFlow setup using PrivateLink, you set up at least 50 percent of the Availability Zones in the Region (a minimum of two Availability Zones per Region), providing an additional level of redundancy for Elastic Load Balancing.
Performance Efficiency
The SAP operational data provisioning framework captures changed data. Parallelization features in Amazon AppFlow and AWS Partner Solutions like BryteFlow and SNP enable customers to choose the number of parallel processes to run in the background, parallelizing large data volumes. Amazon S3 offers improved throughput with multi-part uploads through supported data integration mechanisms. The parallelization capabilities and seamless integration with Amazon S3 allow for efficient and scalable data ingestion from SAP systems into AWS.
Cost Optimization
By using serverless technologies like Amazon AppFlow or AWS Glue and Amazon EC2 Auto Scaling, you only pay for the resources you consume. To optimize costs further, extract only the required business data groups by leveraging semantic data models (for example, BW extractors or CDS views). Minimize the number of flows based on your reporting granularity needs. Implement housekeeping by setting up data tiering or deletion in Amazon S3 for old or unwanted data, as in the sketch below.
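A minimal sketch of such housekeeping with an S3 lifecycle configuration follows: tier aged SAP extracts to Amazon S3 Glacier after 90 days and delete them after one year. The bucket name, prefix, and cutoffs are illustrative assumptions.

```python
# Minimal sketch: lifecycle rule to tier, then expire, aged SAP extracts.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-sap-data-lake",  # placeholder bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "tier-then-expire-sap-extracts",
                "Status": "Enabled",
                "Filter": {"Prefix": "raw/sap/"},  # placeholder prefix
                "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```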
Sustainability
Data extraction workloads can be scheduled or invoked in real-time, eliminating the need for underlying infrastructure to run continuously. Using serverless and auto-scaling services is a sustainable approach for data extraction workloads, as these components activate only when needed. By leveraging managed services and dynamic scaling, you minimize the environmental impact of backend services. Adopt new options for Amazon AppFlow as they become available to optimize the volume and frequency of extraction.
Related Content
Replicate SAP to AWS in Real-Time with Business Logic Intact Using BryteFlow
Disclaimer
The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.
References to third-party services or organizations in this Guidance do not imply an endorsement, sponsorship, or affiliation between Amazon or AWS and the third party. Guidance from AWS is a technical starting point, and you can customize your integration with third-party services when you deploy the architecture.