AWS Partner Network (APN) Blog

Accelerate SAP Data Replication with AWS and Fivetran

By Tamara Astakhova, Sr. Partner Solutions Architect – AWS
By BP Yau, Sr. Partner Solutions Architect – AWS
By Amit Kapadia, Lead Product Marketing Manager – Fivetran
By Edwin Commandeur, Principal Field Product Manager – Fivetran

Connect with Fivetran

With an increasing focus on IoT, AI/ML, and generative AI to drive business transformation, SAP data provides a unified view of business-critical information that helps streamline operations, improve decision-making, and drive innovation. The key challenge for enterprises is determining effective strategies to manage and use this data in order to unlock its potential competitive advantage. Instead of spending valuable resources on building and maintaining data pipelines, organizations can focus on uncovering insights from data to move the business forward and achieve strategic goals faster.

Fivetran is a fully managed data platform with 700+ pre-built data sources, including SAP. The fully automated, scalable platform, whether deployed on-premises, in the cloud, or in a hybrid environment, helps customers securely, reliably, and efficiently replicate data into any cloud destination.

Fivetran is an AWS Data and Analytics Competency Partner and AWS Marketplace Seller with service specializations in Amazon Redshift, Amazon Relational Database Service (Amazon RDS), and AWS PrivateLink.

In this post, we explore how to optimize data management with a modern data architecture on AWS, and discuss how Fivetran’s automated SAP data replication accelerates enterprise processes and unlocks insights.

Fivetran data movement for SAP

Fivetran is investing in and aligning with the AWS zero-ETL future to simplify data architecture and reduce data engineering effort. Fivetran’s fully managed SAP ERP on HANA connector replicates SAP ECC and S/4HANA data from the HANA platform into AWS securely, reliably, and efficiently using low-impact change data capture (CDC). This provides real-time data access for analytics, AI/ML, and reporting. Fivetran helps your organization simplify its SAP data movement by offering:

  • Support for runtime and enterprise licenses – Fivetran can connect to SAP’s application layer on Oracle and HANA rather than directly accessing the database.
  • Direct and native integration – Direct native integration from SAP ERP to your destination, whether it’s a data warehouse, a data lake, or another destination.
  • SAP ERP managed cloud sources – Fast implementation and easy access to data, with SAP RISE and SAP HEC supported.
  • Support for special SAP data types and tables – Handling cluster and pool tables, supporting long text, transforming SAP dates, loading CDS views, and supporting SAP archiving.
  • Sourcing from Core Data Services (CDS) views.
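
To make the SAP date handling mentioned in the list more concrete, here is a minimal Python sketch of one common conversion. SAP stores DATS fields as 8-character strings in YYYYMMDD form and uses 00000000 as the initial (empty) value. The function name parse_sap_date and the sample row are illustrative assumptions, not Fivetran’s implementation or API.

```python
from datetime import date, datetime
from typing import Optional

# SAP's "initial" (empty) value for DATS fields.
SAP_INITIAL_DATE = "00000000"


def parse_sap_date(raw: Optional[str]) -> Optional[date]:
    """Convert an SAP DATS string (YYYYMMDD) to a date, or None if it is empty."""
    if raw is None:
        return None
    value = raw.strip()
    if value == "" or value == SAP_INITIAL_DATE:
        return None
    # Raises ValueError for malformed input, so bad data is not silently loaded.
    return datetime.strptime(value, "%Y%m%d").date()


# Hypothetical source row; the field names are used for illustration only.
row = {"VBELN": "0000012345", "ERDAT": "20240315", "AEDAT": "00000000"}

converted = {
    "VBELN": row["VBELN"],
    "ERDAT": parse_sap_date(row["ERDAT"]),  # creation date
    "AEDAT": parse_sap_date(row["AEDAT"]),  # never changed, so initial value
}
print(converted)
```

Mapping the initial value to None rather than to a placeholder date keeps "no date" distinguishable from a real date once the data lands in the destination.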

Organizations are moving their data, including SAP, into a data lake in AWS where they can have a single source of truth to apply ML and analytics. They build data lakes in AWS on Amazon Simple Storage Service (Amazon S3). At the same time, organizations are leveraging purpose-built databases and services that are optimized for specific use cases. Migrating workloads to the modern data architecture on AWS enables organizations to securely move data between data lakes and purpose-built databases and services. This allows them to gain business insights faster.

Fivetran in action

Figure 1 below shows a reference architecture leveraging Fivetran for SAP data movement following the modern data architecture on AWS. Central to the design is AWS Lake Formation, which accelerates setting up data lakes. Amazon S3 acts as both the starting point and the landing zone for all data in the data lake. With this foundation, SAP data can live in the data lake and be used with AWS purpose-built databases and services such as the following:

Amazon Aurora: Amazon Aurora is a modern relational database service offering performance and high availability at scale, fully open-source MySQL- and PostgreSQL-compatible editions, and a range of developer tools for building serverless and machine learning (ML)-driven applications.

Amazon EMR: Amazon EMR is the industry-leading cloud big data platform for data processing, interactive analysis, and machine learning using open-source frameworks such as Apache Spark, Apache Hive, and Presto. With EMR you can run petabyte-scale analysis at less than half of the cost of traditional on-premises solutions and over 1.7x faster than standard Apache Spark.

Amazon Redshift: Tens of thousands of customers use Amazon Redshift every day to run SQL analytics in the cloud, processing exabytes of data for business insights. Whether data is stored in operational data stores, data lakes, streaming data services or third-party datasets, Amazon Redshift helps securely access, combine, and share data with minimal movement or copying. Amazon Redshift is deeply integrated with AWS database, analytics, and machine learning services to employ Zero-ETL approaches or help you access data in place for near real-time analytics, build machine learning models in SQL, and enable Apache Spark analytics using data in Amazon Redshift.

Amazon SageMaker: Amazon SageMaker is a fully managed service to prepare data and build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows.

This allows for unified analytics across operational, data warehouse, and data lake environments, streamlining data access and analysis. The architecture empowers users of all levels to leverage the best-fit analytics services for their needs.

 Figure 1 – Fivetran Data Movement for SAP workloads architecture

The data flow below describes how data is moved and processed:

  1. Fivetran supports connecting numerous SAP product offerings across a multitude of deployment topologies (on-premises, cloud, SaaS) and license types (runtime, enterprise).
  2. Fivetran’s data movement platform flows data in batch, micro-batch, or continuous real-time mode (you determine the intervals) into the AWS Cloud. The platform provides dataset optimizations and data preparation, including table creation, table updates, schema drift management, compaction for better performance, deduplication for data quality, normalization for understandable and organized data, anonymization (hashing or blocking data before it enters the data pipeline, with data fully encrypted from start to finish), and automated detection of personally identifiable information (PII). Data is replicated efficiently into Amazon Redshift; for Amazon S3, it is converted to the Parquet file format and persisted in the Iceberg table format.
  3. Fivetran’s normalized dataset (controlled, private, secure, and governed) is delivered into Amazon Redshift and Amazon S3, as well as AWS-hosted services such as Amazon RDS, Snowflake, Databricks, and Apache Kafka, in a standardized table format that provides data correctness and consistency and optimizes SQL queries and analytics. All datasets can then be immediately queried, aggregated, or transformed.
  4. Metadata is populated into AWS Glue for table definitions, comprehensive granular access control and GDPR/CCPA compliance when using Amazon S3.
  5. Data in governed tables allows for native consumption by a variety of AWS Analytics & Insights services and is ready for immediate querying, aggregation, or transformation.
  6. Quickly deliver a wide range of industry-specific use cases with a modern approach to semi-structured file support, optimized and governed storage, and automated data preparation.
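
As a rough illustration of two of the preparation steps named in step 2, deduplication and hash-or-block anonymization, the following Python sketch keeps only the latest version of each record and then hashes or drops sensitive columns. It is a simplified sketch under stated assumptions, not Fivetran’s implementation: the function names, the _seq ordering column, the fixed demo salt, and the sample customer fields are all hypothetical.

```python
import hashlib
from typing import Any, Dict, Iterable, List, Sequence, Set, Tuple

Row = Dict[str, Any]


def hash_value(value: str, salt: str = "demo-salt") -> str:
    """Return a salted SHA-256 hex digest (the salt here is a placeholder)."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def prepare(
    rows: Iterable[Row],
    key_fields: Sequence[str],
    version_field: str,
    hashed_fields: Set[str],
    blocked_fields: Set[str],
) -> List[Row]:
    """Deduplicate rows by key, then hash or block sensitive columns."""
    # Deduplication: keep the row with the highest version per primary key.
    latest: Dict[Tuple[Any, ...], Row] = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        current = latest.get(key)
        if current is None or row[version_field] >= current[version_field]:
            latest[key] = row

    # Anonymization: drop blocked columns and replace hashed columns
    # with a digest so the raw value never reaches the destination.
    result: List[Row] = []
    for row in latest.values():
        cleaned: Row = {}
        for name, value in row.items():
            if name in blocked_fields:
                continue
            if name in hashed_fields and value is not None:
                cleaned[name] = hash_value(str(value))
            else:
                cleaned[name] = value
        result.append(cleaned)
    return result


# Hypothetical change records for a customer table.
rows = [
    {"KUNNR": "1001", "NAME1": "Acme", "EMAIL": "a@example.com", "TAX_ID": "X1", "_seq": 1},
    {"KUNNR": "1002", "NAME1": "Globex", "EMAIL": "b@example.com", "TAX_ID": "X2", "_seq": 2},
    {"KUNNR": "1001", "NAME1": "Acme Corp", "EMAIL": "a@example.com", "TAX_ID": "X1", "_seq": 3},
]

prepared = prepare(rows, ["KUNNR"], "_seq", {"EMAIL"}, {"TAX_ID"})
for record in prepared:
    print(record)
```

Hashing keeps a column usable for joins and counts because equal inputs produce equal digests, while blocking removes the column entirely. A real pipeline would also need to manage the salt as a secret rather than hard-coding it.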

From Legacy to Leading Edge: Pitney Bowes’ SAP Integration Success Story

Pitney Bowes is a global shipping and mailing company that provides technology, logistics, and financial services to over 90 percent of the Fortune 500. They were struggling with legacy ELT inefficiencies, coupled with a growing demand to integrate SAP data for finance and sales analytics.

Fivetran replaced Pitney Bowes’ custom batch scripts and extract, transform, load (ETL) processes. The Enterprise Information Management team used Fivetran’s out-of-the-box connectors to quickly set up pipelines for several business-critical apps like SAP, Salesforce, Facebook, Apache Kafka, and Amazon Kinesis.

Centralizing data with Fivetran’s high-performance pipelines reduced batch load times by 94 percent, from 31 hours to under 2 hours. This enhanced data infrastructure provided near real-time visibility and analytics for 16 global distribution centers and 800M+ packages per day. The data-driven approach helped the company improve its delivery estimate accuracy to 93 percent, effectively addressing previous challenges with missed delivery service-level agreements (SLAs) and creating a new revenue-generating guaranteed delivery program.

Traditional ERP data management has been a bottleneck for business innovation, consuming valuable resources while delivering limited insights. Fivetran’s purpose-built SAP connector eliminates this friction through automated, low-impact data replication that reduces data movement time by up to 94% while ensuring enterprise-grade security and reliability. When combined with AWS infrastructure, organizations transform raw SAP data into actionable insights that power real-time decision making, predictive analytics, and AI/ML initiatives.

Learn more about the Pitney Bowes case study here.

Conclusion

Combining AWS infrastructure with Fivetran’s fully managed SAP ERP on HANA connector enables enterprises to automatically replicate SAP data into a modern data architecture on AWS. This accelerates the path to actionable insights from your SAP data to drive business growth. More detailed information on how to use Fivetran’s SAP ERP on HANA connector can be found in this documentation.

Sign up for a free 14-day trial of Fivetran, learn more about Fivetran in AWS Marketplace and try Fivetran SAP Data Model to load and transform SAP data into Amazon Redshift.


Fivetran – AWS Partner Spotlight

Fivetran is an AWS Advanced Technology Partner and fully managed data movement platform that automatically ingests and centralizes data from hundreds of sources into ready-to-analyze schemas.

Contact Fivetran | Partner Overview | AWS Marketplace