AWS Partner Network (APN) Blog

Next-Generation Data Integration with AWS Data Services and Dataddo

By Petr Nemeth, CEO – Dataddo
By Stanley Chukwuemeke, Senior Partner Solutions Architect – AWS

Connect with Dataddo

Collecting, preparing, storing, and using data from a growing number of disparate systems is now a challenge for organizations in every industry. The application programming interfaces (APIs) and user interfaces (UIs) of cloud services change frequently, so engineers must continuously adapt data pipelines, and they may not always be able to react quickly.

Reliance on engineers to manage these changes can lead to broken dashboards, gaps in datasets, and decisions based on out-of-date information. Businesses increasingly rely on data and AI, driving a need for early security and compliance solutions within the data lifecycle, starting with collection.

This post shows how organizations use Dataddo to efficiently and securely move data from one end of their data infrastructure to the other, including to and from AWS storage services such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Amazon Aurora, and Amazon Relational Database Service (Amazon RDS).

Solution Overview

Dataddo provides hundreds of data connectors, enabling robust extract, transform, load (ETL), reverse ETL, and database replication capabilities that directly address the problem of integrating data from diverse systems. The no-code user interface is designed with business users in mind, but the platform also lets developers configure more complex workloads via code. Dataddo has optimized connectors for Amazon Redshift, Amazon S3, Amazon Aurora, and Amazon RDS. It can sync data from any service or database to these storage services, as well as from these storage services to any service or database. The platform eliminates the need for pipeline maintenance by AWS customers, as Dataddo’s engineers proactively monitor and maintain all pipelines, and manage API and interface changes.

Dataddo also has a suite of built-in features that address data quality, compliance, and privacy challenges at the pipeline level, reducing the complexity and cost of working with data in AWS storage services and other downstream systems. These features include various transformation techniques, a rule-based Data Quality Firewall, automatic detection of personally identifiable information (PII) with options for hashing, and detailed monitoring and logging.
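To make the idea of detecting and hashing PII at the pipeline level more concrete, here is a minimal Python sketch. It is not Dataddo's implementation: the email-only pattern, the function names, and the salt value are all assumptions chosen for illustration, and real PII detection covers far more than email addresses.

```python
import hashlib
import re

# Hypothetical, deliberately simple pattern: this sketch only recognizes email-shaped strings.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value):
    """Return True when a value is a string shaped like an email address."""
    return isinstance(value, str) and EMAIL_PATTERN.match(value) is not None


def hash_value(value, salt):
    """Replace a value with a salted SHA-256 hex digest."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_pii(rows, salt="example-salt"):
    """Hash every field that looks like an email; leave other fields untouched."""
    masked = []
    for row in rows:
        masked.append({
            key: hash_value(val, salt) if looks_like_email(val) else val
            for key, val in row.items()
        })
    return masked


rows = [
    {"id": 1, "email": "ada@example.com", "plan": "pro"},
    {"id": 2, "email": "grace@example.org", "plan": "free"},
]
masked_rows = mask_pii(rows)
```

Because the same input and salt always produce the same digest, hashed values can still be used to join or deduplicate records downstream without exposing the original value.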

In addition to ETL, reverse ETL, and database replication, Dataddo supports direct integrations of applications with business intelligence (BI) tools, enabling business users to visualize important data without the intervention of engineers. The fully managed Dataddo platform, accessible to business users, bypasses engineering hurdles and quickens the delivery of data products.

Dataddo + AWS: Architecture

Dataddo is a key enabler of effective, end-to-end data integration for any organization using AWS storage services. In this section, we describe how Dataddo executes ETL, reverse ETL, and database replication workloads.

Extract, Transform, Load (ETL) / Extract, Load, Transform (ELT)

Dataddo can sync data from hundreds of sources to AWS storage services, including custom sources using its universal JSON connector. It supports API, file, database, and event connections. It can also sync batch, event, and database log files as shown in Figure 1.

Figure 1: Dataddo ETL/ELT

Dataddo offers robust pre-processing capabilities, including sensitive data detection and hashing, flattening, type harmonization, unions, and joins. These capabilities ensure that extracted data is immediately usable, while the Data Quality Firewall (configurable per column) blocks the flow of anomalous data to AWS. To further safeguard consistency, Dataddo supports auto-schema creation, as well as multiple write modes (INSERT, UPSERT, DELETE, REPLACE).
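As a rough illustration of two of these steps, the Python sketch below flattens nested records and then applies per-column rules before anything is loaded. It is a simplified stand-in, not Dataddo's Data Quality Firewall: the `RULES` mapping, the column names, and the helper functions are hypothetical.

```python
def flatten(record, parent_key="", sep="_"):
    """Flatten nested dicts into a single level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical per-column rules: each maps a flattened column name to a predicate.
RULES = {
    "order_id": lambda v: v is not None,
    "payment_amount": lambda v: isinstance(v, (int, float)) and v >= 0,
}


def passes_rules(row, rules):
    """A row passes only if every rule-covered column satisfies its predicate."""
    return all(check(row.get(column)) for column, check in rules.items())


raw = [
    {"order_id": 1, "payment": {"amount": 19.9, "currency": "EUR"}},
    {"order_id": 2, "payment": {"amount": -5, "currency": "EUR"}},
    {"order_id": None, "payment": {"amount": 12, "currency": "USD"}},
]

flat_rows = [flatten(r) for r in raw]
accepted = [r for r in flat_rows if passes_rules(r, RULES)]
rejected = [r for r in flat_rows if not passes_rules(r, RULES)]
```

In this sketch, only the first record reaches `accepted`; the record with a negative amount and the record with a missing `order_id` are held back instead of being written to the destination.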

Reverse ETL

Figure 2 shows how Dataddo enables users to sync data from AWS storage services to operational applications via reverse ETL. This provides business teams access to complex, custom-computed insights directly in the systems they use most.

Figure 2: Dataddo Reverse ETL

The data fields in sources can be easily mapped to destination equivalents through Dataddo’s no-code interface, while the built-in SQL console allows more technical AWS users to directly interact with their data using SQL queries. Dataddo ensures data consistency in destination applications through automatic data type harmonization and the configurable Data Quality Firewall. Dataddo supports multiple write modes, such as INSERT and UPSERT, for flexible synchronization of data to over 20 destination applications, including Customer Relationship Management (CRM), Enterprise Resource Planning (ERP) systems, and marketing automation tools.
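The difference between INSERT and UPSERT is easiest to see in a small example. The Python sketch below uses an in-memory SQLite table as a stand-in for a destination; the `contacts` table, its columns, and the `upsert_contacts` helper are assumptions for illustration only, and the `ON CONFLICT` clause requires SQLite 3.24 or later. It is not how Dataddo writes to destination applications.

```python
import sqlite3

# In-memory database standing in for a destination keyed by email (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (email TEXT PRIMARY KEY, lifetime_value REAL)")


def upsert_contacts(connection, rows):
    """Insert new rows; update lifetime_value when the email already exists."""
    connection.executemany(
        """
        INSERT INTO contacts (email, lifetime_value)
        VALUES (?, ?)
        ON CONFLICT(email) DO UPDATE SET lifetime_value = excluded.lifetime_value
        """,
        rows,
    )
    connection.commit()


# First sync creates two contacts.
upsert_contacts(conn, [("ada@example.com", 120.0), ("grace@example.org", 80.0)])
# Second sync updates one existing contact and adds a new one.
upsert_contacts(conn, [("ada@example.com", 150.0), ("linus@example.net", 40.0)])

result = conn.execute(
    "SELECT email, lifetime_value FROM contacts ORDER BY email"
).fetchall()
```

A plain INSERT would fail on the second sync because `ada@example.com` already exists; with UPSERT, the existing row is updated and the table ends up with three contacts.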

Note: One notable use of reverse ETL is syncing first-party data from AWS storage services to online ad platforms for more precise targeting. This is becoming an increasingly important supplement to third-party data in online advertising, because it gives the platforms information about real conversions.

Database Replication

Dataddo can sync data between AWS storage services and any other databases, regardless of their underlying technology, through batch replication and various change data capture (CDC) methods as shown in Figure 3.

Figure 3: Dataddo Replication using CDC

All major on-premises databases and databases-as-a-service are supported as sources and destinations, including Amazon Redshift, Amazon S3, Amazon Aurora, and Amazon RDS. Connectors can extract batch, event, and database log files. Dataddo automatically converts data types during extraction, and automatically creates schemas during writing. Multiple write modes are supported, such as INSERT, UPSERT, DELETE, and REPLACE.
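Conceptually, change data capture means replaying a stream of row-level changes against a copy of the table. The Python sketch below shows that idea with a dictionary as the replica. The event shape (`lsn`, `op`, `key`, `row`) is a hypothetical format invented for this example, not Dataddo's or any specific database's log format.

```python
def apply_change(replica, event):
    """Apply one change event to a replica keyed by primary key."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        # Merge the changed columns into the existing row, if there is one.
        replica[key] = {**replica.get(key, {}), **event["row"]}
    elif op == "delete":
        replica.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(replica, events):
    """Apply events in log order; `lsn` is a hypothetical sequence number."""
    for event in sorted(events, key=lambda e: e["lsn"]):
        apply_change(replica, event)
    return replica


events = [
    {"lsn": 1, "op": "insert", "key": 1, "row": {"name": "Ada", "city": "London"}},
    {"lsn": 2, "op": "insert", "key": 2, "row": {"name": "Grace", "city": "New York"}},
    {"lsn": 3, "op": "update", "key": 1, "row": {"city": "Paris"}},
    {"lsn": 4, "op": "delete", "key": 2},
]

replica = replay({}, events)
```

After the replay, the replica holds a single row for key 1 with the updated city, which matches the state of the source after the same four changes. Batch replication, by contrast, would copy the full table contents on each run.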

AWS Customer Stories

Dataddo has successfully integrated various third-party apps with AWS services for multiple customers, demonstrating valuable outcomes through three specific case studies.

Boldr

Boldr is a global outsourcing and offshore consulting company that needed an automated data integration solution to power internal reporting processes. They were spending nearly 14 hours a week monitoring 177 in-house data pipelines and resolving errors. The bulk of these were pipelines from their clients’ CRM tools to Google Sheets, as well as between Google Sheets and their database, Amazon RDS for PostgreSQL. They selected Dataddo for its straightforward, user-friendly interface and well-configured connectors (such as its universal JSON connector for syncing custom datasets). Additional considerations were its detailed notifications and its proactive pipeline monitoring and maintenance.

Boldr was able to deploy Dataddo quickly, achieving the following key outcomes:

  • Virtually eliminated the need for pipeline maintenance
  • 5+ man-days saved monthly due to low error rate
  • Consolidated all company data into a central hub: Amazon RDS (PostgreSQL)
  • Improved data accuracy for reliable reporting

“We now spend just a couple hours a week maintaining 177 pipelines, where we used to spend nearly 14.” – Natheer Maloon, Technology Solutions Manager, Boldr

ID&T

ID&T Group is an electronic music entertainment company known for organizing major electronic music festivals, like Defqon.1 and Mysteryland. Previously, all of the group’s brands and agencies were managing data slightly differently, and reporting back to ID&T manually. To solve this problem and get an accurate, unified overview of all their data, they decided to build a SaaS-based data infrastructure on AWS.

As part of their new data infrastructure, they needed a tool that could pull data from the social media and advertising platforms of ID&T Group and all its partners, and send it to Amazon Redshift, Amazon RDS, and Snowflake. They selected Dataddo for the reliability of its connectors, its maintenance-free pipelines, and its ability to harmonize data from disparate sources.

By implementing Dataddo, ID&T achieved the following outcomes:

  • Elimination of errors associated with manual data collection
  • 2-3 man-days per week saved
  • Enhanced visibility into key performance metrics, like return on ad spend (ROAS), cost per click (CPC), and clickthrough rate (CTR)
  • Deeper understanding of online audience and increased revenue

“Dataddo opens up gates and takes away the hurdles of working with data.” – Michael Guntenaar, CTO, ID&T Group

WWL

World Wide Lighting is a global ecommerce company that specializes in lighting solutions. They have 24 eshops and 4 subsidiaries that serve both European and Asian markets. They were in the process of modernizing their data infrastructure and required a tool that could pull data from their various eshops, applications, and proprietary software, and transfer it to Amazon Redshift. They chose Dataddo for its wide variety of off-the-shelf connectors and its willingness to build new connectors for WWL’s proprietary software. By implementing Dataddo, WWL was able to quickly centralize all data from siloed sources in Amazon Redshift, achieving the following outcomes:

  • A comprehensive view of the business
  • Significantly reduced downtimes, due to managed connectors and proactive monitoring
  • Provision of reliable insights to over 50 decision-makers

“Whenever we see something that we want to know more about, it’s easy for all the users to deep dive and take one step deeper in the data…to see a bit more of why things are happening.” – Jimmy van den Eerenbeemt, Insights Manager, WWL

Conclusion

Dataddo offers a streamlined data integration solution for AWS, featuring pre-built connectors and managed pipelines. Its user-friendly interface accommodates both technical and non-technical users, while built-in quality and compliance features reduce engineering overhead. This enables teams to focus on extracting value from data rather than managing complex integrations.

Visit Dataddo’s AWS Marketplace page and sign up for a free trial to start moving data to and from AWS today.


Dataddo – AWS Partner Spotlight

Dataddo is an AWS Differentiated Technology Partner and AWS Competency Partner that seamlessly integrates data from various business apps (Salesforce, SAP, NetSuite, advertising platforms), databases (cloud and on-premises), files, and custom APIs into AWS (Amazon Redshift, Amazon S3, Amazon Aurora, Amazon RDS). Beyond ETL/ELT, we offer cross-technology database replication, reverse ETL, and proactive data pipeline monitoring, enhancing your data strategy within the AWS ecosystem.

Contact Dataddo | Partner Overview | AWS Marketplace