AWS Startups Blog

How Datacoral Uses AWS to Automate Data Pipelines and Create Fast, Easy Insights From Any Source

Data is everywhere. That should be great news for businesses looking to pick it apart for valuable insights, but Datacoral data scientist Rishabh Bhargava knows that "everywhere" can be overwhelming.

“Organizations today want to use data to make better decisions for their businesses. However, their data lives in many different places, such as databases, SaaS applications, file systems, etc. And this data is changing rapidly,” he says in Datacoral’s AWS Startup Architecture video submission.

That’s what inspired him to join Datacoral, a San Francisco-based startup that specializes in helping data-driven companies make sense of their data. Rather than sorting through data manually or risking decisions based on outdated information, Datacoral’s data engineering platform lets users quickly, securely, and reliably extract data from many different sources into the data warehouse of their choice, even as that data changes.

From there, they can easily conduct analytics on their complete dataset and trust they are working from the cleanest possible data. The end result is a more comprehensive and up-to-date understanding of how to enhance their products, drive sales, or streamline customer acquisition.

One company that recently used the platform is DrChrono, a healthcare technology company. DrChrono wanted to understand COVID-19’s impact on its churn rate but was having trouble combining and comparing data from multiple sources, including product usage data, sales data from Salesforce, and customer support information from Zendesk.

So, they used Datacoral’s MySQL change data capture connector, along with the Salesforce and Zendesk connectors, to replicate data in near real time into Amazon Redshift. From there, they could combine data from different sources, test multiple hypotheses about their churn rates, and ultimately produce more accurate sales and marketing projections for 2021.
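At its core, change data capture works by replaying an ordered stream of insert, update, and delete events from the source database and folding them into the latest row state before upserting into the warehouse. The sketch below illustrates that idea in Python; the event shapes and field names are illustrative assumptions, not Datacoral's actual connector API.

```python
# Minimal sketch of change-data-capture (CDC) replication: replay an ordered
# stream of change events from a source database into the latest row state,
# the way a connector would before upserting a micro-batch into a warehouse.
# Event and field names here are hypothetical, for illustration only.

def apply_cdc_events(events):
    """Replay ordered CDC events and return the resulting table state."""
    table = {}  # primary key -> latest version of the row
    for event in events:
        op, key = event["op"], event["pk"]
        if op in ("insert", "update"):
            table[key] = event["row"]   # upsert: last write wins
        elif op == "delete":
            table.pop(key, None)        # tombstone removes the row
    return table

events = [
    {"op": "insert", "pk": 1, "row": {"id": 1, "plan": "basic"}},
    {"op": "update", "pk": 1, "row": {"id": 1, "plan": "pro"}},
    {"op": "insert", "pk": 2, "row": {"id": 2, "plan": "basic"}},
    {"op": "delete", "pk": 2},
]
print(apply_cdc_events(events))  # {1: {'id': 1, 'plan': 'pro'}}
```

Because only the changes are shipped, the warehouse copy stays close to the source without re-exporting full tables, which is what makes near-real-time replication practical.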

Other businesses are choosing Datacoral for similar reasons. Automating data replication with change data capture is complicated to build in-house and risky from a security perspective. Datacoral has designed an architecture that is installed in the customer's own VPC, guaranteeing that sensitive data never leaves that environment. Plus, thanks to customizable data quality checks, the freshness and quality of the data are maintained throughout the process.

Overall, the company’s customers have used connectors to replicate more than 400 billion records per month, while gaining full visibility and observability of all the data and metadata they replicate.

That’s not an easy task for Datacoral to achieve, especially when providing a platform that supports seamless integration and the rapid scaling needed for shifting data volumes. But AWS makes it possible. Datacoral uses the platform’s serverless services, such as AWS Lambda, to deliver large volumes of data in manageable micro-batches.
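The micro-batch pattern itself is simple: split a large record stream into bounded chunks so that each serverless invocation handles a fixed unit of work. A minimal sketch in Python (the batch size and record shape are assumptions for illustration, not Datacoral's implementation):

```python
# Illustrative micro-batching: split a large record stream into fixed-size
# batches so each serverless invocation (e.g., a Lambda function) processes
# a bounded unit of work. Batch size here is an arbitrary example value.

def micro_batches(records, batch_size=500):
    """Yield successive batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

records = list(range(1200))
sizes = [len(batch) for batch in micro_batches(records, batch_size=500)]
print(sizes)  # [500, 500, 200]
```

Keeping each batch small is what makes short-lived, pay-per-invocation compute like Lambda a good fit: no long-running servers are needed, and throughput scales by running more invocations in parallel.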

“We are able to offer pay-as-you-go pricing for a serverless architecture deployed in our customer’s VPC,” says Bhargava. “This ensures complete security and scalability.”

That way, data doesn’t have to remain scattered across platforms. Thanks to Datacoral and AWS, it’s available to consumers in one easy-to-access environment, ready to be turned into actionable insight.