2025

Unlocking Faster, Cost-Effective Data Analytics Using Amazon Redshift with PayU

Learn how fintech company PayU created a unified data solution using Amazon Redshift, shortening query times to under 1 minute and reducing costs.

Key Results

Under 1 minute

to conduct queries that previously took 10–15 minutes

200 TB

of data scanned per day

Less than 1% failure rate

down from 7%–10%

$20,000

reduction in monthly costs

Overview

In a data-driven world, companies need accurate, timely data for reporting, analysis, and decision-making. Global payment solutions provider PayU improved access to such data for merchants and internal users by creating a centralized data environment on Amazon Web Services (AWS). Using Amazon Redshift—which uses SQL to analyze structured and semistructured data across data warehouses, operational databases, and data lakes—PayU improved the performance of its data analytics, streamlined its data environment, and reduced costs.

Now, using Amazon Redshift, PayU has a single source of truth, and its data consumers have reliable access to timely data and reports. “Amazon Redshift empowers our organization to share more reports and handle greater data volumes, helping leadership make better data-driven decisions,” says Priyank Yadav, director of data and engineering at PayU.

About PayU

PayU is an online payment provider that offers payment gateway solutions to businesses worldwide, serving more than 50 countries with over 100 payment methods. It is a preferred payment partner for e-commerce companies and airlines in India.

Opportunity | Using Amazon Redshift to Create a Single Source of Truth for PayU

Based in India, PayU is a global fintech company that provides online payment services in over 50 markets worldwide. The company serves more than 5 million merchants with over 100 payment methods.

Previously, PayU’s data stack consisted of numerous self-hosted MySQL databases connected through a web of secondary nodes and read replicas. The environment, an online transaction processing (OLTP) system, suffered from performance issues and slow query response times. The company began evaluating options for updating its data environment, looking for an online analytical processing (OLAP) solution.

PayU chose Amazon Redshift, which would natively integrate with the other services in its AWS environment. That meant that PayU could keep data in the same AWS Region without sending it through a third party, and Amazon Redshift offered better price performance than other solutions that PayU evaluated.

PayU began migrating to Amazon Redshift as a data warehouse solution for its data science team. After a successful proof of concept in 2020, PayU decided to use Amazon Redshift to build a centralized data solution that would serve as a single source of truth for the company. It finished implementing the solution in 2022.

Solution | Processing 35,000 Queries per Day While Reducing Costs

The unified data environment provides ease of access to over 350 registered users, who use the solution for reporting, production applications, machine learning (ML), and exploratory data analysis. Previously, accessing data meant obtaining permissions for numerous separate replicas.

To create a single source of truth, PayU built an extract, transform, load (ETL) pipeline using AWS services to load data into Amazon Redshift from around 40 production databases. (See Figure 1.) Data is sent to Amazon Managed Streaming for Apache Kafka (Amazon MSK), a service to securely stream data with fully managed, highly available Apache Kafka. Amazon MSK receives the data and sends it to Amazon EMR—a cloud big data solution—which stores it in Amazon Simple Storage Service (Amazon S3), an object storage solution. Data from Amazon S3 then goes to a central Amazon Redshift cluster.

PayU uses Amazon Redshift Data Sharing, which lets companies share data securely across warehouses without copying it, to provide read workload isolation, governance, scaling, and seamless collaboration across multiple business intelligence or analytics clusters. For shared clusters, PayU uses Amazon Redshift RA3 instances with managed storage, which scale compute and storage independently. To provide fast query performance, lower costs, and reduced operational overhead, PayU uses Amazon Redshift Serverless, a service to get insights from data in seconds without having to manage data warehouse infrastructure. Using Amazon Redshift Serverless simplified cluster management and reduced costs by around $2,500 per month.
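As a rough illustration of how data sharing separates write and read workloads, the sketch below builds the SQL that a producer (ETL) cluster and a consumer (read) cluster might run. The case study does not show PayU's actual setup: the datashare, schema, and database names and the namespace IDs are hypothetical placeholders.

```python
"""Sketch: SQL statements for Amazon Redshift data sharing.

All identifiers and namespace IDs are hypothetical placeholders,
not PayU's real configuration.
"""
from typing import List


def producer_statements(share: str, schema: str, consumer_namespace: str) -> List[str]:
    """Statements to run on the producer (ETL/write) cluster."""
    return [
        # Create the datashare and expose one schema and its tables.
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
        # Grant the consumer cluster's namespace access to the share.
        f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{consumer_namespace}';",
    ]


def consumer_statement(database: str, share: str, producer_namespace: str) -> str:
    """Statement to run on a consumer (read) cluster.

    The consumer queries the shared tables through a local database
    that points at the producer's datashare; no data is copied.
    """
    return (
        f"CREATE DATABASE {database} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    )


if __name__ == "__main__":
    for stmt in producer_statements("reporting_share", "public", "<consumer-namespace-id>"):
        print(stmt)
    print(consumer_statement("reporting_db", "reporting_share", "<producer-namespace-id>"))
```

Because the consumer database only references the producer's data, read-heavy reporting queries run on the consumer's compute and do not compete with ETL writes.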

PayU also uses other Amazon Redshift features to solve challenges and streamline performance. For example, PayU was the first company in India to use Amazon Redshift streaming ingestion, which generates near real-time insights through streaming data ingestion into data warehouses and data visualizations. This feature makes data that is ingested using Amazon MSK available for analysis in Amazon Redshift within seconds without needing to be stored in a relational database. PayU also uses materialized views, which let users achieve significantly faster query performance for iterative or predictable workloads such as dashboarding and ETL data processing jobs. “We were at the forefront of evaluating the latest AWS Redshift features. We have explored it from the ground up,” says Priyank.
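Streaming ingestion is typically set up by mapping the Amazon MSK cluster to an external schema and then defining an auto-refreshing materialized view over a Kafka topic. The sketch below generates that SQL under stated assumptions: the schema, view, and topic names and both ARNs are hypothetical, IAM authentication is assumed, and nothing here reflects PayU's real definitions.

```python
"""Sketch: SQL for Amazon Redshift streaming ingestion from Amazon MSK.

Every identifier and ARN is a hypothetical placeholder, and IAM
authentication is an assumption made for illustration.
"""


def external_schema_sql(schema: str, iam_role_arn: str, msk_cluster_arn: str) -> str:
    """Map an Amazon MSK cluster to an external schema in Amazon Redshift."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema}\n"
        f"FROM MSK\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"AUTHENTICATION iam\n"
        f"CLUSTER_ARN '{msk_cluster_arn}';"
    )


def streaming_view_sql(view: str, schema: str, topic: str) -> str:
    """Define a materialized view that ingests one Kafka topic.

    AUTO REFRESH YES asks Amazon Redshift to keep the view up to date
    as new records arrive; JSON_PARSE turns the raw Kafka value into
    a queryable SUPER column (assumes the topic carries JSON).
    """
    return (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH YES AS\n"
        f"SELECT kafka_partition,\n"
        f"       kafka_offset,\n"
        f"       kafka_timestamp,\n"
        f"       JSON_PARSE(kafka_value) AS payload\n"
        f'FROM {schema}."{topic}";'
    )


if __name__ == "__main__":
    print(external_schema_sql("msk_payments", "<iam-role-arn>", "<msk-cluster-arn>"))
    print(streaming_view_sql("payments_stream_mv", "msk_payments", "payments"))
```

Queries then read from the materialized view directly, so records do not need to land in a relational database before analysis.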

Using Amazon Redshift, PayU’s data environment is more robust than before. The company’s current Amazon Redshift configuration has five clusters with a total of 18 nodes. Two are ETL clusters for data processing and write workloads, and the other three are consumer clusters for read workloads: one for exploratory data analytics, one for business reporting, and one for specialized data scientists. Together, all five clusters create a cohesive data sharing environment.

PayU deals with billions of records, scanning around 200 TB of data daily. In March 2024, the solution handled 150,000 queries per day. “These results would not have been possible in the previous cluster implementations,” says Priyank. The company then reduced this volume to 35,000 queries per day by eliminating unnecessary queries. “In 1 month, we cut down queries by 77 percent, which would have been a 6-month exercise in the previous environment,” says Priyank.
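The 77 percent figure follows from the two daily query volumes quoted above. A minimal check of the arithmetic, with a hypothetical helper name:

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative drop from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Daily query volumes reported in the case study.
queries_before = 150_000
queries_after = 35_000

reduction = percent_reduction(queries_before, queries_after)
print(f"{reduction:.1f}% fewer queries per day")  # prints "76.7% fewer queries per day"
```

Rounded to the nearest whole number, 76.7 percent matches the 77 percent quoted.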

The current environment is highly reliable, with a failure rate under 1 percent compared with 7–10 percent in PayU’s previous environment, where complex queries often got stuck and were canceled. Queries that previously took 10–15 minutes now take less than 1 minute on Amazon Redshift, so reports reach merchants more quickly. Data is now available in under 30 minutes (and in some cases, such as data streaming for ML, in under 5 seconds), whereas previously data was updated only once per day. By implementing Amazon Redshift, PayU saved $20,000 per month and reduced the time it took to manage and maintain the environment.

PayU uses Amazon Redshift to better understand and use its data. As a result, the data science team built an ML model to recommend payment gateways to consumers on merchant pages. The company can also monitor potential fraud more closely, and it built an ML model to predict the authenticity of international transactions. “We are able to prevent fraudulent transactions from taking place,” says Priyank.

Outcome | Streamlining Performance Using Amazon Redshift Features

Going forward, PayU plans to expand its use of Amazon Redshift Serverless to improve performance at optimized cost. It is also looking to use zero-ETL integration to further reduce data refresh latency and eliminate pipeline management. “AWS has worked side-by-side with us and provided active support, which has been a major reason for us to use Amazon Redshift,” says Priyank.

Figure 1: Architecture diagram