
Paytm Modernizes Data Platform and Streamlines Data Processing Using Amazon EMR

2021

Paytm, a pioneer in digital financial services, is India’s largest digital payments, commerce, and financial services platform. Today, it supports over 17 million merchants and is used by millions of individuals daily to pay for utilities, groceries, movie tickets, and more. The company is on a mission to help establish the creditworthiness of half a billion underbanked businesses and individuals across India.

“Our goal is to help people gain control of their finances, whether it’s providing loans to businesses, digital savings accounts for the underbanked, or payment methods for neighbourhood stores. We want to be the go-to financial services platform in India,” says Manmeet Dhody, chief technology officer of payments at Paytm.

Future Proofing the Data Platform

With an uptick in digital payment services, Manoj Kumar, vice president and head of data platform at Paytm, foresaw data volumes expanding quickly. This inspired him to build a modern, scalable, and high-performing core data platform for Paytm.

Paytm’s business units and the merchants on its platform rely on insights derived by the company’s big data platform, such as user adoption, sales, and revenue generated, to inform business decisions. “As our data volumes grew, we needed a platform that could handle larger data workloads and provide our merchants as well as our product and business teams with the right data at the right time,” says Kumar.

To that end, Kumar and his team turned to Amazon Web Services (AWS) to rearchitect Paytm’s on-premises platform on the cloud. With AWS, the team found that they could use managed services, spending less time on running the platform, and build a solid foundation for a modern data lake to further improve Paytm’s data infrastructure.

Migrating the Data Platform to Amazon EMR

Paytm had two main challenges with its on-premises data infrastructure: performance and scalability. The company was running its core extract, transform, load (ETL) processes on on-premises big data clusters to power its data analytics and reporting. However, the platform could take hours to process large data workloads, which affected the ability of its business users, including its product managers and merchants, to make timely business decisions. Additionally, horizontally scaling the platform to meet capacity demands was not economical, as acquiring hardware could take months.

To address these challenges, Paytm’s data engineering team adopted Amazon EMR, a big data platform, to rearchitect its core ETL processing with low operational overheads.

Amazon EMR’s compatibility with Paytm’s pre-existing open source tools made it easy to set up, operate, and scale the company’s big data platform and integrate with its other machine learning and artificial intelligence stack.

With Amazon EMR, Paytm can now securely process and hyperscale data workloads with ease—the platform can spin up big data clusters and execute most of Paytm’s core ETL processing in as little as 10 minutes, down from up to 12 hours previously. Additionally, it can shut down clusters when they are no longer needed, minimizing unnecessary infrastructure costs.
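The spin-up, process, and shut-down pattern described above can be sketched as a transient cluster request. This is a minimal illustration, not Paytm’s actual configuration: the job name, script location, release label, instance types, node counts, and region below are all assumptions chosen for the example. The sketch builds the arguments for the boto3 EMR `run_job_flow` call and sets `KeepJobFlowAliveWhenNoSteps` to `False`, so that the cluster terminates once its steps have finished.

```python
def build_transient_cluster_request(job_name, script_uri, core_nodes=2):
    """Build keyword arguments for the boto3 EMR run_job_flow call.

    All concrete values here (release label, instance types, role names)
    are illustrative assumptions, not details taken from the case study.
    """
    return {
        "Name": job_name,
        "ReleaseLabel": "emr-6.15.0",  # assumed EMR release
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Terminate the cluster automatically when no steps remain.
            "KeepJobFlowAliveWhenNoSteps": False,
            "TerminationProtected": False,
        },
        "Steps": [{
            "Name": "run-etl",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
            },
        }],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default EMR instance profile
        "ServiceRole": "EMR_DefaultRole",      # default EMR service role
    }


def submit(request, region="ap-south-1"):
    """Submit the request to Amazon EMR (needs boto3 and AWS credentials)."""
    import boto3  # imported here so building a request works without boto3
    client = boto3.client("emr", region_name=region)
    return client.run_job_flow(**request)["JobFlowId"]


# Hypothetical job name and script location, for illustration only.
request = build_transient_cluster_request(
    "nightly-etl", "s3://example-bucket/jobs/etl.py"
)
```

Calling `submit(request)` would launch the cluster, run the single Spark step, and let the cluster shut down afterwards, so compute is paid for only while the job runs.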

“Amazon EMR provided us with exactly the tools and features we needed to build a futureproof data platform. As the provisioning of capacity and scaling of clusters is managed by Amazon EMR, we can now deliver data to our business users 30 percent faster and at 70 percent of the cost of our on-premises solutions. Even more so, AWS supported our motivated data engineering team to expedite and complete this complex and critical system migration in record time, in under 45 days instead of a few quarters,” says Kumar.

Empowering the Data Team

Prior to the migration, Kumar felt that running the on-premises data platform was preventing his team of data engineers from doing what they do best: engineering solutions and innovating with data. As Paytm’s on-premises data platform frequently ran into processing delays, the team spent the majority of their time re-running jobs and ensuring that the platform was up and running. Migrating to Amazon EMR enabled the team to build a new and robust data processing framework, which significantly improved their working experience. Furthermore, as most of Paytm’s core ETL processing can now be done in minutes, the team is encouraged to explore new ways to simplify other business processes and make them more efficient.

“Rearchitecting our platform on Amazon EMR empowered our data engineering team to move away from resolving platform incidents and focus more on Paytm’s core business. In fact, we are already thinking about making further improvements with AWS. With our data volumes expected to grow quickly, our next step is to integrate all our data into a single data lake, allowing us to maintain the quality of our data in line with business growth,” concludes Kumar.

About Paytm

Paytm is the consumer brand of India’s leading mobile internet company, One97 Communications. The brand is one of India’s largest financial services companies, offering full-stack payments and financial solutions to consumers, offline merchants, and online platforms. Today, the company serves millions of merchants and customers on its platform in India.

Benefits of AWS

  • Reduced infrastructure management and processing incidents by 70 percent
  • Reduced data processing time for the majority of workloads by 98 percent
  • Improved data availability by 30 percent
  • Reduced data infrastructure cost by 30 percent
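
As a rough consistency check, the percentages above can be compared with the figures quoted earlier in the story. The sketch below assumes that the 98 percent figure corresponds to core ETL runs dropping from up to 12 hours to as little as 10 minutes, and that the 30 percent cost reduction corresponds to running at 70 percent of the on-premises cost; the source does not state these links explicitly.

```python
def percent_reduction(before, after):
    """Return the percentage by which `after` is lower than `before`."""
    return 100 * (before - after) / before


# Assumed mapping: up to 12 hours before, as little as 10 minutes after.
etl_reduction = percent_reduction(before=12 * 60, after=10)

# Assumed mapping: cloud cost at 70 percent of the on-premises cost.
cost_reduction = percent_reduction(before=100, after=70)

print(f"ETL processing time reduced by about {etl_reduction:.1f} percent")
print(f"Infrastructure cost reduced by {cost_reduction:.0f} percent")
```

Under these assumptions, the processing-time reduction works out to a little over 98 percent and the cost reduction to 30 percent, in line with the list above.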

AWS Services Used

Amazon EMR

Amazon EMR is the industry-leading cloud big data platform for processing vast amounts of data using open source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto.


