AWS Big Data Blog

Ravi Itha

Author: Ravi Itha

Ravi Itha is a Sr. Data Architect at AWS. He works with customers to design and implement Data Lakes, Analytics, and Microservices on AWS. Outside of work, he is passionate about books, cooking, movies, and yoga.

Operational Data Processing Framework for Modern Data Architectures

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS has invested in native service integration with Apache Hudi and published technical contents to enable you to use Apache Hudi with AWS Glue (for example, refer to Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 1: Getting Started). In AWS ProServe-led customer engagements, the use cases we work on usually come with technical complexity and scalability requirements. In this post, we discuss a common use case in relation to operational data processing and the solution we built using Apache Hudi and AWS Glue.