AWS Big Data Blog

Author: Senthil Kamala Rathinam

Automate and orchestrate Amazon EMR jobs using AWS Step Functions and Amazon EventBridge

In this post, we discuss how to build a fully automated, scheduled Spark processing pipeline using Amazon EMR on EC2, orchestrated with AWS Step Functions and triggered by Amazon EventBridge. We walk through how to deploy this solution using AWS CloudFormation, process data from a public COVID-19 dataset stored in Amazon Simple Storage Service (Amazon S3), and store the aggregated results in Amazon S3.
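To make the orchestration idea concrete before the walkthrough, here is a minimal sketch of what a Step Functions definition for such a pipeline could look like, written as a Python dict in Amazon States Language. It is not the template deployed in this post. The cluster name, release label, instance types, IAM role names, step name, and both S3 URIs are illustrative assumptions; only the `elasticmapreduce` service integration resource ARNs are standard Step Functions identifiers.

```python
import json


def build_state_machine(script_s3_uri: str, output_s3_uri: str) -> dict:
    """Return an Amazon States Language definition as a plain dict.

    Flow: create a transient EMR cluster, run one Spark step, terminate.
    All names, sizes, and roles below are assumptions for illustration.
    """
    return {
        "Comment": "Sketch: create EMR cluster, run one Spark step, terminate",
        "StartAt": "CreateCluster",
        "States": {
            "CreateCluster": {
                "Type": "Task",
                # .sync makes Step Functions wait until the cluster is ready
                "Resource": "arn:aws:states:::elasticmapreduce:createCluster.sync",
                "Parameters": {
                    "Name": "example-spark-cluster",       # assumed name
                    "ReleaseLabel": "emr-6.15.0",          # assumed release
                    "Applications": [{"Name": "Spark"}],
                    "ServiceRole": "EMR_DefaultRole",      # assumed role
                    "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed role
                    "Instances": {
                        "KeepJobFlowAliveWhenNoSteps": True,
                        "InstanceGroups": [
                            {"InstanceRole": "MASTER",
                             "InstanceType": "m5.xlarge",
                             "InstanceCount": 1},
                            {"InstanceRole": "CORE",
                             "InstanceType": "m5.xlarge",
                             "InstanceCount": 2},
                        ],
                    },
                },
                "ResultPath": "$.cluster",
                "Next": "RunSparkStep",
            },
            "RunSparkStep": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId.$": "$.cluster.ClusterId",
                    "Step": {
                        "Name": "aggregate-covid-data",    # assumed step name
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", "--deploy-mode", "cluster",
                                     script_s3_uri, output_s3_uri],
                        },
                    },
                },
                "ResultPath": "$.step",
                # Simplification: on any error, still tear the cluster down.
                # A real pipeline would likely fail the execution afterwards.
                "Catch": [{"ErrorEquals": ["States.ALL"],
                           "ResultPath": "$.error",
                           "Next": "TerminateCluster"}],
                "Next": "TerminateCluster",
            },
            "TerminateCluster": {
                "Type": "Task",
                "Resource": "arn:aws:states:::elasticmapreduce:terminateCluster.sync",
                "Parameters": {"ClusterId.$": "$.cluster.ClusterId"},
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical S3 locations, used only to render the JSON
    definition = build_state_machine(
        "s3://example-bucket/scripts/aggregate.py",
        "s3://example-bucket/output/",
    )
    print(json.dumps(definition, indent=2))
```

An EventBridge rule with a schedule expression could then start this state machine on a fixed cadence. Building the definition as a dict keeps the sketch easy to inspect and test locally before it is embedded in a CloudFormation template.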