Posted On: Nov 19, 2019
AWS Step Functions is now integrated with Amazon EMR, making it faster to build and easier to monitor EMR big data processing workflows.
AWS Step Functions allows you to build resilient workflows using AWS services such as Amazon EMR, Amazon SageMaker, and AWS Lambda. Amazon EMR is the industry-leading cloud-native big data platform, allowing teams to process vast amounts of data quickly and cost-effectively at scale. With Step Functions and Amazon EMR, you can orchestrate big data workflows while writing minimal additional code.
With Amazon EMR and AWS Step Functions, you can now create efficient data processing workflows that order Amazon EMR steps, manage dependencies, and run work in parallel. You can proactively scale a cluster up and down as part of an ETL workflow, right-sizing the cluster for the task at hand. You can also improve the resilience of your data processing workflows by choosing how exceptions are handled: for example, by retrying failed jobs and alerting users to failures.
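As a rough illustration of ordering EMR steps, retrying failures, and alerting on errors, the sketch below builds an Amazon States Language definition as a Python dictionary. It is only a minimal example under stated assumptions: the S3 script paths, the SNS topic ARN, the state names, and the `emr_step_state` helper are hypothetical, and the execution input is assumed to carry the `ClusterId` of an already running cluster.

```python
import json

# Hypothetical placeholders for illustration; replace with your own values.
EXTRACT_SCRIPT = "s3://example-bucket/scripts/extract.py"
TRANSFORM_SCRIPT = "s3://example-bucket/scripts/transform.py"
ALERT_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:example-etl-alerts"


def emr_step_state(name, script_uri, next_state):
    """Build a Task state that submits one EMR step and waits for it to finish."""
    return {
        "Type": "Task",
        # The ".sync" suffix makes the workflow wait until the EMR step completes.
        "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
        "Parameters": {
            # Assumes the execution input looks like {"ClusterId": "j-..."}.
            "ClusterId.$": "$.ClusterId",
            "Step": {
                "Name": name,
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", script_uri],
                },
            },
        },
        # Store the step result beside the input so ClusterId stays available.
        "ResultPath": "$.LastStepResult",
        # Retry a failed job twice with exponential backoff.
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
        # If retries are exhausted, route to the alerting state.
        "Catch": [
            {
                "ErrorEquals": ["States.ALL"],
                "ResultPath": "$.Error",
                "Next": "NotifyFailure",
            }
        ],
        "Next": next_state,
    }


definition = {
    "Comment": "Sketch: two ordered EMR steps with retries and a failure alert",
    "StartAt": "Extract",
    "States": {
        "Extract": emr_step_state("Extract", EXTRACT_SCRIPT, "Transform"),
        "Transform": emr_step_state("Transform", TRANSFORM_SCRIPT, "Done"),
        "Done": {"Type": "Succeed"},
        "NotifyFailure": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": ALERT_TOPIC_ARN,
                "Message.$": "$.Error.Cause",
            },
            "Next": "Failed",
        },
        "Failed": {"Type": "Fail", "Error": "EtlWorkflowFailed"},
    },
}

# Emit the JSON you could paste into a state machine definition.
print(json.dumps(definition, indent=2))
```

The printed JSON could be supplied as the definition when creating a state machine. Independent steps could be wrapped in a `Parallel` state instead of chained with `Next`; that variation is omitted here to keep the sketch short.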
By using the Amazon EMR service integration with the AWS Step Functions Data Science SDK, you can build end-to-end data science workflows. The EMR service integration is available in all AWS Regions where both AWS Step Functions and Amazon EMR are available. For a complete list of Regions and service offerings, see AWS Regions.
To get started, review the documentation and deploy a one-click sample project that demonstrates how to build a data processing workflow with Amazon EMR, then start building your first data processing workflow.
To learn more:
- Read the AWS News Blog post
- Deploy a one-click sample project for the AWS Step Functions integration with Amazon EMR
- Read about Managing Amazon EMR jobs with Step Functions in the AWS Step Functions Developer Guide