Videos

A technical introduction to Amazon EMR (50:44)
Amazon EMR deep dive & best practices (49:12)

Stay up to date with AWS webinars.

Tutorials

Spark

Real-time stream processing using Apache Spark streaming and Apache Kafka on AWS

Learn how to set up Apache Kafka on EC2, use Spark Streaming on EMR to process data arriving in Apache Kafka topics, and query streaming data using Spark SQL on EMR.

Large-scale machine learning with Spark on Amazon EMR

Learn how Intent Media used Spark and Amazon EMR for their modeling workflows.

HBase

Low-latency SQL and secondary indexes with Phoenix and HBase

Learn how to connect to Phoenix using JDBC, create a view over an existing HBase table, and create a secondary index for increased read performance.

Using HBase with Hive for NoSQL and analytics workloads

Learn how to launch an EMR cluster with HBase and restore a table from a snapshot in Amazon S3.

Presto

Launch an Amazon EMR cluster with Presto and Airpal

Learn how to set up a Presto cluster and use Airpal to process data stored in S3.

Hive

Using HBase with Hive for NoSQL and analytics workloads

Learn how to launch an EMR cluster with HBase and restore a table from a snapshot in Amazon S3.

Process and analyze big data using Hive on Amazon EMR and MicroStrategy Suite

Learn how to connect to a Hive job flow running on Amazon Elastic MapReduce to create a secure and extensible platform for reporting and analytics.

Flink

This tutorial outlines a reference architecture for a consistent, scalable, and reliable stream processing pipeline based on Apache Flink using Amazon EMR, Amazon Kinesis, and Amazon Elasticsearch Service.

Learn at your own pace with other tutorials.

Training and help

Short-term engagements

Do you need help building a proof of concept or tuning your EMR applications? AWS has a global support team that specializes in EMR. Please contact us if you are interested in learning more about short-term (2-6 week) paid support engagements.

AWS big data training

The Big Data on AWS course gives you hands-on experience using Amazon Web Services for big data workloads. AWS will show you how to run Amazon EMR jobs that process data with the broad ecosystem of Hadoop tools, such as Pig and Hive. AWS will also teach you how to create big data environments in the cloud by working with Amazon DynamoDB and Amazon Redshift, understand the benefits of Amazon Kinesis, and apply best practices to design big data environments for analysis, security, and cost-effectiveness. To learn more about the Big Data course, click here.

Additional training

Scale Unlimited offers customized on-site training for companies that need to quickly learn how to use EMR and other big data technologies. To find out more, click here.

Getting started tutorial using Amazon EMR

Create a sample Amazon EMR cluster in the AWS Management Console.

Learn more 
Sign up for a free AWS account

Instantly get access to the AWS Free Tier. 

Sign up 
Start building with EMR in the console

Get started building with Amazon EMR in the AWS Console.

Sign in 
