AWS Official Blog

Debug your Elastic MapReduce job flows in the AWS Management Console

by Jeff Barr | in Amazon Elastic MapReduce

We are excited to announce that we've added support for job flow debugging in the AWS Management Console, making Elastic MapReduce even easier to use for developing large data processing and analytics applications. This capability allows customers to track progress and identify issues in the steps, jobs, tasks, or task attempts of their job flows. The job flow logs continue to be saved in Amazon S3, and the state of tasks and task attempts is now persisted in Amazon SimpleDB, so customers can analyze their job flows even after they've completed.

Getting started with debugging takes just a few simple steps:

Step 0: Select “Enable Hadoop Debugging” in the Create New Job Flow wizard.

Step 1: Select the job flow and click on “Debug”.


Step 2: A job flow has one or more steps. You can view the log files of a specific step within your job flow.


Step 3: Each step might have one or more Hadoop jobs. Click on “View Tasks” next to a job to drill down and see different Hadoop tasks within that job.


Step 4: You will see the list of tasks that have failed. Click on the task that you would like to debug to view its task attempts.


Step 5: Click on a specific task attempt to see the log files associated with that attempt.


Step 6: Discover the error.
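
The drill-down in the steps above follows a simple hierarchy: a job flow contains steps, a step contains Hadoop jobs, a job contains tasks, and a task has one or more task attempts. The Python sketch below models that hierarchy in memory to show how a failed attempt is located. It is only an illustration and does not call any AWS API; every class name, field name, state label, and ID in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class TaskAttempt:
    attempt_id: str
    state: str            # hypothetical labels: "COMPLETED" or "FAILED"
    log_excerpt: str = ""


@dataclass
class Task:
    task_id: str
    attempts: List[TaskAttempt] = field(default_factory=list)

    @property
    def failed(self) -> bool:
        # Assumption: a task counts as failed when it has attempts
        # and none of them completed.
        return bool(self.attempts) and all(
            a.state != "COMPLETED" for a in self.attempts
        )


@dataclass
class Job:
    job_id: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Step:
    name: str
    jobs: List[Job] = field(default_factory=list)


@dataclass
class JobFlow:
    job_flow_id: str
    steps: List[Step] = field(default_factory=list)


def failed_attempts(flow: JobFlow) -> Iterator[Tuple[str, str, str, TaskAttempt]]:
    """Walk step -> job -> task -> attempt and yield attempts of failed tasks."""
    for step in flow.steps:
        for job in step.jobs:
            for task in job.tasks:
                if not task.failed:
                    continue  # a later attempt succeeded, nothing to debug
                for attempt in task.attempts:
                    if attempt.state == "FAILED":
                        yield step.name, job.job_id, task.task_id, attempt


# Made-up example data: one step, one job, two tasks.
flow = JobFlow("j-EXAMPLE", [
    Step("Word count", [
        Job("job_0001", [
            Task("task_m_0001", [
                TaskAttempt("attempt_m_0001_0", "FAILED", "lost tracker"),
                TaskAttempt("attempt_m_0001_1", "COMPLETED"),
            ]),
            Task("task_m_0002", [
                TaskAttempt("attempt_m_0002_0", "FAILED", "bad input record"),
                TaskAttempt("attempt_m_0002_1", "FAILED", "bad input record"),
            ]),
        ]),
    ]),
])

for step_name, job_id, task_id, attempt in failed_attempts(flow):
    print(step_name, job_id, task_id, attempt.attempt_id, attempt.log_excerpt)
```

In this sketch the first task is skipped because its retry completed, which mirrors how the console's list of failed tasks narrows the search before you open individual task attempts.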


With just a few clicks in the AWS Management Console, you can find the error in your Elastic MapReduce job flow among thousands of jobs and then restart the job flow.

— Jinesh