
Hue – A Web User Interface for Analyzing Data With Elastic MapReduce

by Jeff Barr | in Amazon Elastic MapReduce, Hadoop

Hue is an open source web user interface for Hadoop. Hue allows technical and non-technical users to take advantage of Hive, Pig, and many of the other tools that are part of the Hadoop and EMR ecosystem. You can think of Hue as the primary user interface to Amazon EMR and the AWS Management Console as the primary administrator interface.

I am happy to announce that Hue is now available for Amazon EMR as part of the newest (version 3.3) Elastic MapReduce AMI. You can load your data, run interactive Hive queries, develop and run Pig scripts, work with HDFS, check on the status of your jobs, and more.

We have extended Hue to work with Amazon Simple Storage Service (S3). Hue’s File Browser allows you to browse S3 buckets and you can use the Hive editor to run queries against data stored in S3. You can also define an S3-based table using Hue’s Metastore Manager.
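For readers who prefer DDL to the Metastore Manager forms, an S3-backed table can also be declared as a Hive external table. The sketch below only writes the statements to a file. The bucket, path, table name, and columns are hypothetical, so adjust them to match your own data before running anything.

```shell
#!/bin/sh
# Sketch: write a HiveQL file that defines an S3-backed external table.
# Assumptions (not from the post): the bucket, path, table name, and
# columns below are hypothetical examples.
QUERY_FILE="${QUERY_FILE:-/tmp/create_logs_table.q}"

cat > "$QUERY_FILE" <<'EOF'
-- EXTERNAL leaves the S3 data in place if the table is later dropped.
CREATE EXTERNAL TABLE IF NOT EXISTS access_logs (
  request_time STRING,
  client_ip    STRING,
  status_code  INT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://my-example-bucket/logs/';

-- A simple query against the S3-backed table.
SELECT status_code, COUNT(*) AS hits
FROM access_logs
GROUP BY status_code;
EOF

echo "Wrote $QUERY_FILE"
```

You can paste the generated statements into Hue's Hive editor, or run `hive -f /tmp/create_logs_table.q` on the master node.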

To get started, you simply launch a cluster with the new AMI and log in to Hue (it runs on the cluster’s master node). You can save and share queries with your colleagues, you can visualize query results, and you can view logs in real time (this is very helpful when debugging).
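Because Hue runs on the master node, one common way to reach it from your desktop is an SSH tunnel. The sketch below only prints the tunnel command and does not open a connection. It assumes that Hue listens on port 8888, that the login user is `hadoop`, and that `MASTER_DNS` and `KEY_FILE` (both hypothetical placeholders here) are replaced with your cluster's master public DNS name and your own key file.

```shell
#!/bin/sh
# Sketch: build the ssh command for a tunnel to Hue on the master node.
# Assumptions (not from the post): Hue listens on port 8888, the login
# user is "hadoop", and the default values below are placeholders.
MASTER_DNS="${MASTER_DNS:-ec2-203-0-113-10.compute-1.amazonaws.com}"
KEY_FILE="${KEY_FILE:-$HOME/myKey.pem}"
HUE_PORT="${HUE_PORT:-8888}"

# Print an ssh command that forwards local port HUE_PORT to the same
# port on the master node (-N means "no remote command, tunnel only").
hue_tunnel_cmd() {
  printf 'ssh -i %s -N -L %s:%s:%s hadoop@%s\n' \
    "$KEY_FILE" "$HUE_PORT" "$MASTER_DNS" "$HUE_PORT" "$MASTER_DNS"
}

hue_tunnel_cmd
```

Run the printed command in a separate terminal, then point your browser at `http://localhost:8888` (or whichever local port you chose).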

Hue in Action
Here are some screenshots of Hue in action. The main page displays all of my Hue documents (Hive queries and Pig scripts):

I can click on a document to open it in the appropriate query editor:

I can view and edit the query, and then run it on my cluster with a single click of the Execute button. After I do this, I can inspect the logs as the job runs:

After the query runs to completion I can see the results, again with one click:

I can also see the results in graphical (chart) form:

I have shown you just a couple of Hue’s features. You can read the Hue Tutorials and the Hue User Guide to learn more.

You can launch an EMR cluster (with Hue included) from the AWS Management Console. You can also launch it from the command line like this:

$ aws emr create-cluster --ami-version=3.3.0 \
  --applications Name=Hue Name=Hive Name=Pig \
  --use-default-roles --ec2-attributes KeyName=myKey \
  --instance-groups \
  InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m3.xlarge \
  InstanceGroupType=CORE,InstanceCount=2,InstanceType=m1.large
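
The create-cluster command prints the ID of the new cluster. As a rough sketch, the two helper functions below wrap `aws emr describe-cluster` to report the cluster's state and the public DNS name of the master node, which is where Hue runs. The function names are made up for illustration, and the cluster ID in the usage comments is a placeholder.

```shell
#!/bin/sh
# Sketch: helpers for checking on a cluster after create-cluster returns.
# The function names are hypothetical; the cluster ID is passed as $1.

# Print the cluster's current state (for example STARTING or WAITING).
cluster_state() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.Status.State' --output text
}

# Print the master node's public DNS name.
master_dns() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.MasterPublicDnsName' --output text
}

# Example usage (substitute the ID printed by create-cluster):
#   cluster_state j-XXXXXXXXXXXXX
#   master_dns    j-XXXXXXXXXXXXX
```

Both functions need AWS credentials that are allowed to describe the cluster. The DNS name may be empty until the master node has been provisioned.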

Hue for You
Hue is available on version 3.3 and above of the Elastic MapReduce AMI at no extra cost. It runs on the master node of your EMR cluster.

Jeff;