Articles & Tutorials

Showing 1-25 of 31 results.
Analyze your Amazon CloudFront Logs using Amazon Elastic MapReduce.
Last Modified: Dec 8, 2014 21:07 PM GMT
This tutorial walks you through installing and operating Spark, a fast and general engine for large-scale data processing, on an Amazon EMR cluster. You will also create and query a dataset in Amazon S3 using Spark SQL, and learn how to monitor Spark on an Amazon EMR cluster with Amazon CloudWatch.
Last Modified: Nov 20, 2014 21:45 PM GMT
This tutorial shows you how to develop a simple log-parsing application using Pig and Amazon Elastic MapReduce. The tutorial walks you through using Pig interactively (via SSH) on a subset of your data, which enables you to prototype your script quickly. The tutorial then takes you through uploading the script to Amazon S3 and running it on a larger set of input data.
Last Modified: Mar 20, 2014 15:30 PM GMT
Analyze your Apache logs using Pig and Amazon Elastic MapReduce.
Last Modified: Mar 20, 2014 15:27 PM GMT
This article shows how to use EMR to efficiently export DynamoDB tables to S3, import S3 data into DynamoDB, and perform sophisticated queries across tables stored in both DynamoDB and other storage services such as S3.
Last Modified: Sep 26, 2013 0:23 AM GMT
Learn how to build and deploy a Node.js application on Amazon EMR.
Last Modified: Aug 22, 2013 21:28 PM GMT
This article will show you how to install and use Apache Accumulo with Amazon Elastic MapReduce.
Last Modified: Apr 11, 2013 18:52 PM GMT
MicroStrategy's business intelligence software and mobile app development platform are now available as a free software suite through an Amazon EC2 image on AWS Marketplace. The MicroStrategy Intelligence suite offers features for modeling, reporting and analyzing data, including in-memory cubes for high-performance analysis. By configuring the Free MicroStrategy Suite EC2 instance to connect to a Hive job flow running on Amazon Elastic MapReduce, you can create a secure and extensible platform for reporting and analytics.
Last Modified: Mar 29, 2013 23:37 PM GMT
In this tutorial, you will learn how to use Apache Hive to join ad impression logs with click-through logs to determine which advertisement a given user is most likely to click on. This article also demonstrates how to manage Amazon Web Services using Windows PowerShell.
Last Modified: Mar 6, 2013 0:05 AM GMT
The following tutorial walks you through the process of using Informatica's HParser hosted on Amazon EMR to process custom text files into an easy-to-analyze XML format.
Last Modified: Nov 21, 2012 4:43 AM GMT
A command-line tool for processing large data sets.
Last Modified: Jul 12, 2012 19:04 PM GMT
The MapR Hadoop distribution adds dependability and ease of use to the strength and flexibility of Hadoop. The Amazon Elastic MapReduce (EMR) service enables you to easily set up, operate, and scale MapR deployments in the cloud as well as integrate with other AWS services. Users can take advantage of hourly pricing with no up-front fees or long-term commitments.
Last Modified: Jun 12, 2012 23:45 PM GMT

An internet advertising company operates a data warehouse using Hive and Amazon Elastic MapReduce. This company runs machines in Amazon EC2 that serve advertising impressions and redirect clicks to the advertised sites. The machines running in Amazon EC2 store each impression and click in log files pushed to Amazon S3.

Last Modified: Feb 15, 2012 2:55 AM GMT
This page lists documentation resources specific to using Hive on Amazon Elastic MapReduce.
Last Modified: Feb 15, 2012 2:47 AM GMT
This tutorial shows how to use Karmasphere Analyst with Amazon Elastic MapReduce to analyze large data sets stored in Amazon S3.
Last Modified: Oct 28, 2011 0:30 AM GMT
This tutorial will show you how to use Karmasphere Studio to develop, debug and deploy Hadoop Jobs for Amazon Elastic MapReduce.
Last Modified: Oct 28, 2011 0:28 AM GMT
This article describes the Hive extensions that make Hive work more easily with Amazon Elastic MapReduce.
Last Modified: Oct 4, 2011 16:31 PM GMT
AWS infrastructure services are hosted in a number of regions, including locations in the US, Europe, and Asia Pacific. This article lists the web service API endpoints needed to make API requests and manage infrastructure in each region.
Last Modified: Sep 1, 2011 0:10 AM GMT
This article gives an introduction to processing n-gram data using Amazon Elastic MapReduce and Apache Hive. In this example, we will calculate the top trending topics per decade. The data used to find the topics comes from the Google Books n-gram corpus.
Last Modified: Dec 23, 2010 22:31 PM GMT
This article describes some helpful tips and tricks for developing applications on the AWS SDK for Java.
Last Modified: Sep 3, 2010 17:55 PM GMT
This document provides a quick guide on how to use Elastic MapReduce to develop, debug, and run job flows that have multiple steps.
Last Modified: Jul 9, 2010 19:35 PM GMT
This article and accompanying video walk through the steps for getting started with the AWS SDK for .NET, including installing the AWS SDK for .NET, creating new projects using project templates, running the packaged code samples, and getting help with development.
Last Modified: May 24, 2010 19:58 PM GMT
This article describes the differences between the AWS SDK for Java and previous Java libraries from Amazon Web Services.
Last Modified: Apr 8, 2010 21:30 PM GMT
This guide introduces the AWS SDK for Java and provides a walk-through for getting started using the SDK for both Eclipse and non-Eclipse users.
Last Modified: Apr 8, 2010 21:29 PM GMT
This video provides an introduction to the use of Apache Hive to operate a data warehouse with Amazon Elastic MapReduce. It takes you through the development of a Hive script using an interactive job flow and shows you how to deploy this script in Amazon S3 and how to run job flows to execute the script in batch mode.
Last Modified: Mar 4, 2010 19:21 PM GMT
©2014, Amazon Web Services, Inc. or its affiliates. All rights reserved.