Articles & Tutorials


This tutorial is now deprecated. To learn more about Spark on Amazon EMR, click here.

This tutorial walks you through installing and operating Spark, a fast and general engine for large-scale data processing, on an Amazon EMR cluster. You will also create and query a dataset in Amazon S3 using Spark SQL, and learn how to monitor Spark on an Amazon EMR cluster with Amazon CloudWatch.
Last Modified: Jun 16, 2015 23:25 PM GMT

An internet advertising company operates a data warehouse using Hive and Amazon Elastic MapReduce. This company runs machines in Amazon EC2 that serve advertising impressions and redirect clicks to the advertised sites. The machines running in Amazon EC2 store each impression and click in log files pushed to Amazon S3.

Last Modified: Apr 29, 2015 17:47 PM GMT
Analyze your Amazon CloudFront Logs using Amazon Elastic MapReduce.
Last Modified: Dec 8, 2014 21:07 PM GMT
This article shows how to use EMR to efficiently export DynamoDB tables to S3, import S3 data into DynamoDB, and perform sophisticated queries across tables stored in both DynamoDB and other storage services such as S3.
Last Modified: Sep 26, 2013 0:23 AM GMT
MicroStrategy's business intelligence software and mobile app development platform are now available as a free software suite through an Amazon EC2 image on AWS Marketplace. The MicroStrategy Intelligence suite offers features for modeling, reporting and analyzing data, including in-memory cubes for high-performance analysis. By configuring the Free MicroStrategy Suite EC2 instance to connect to a Hive job flow running on Amazon Elastic MapReduce, you can create a secure and extensible platform for reporting and analytics.
Last Modified: Mar 29, 2013 23:37 PM GMT
In this tutorial, you will learn how to use Apache Hive to join ad impression logs with click-through logs to determine which advertisement a given user is most likely to click on. This article also demonstrates how to manage Amazon Web Services using Windows PowerShell.
Last Modified: Mar 6, 2013 0:05 AM GMT
The following tutorial walks you through the process of using Informatica's HParser hosted on Amazon EMR to process custom text files into an easy-to-analyze XML format.
Last Modified: Nov 21, 2012 4:43 AM GMT
A command-line tool for processing large data sets.
Last Modified: Jul 12, 2012 19:04 PM GMT
The MapR Hadoop distribution adds dependability and ease of use to the strength and flexibility of Hadoop. The Amazon Elastic MapReduce (EMR) service enables you to easily set up, operate, and scale MapR deployments in the cloud as well as integrate with other AWS services. Users can take advantage of hourly pricing with no up-front fees or long-term commitments.
Last Modified: Jun 12, 2012 23:45 PM GMT
This article gives an introduction to processing n-gram data using Amazon Elastic MapReduce and Apache Hive. In this example we will calculate the top trending topics per decade. The data used to find the topics comes from the Google Books n-gram corpus.
Last Modified: Dec 23, 2010 22:31 PM GMT
This article and accompanying video walk through the steps for getting started with the AWS SDK for .NET, including installing the AWS SDK for .NET, creating new projects using project templates, running the packaged code samples, and getting help with development.
Last Modified: May 24, 2010 19:58 PM GMT
This article shows how to use Amazon Elastic MapReduce and Hive to process logs uploaded to Amazon S3 from a fleet of boxes which are serving online advertising. The logs are processed and the resulting information is stored in a collection of relational tables persisted in Amazon S3 and queryable using Hive. Summaries of the data are pushed to Amazon SimpleDB where they are accessible to monitoring tools.
Last Modified: Oct 2, 2009 0:37 AM GMT
Data Wrangling blogger and AWS developer Peter Skomoroch gives us an introduction to Amazon Elastic MapReduce. Peter Skomoroch is a consultant at Data Wrangling in Arlington, VA where he mines large datasets to solve problems in search, finance, and recommendation systems.
Last Modified: Apr 8, 2009 1:05 AM GMT
This example shows how to use Hadoop Streaming to count the number of times that words occur within a text collection.
Last Modified: Apr 2, 2009 20:53 PM GMT
CloudBurst provides highly sensitive short read mapping with MapReduce.
Last Modified: Apr 2, 2009 20:53 PM GMT
©2017, Amazon Web Services, Inc. or its affiliates. All rights reserved.