Big Data on AWS
Drive innovation through data, with scalable services for data collection, storage, integration, analytics and collaboration.
“Amazon Elastic MapReduce enables us to focus on our Hadoop-based analysis without worrying about the underlying infrastructure.”
- Jason Davis, Director of Search & Personalization
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
“AWS gave us the ability to bring a massive amount of capacity online in a short period of time.”
- Jason Titus, Chief Technology Officer
“Using Amazon Elastic MapReduce we were able to save $55,000 in upfront hardware costs and get up and running in a matter of days, not months.”
- Jim Blomo, Engineering Manager - Data-Mining
Capabilities
Big Data: end to end
“Big Data” refers to a collection of tools, techniques and technologies for working with data productively, at any scale. With Amazon Web Services, the tools for data collection, computation, collaboration and sharing are all available in a couple of clicks.
Built for data. Designed for humans.
It feels like everything generates data today, from your customers on social networks to the instances running your web applications. Amazon Web Services makes it easy to provision the storage, computation and database services you need to turn that data into information for your business.
Hadoop all the way down.
Amazon Elastic MapReduce provides a managed, easy-to-use analytics platform built around the powerful Hadoop framework. Focus on your map/reduce queries and take advantage of the broad ecosystem of Hadoop tools, while deploying to a high-scale, secure infrastructure platform.
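To make the map/reduce model concrete, here is a minimal word-count sketch in plain Python. It runs in memory on a single machine and only illustrates the three phases (map, shuffle, reduce); it is not Hadoop or Amazon Elastic MapReduce code, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data on AWS", "Big data at any scale"]))
```

On a real cluster the framework handles the shuffle and distributes the map and reduce steps across many machines; only the per-record logic is yours to write.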
Solid state, at your service.
NoSQL data stores benefit greatly from the speed of solid state drives. DynamoDB uses them by default. If you are using alternatives from the AWS Marketplace, such as Cassandra or MongoDB, you can speed up data access with on-demand terabytes of solid state storage from the High I/O instance class.
Name-your-price supercomputing.
How fast could your project go with another 1000 instances? How about 10,000? The Amazon Spot Market, integrated into Amazon Elastic MapReduce, lets you choose your own price for the computing resources you need. That means you can choose your own balance of cost and performance, overclocking your analytics when you need to, or reducing costs significantly.
A growing family of datasets, ready to roll.
Many of the datasets you need are already available on the AWS Cloud, as part of the Amazon Public Datasets program. So whether you’re looking to mine the Common Crawl open web corpus, or align some genomes, AWS provides the data, the services and the infrastructure you need to get up and running.

How can data drive your business forward?
Log analytics
Dealing with a large number of application or web logs can often feel like finding a needle in a haystack. Storing those logs in Amazon S3 and driving analytics with Amazon Elastic MapReduce provides a low-cost, high-performance, easy way to sift through terabytes of data. Greater insight into your customer usage and operations helps you respond quickly to changing market conditions.
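As a rough illustration of what sifting through logs can look like, the sketch below parses web server lines in Common Log Format and counts requests per status code and per path. It assumes the lines are already available as an in-memory iterable; reading them from Amazon S3 or running the same logic as an Elastic MapReduce job is not shown. The pattern and the `summarize` function are hypothetical helpers for this example.

```python
import re
from collections import Counter

# Common Log Format, for example:
# 127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


def summarize(lines):
    """Count requests per status code and per path, skipping unparseable lines."""
    by_status = Counter()
    by_path = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # Ignore lines that do not follow the expected format.
        by_status[match["status"]] += 1
        by_path[match["path"]] += 1
    return by_status, by_path
```

Calling `by_path.most_common(10)` on the result would list the ten most requested paths, which is often a useful first question to ask of a new log set.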
Customer segmentation
Learn how to increase the click-through rates of your advertising, or drive better engagement in your social game world, by gaining insight into your customers. Integrate data from various sources to drive more relevant search results, place ads more effectively or balance your in-game economy.
Recommendation engines
Whether you need to recommend ‘books that other people bought’, or deliver more accurate suggestions for investment based on market performance, recommendation engines can take your existing and historical data and drive new business opportunities and features.
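One simple way to build a ‘books that other people bought’ feature is to count how often items appear together in past orders. The sketch below assumes each order is a list of item names and uses plain co-occurrence counts; it is an illustrative baseline, not a description of how any particular recommendation service works, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_purchases(orders):
    """For each item, count how often every other item shares an order with it."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # Use a set so an item listed twice in one order is only counted once.
        for item, other in permutations(set(order), 2):
            co_counts[item][other] += 1
    return co_counts


def recommend(co_counts, item, limit=3):
    """Return up to `limit` items most often bought together with `item`."""
    neighbours = co_counts.get(item, Counter())
    return [other for other, _ in neighbours.most_common(limit)]
```

With historical orders loaded, `recommend(build_co_purchases(orders), "some item")` returns the items most frequently bought alongside it. Larger datasets usually call for normalizing by item popularity, which this sketch leaves out.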
Getting Started
What is Big Data?
Introduce yourself to modern data analytics with the Six Principles of Big Data: six guidelines that answer questions such as ‘what is big data?’ and help you start driving innovation through your datasets quickly.
Building applications with Elastic MapReduce
Using real-world data and examples, take a guided tour through Hadoop and Amazon Elastic MapReduce to learn how to put your data to work.
Query terabytes of data in real time with Hadoop
For those familiar with data analytics and Hadoop, learn more about using Spark, Mesos and Shark to drive real time insight into your applications, customers and business.
Partners
Think Big Analytics
Marketshare






