AWS Storage Blog

Tag: Amazon EMR

Maximizing price performance for big data workloads using Amazon EBS

Since the emergence of big data over a decade ago, Hadoop – an open-source framework that is used to efficiently store and process large datasets – has been crucial in storing, analyzing, and reducing that data to provide value for enterprises. Hadoop lets you store structured, partially structured, or unstructured data of any kind across […]

Run queries up to 9x faster using Trino with Amazon S3 Select on Amazon EMR

Customers building data lakes continue to innovate in the ways that they store and access their data. For these customers, performance is critical, particularly when they are accessing large amounts of data. For example, data scientists, data analysts, and data engineers running queries from open source frameworks like Trino want to accelerate access to their […]

How TMAP Mobility transferred 2.4 PB of Hadoop data using AWS DataSync

Launched in 2002, TMAP Mobility is Korea’s leading mobility platform, with 20 million registered users and 14 million monthly active users. TMAP provides navigation services based on a wide range of real-time traffic information and data. Previously, the Data Intelligence group at TMAP Mobility operated a mobility-data platform based on a Hadoop Distributed File System […]

CME Group accelerates cloud migration with AWS Storage Gateway

At CME Group, the world’s leading and most diverse derivatives marketplace, we offer futures and options across every investible asset class, from corn to Bitcoin. This breadth means our global, electronic markets are powered by data – and lots of it. Making sure that our customers have access to the market data that they need […]

Connecting AWS Outposts to on-premises data sources

Millions of customers such as startups, enterprises, and leading government agencies are using AWS to lower costs, become more agile, and innovate faster. There are some workloads that must remain on-premises in order to interact with data that cannot, for a variety of reasons, move to an AWS Region. Enter AWS Outposts. AWS Outposts is a […]

Migrate HDFS files to an Amazon S3 data lake with AWS Snowball Edge

The need to store newly connected data grows as the sources of data increase. Enterprise customers use Hadoop Distributed File System (HDFS) as their data lake storage repository for on-premises Hadoop applications. Customers are migrating their data lakes to AWS for a more secure, scalable, agile, and cost-effective solution. For HDFS migrations where high-speed transfer […]