AWS Storage Blog

Category: Analytics

AWS DataSync Featured Image 2020

Migrate on-premises data to AWS for insightful visualizations

When migrating data from on premises, customers seek a data store that is scalable, durable, and cost effective. Equally as important, BI must support modern, interactive, and fast dashboards that can scale to tens of thousands of users seamlessly while providing the ability to create meaningful data visualizations for analysis. Visualization of on-premises business analytics […]

S3 Security

Disabling ACLs for existing Amazon S3 workloads with information in S3 server access logs and AWS CloudTrail

Access control lists (ACLs) are permission sets that define user access, and the operations users can take on specific resources. Amazon S3 was launched in 2006 with ACLs as its first authorization mechanism. Since 2011, Amazon S3 has also supported AWS Identity and Access Management (IAM) policies for managing access to S3 buckets, and recommends using […]

Maximizing price performance for big data workloads using Amazon EBS

Since the emergence of big data over a decade ago, Hadoop ­– an open-source framework that is used to efficiently store and process large datasets – has been crucial in storing, analyzing, and reducing that data to provide value for enterprises. Hadoop lets you store structured, partially structured, or unstructured data of any kind across […]

Simplify and scale access management to shared datasets with cross-account Amazon S3 Access Points

In today’s interconnected and data centric world, businesses must have access to the right data for data-driven decision-making, ultimately driving better business results. Collecting all the relevant data takes time and capital as it requires setting up data ingestion pipelines, hiring analysts to validate and interpret the data, and incorporating data insights that influence important […]

Isima.io optimizes price performance for OLAP workloads using Amazon EBS

Isima.io, a unified analytics startup founded in 2016, aims to accelerate analytics outcomes for organizations. Isimia.io does this by combining multiple data management disciplines – including Enterprise Service Bus (ESB), Extract-Transform-Load (ETL), Enterprise-Data-Warehouse (EDW), and Business Intelligence (BI) – into one hyper-converged system. IT teams can only win by building differentiated, agile data apps. The […]

Simplify archiving Amazon EBS Snapshots and monitor progress using a live Amazon CloudWatch dashboard

Data protection is top of mind for our customers, and having a data backup strategy is critical to ensure compliance, disaster recovery readiness, and business continuity. As customers experience exponential business growth, their data storage needs grow as well, and data retention can become very costly. In order to meet compliance requirements for data retention […]

Amazon S3 featured image - new

Run queries up to 9x faster using Trino with Amazon S3 Select on Amazon EMR

Customers building data lakes continue to innovate in the ways that they store and access their data. For these customers, performance is critical, particularly when they are accessing large amounts of data. For example, data scientists, data analysts, and data engineers running queries from open source frameworks like Trino want to accelerate access to their […]

AWS DataSync Featured Image 2020

How TMAP Mobility transferred 2.4 PB of Hadoop data using AWS DataSync

Launched in 2002, TMAP Mobility is Korea’s leading mobility platform, with 20 million registered users and 14 million monthly active users. TMAP provides navigation services based on a wide range of real-time traffic information and data. Previously, the Data Intelligence group at TMAP Mobility operated a mobility-data platform based on a Hadoop Distributed File System […]

Amazon S3 featured image - new

Using presigned URLs to identify per-requester usage of Amazon S3

Many software-as-a-service (SaaS) product offerings have a pay-as-you-go pricing model, charging customers only for the resources consumed. However, a pay-as-you-go pricing is only viable when you can accurately track each customer’s use of resources, such as compute capacity, storage, and networking bandwidth. Without this data, SaaS providers do not have visibility into resource consumption of […]

Amazon S3 Glacier Storage Classes

Restore data from Amazon S3 Glacier storage classes starting with partial object keys

When managing data storage, it is important to optimize for cost by storing data in the most cost-effective manner based on how often data is used or accessed. For many enterprises, this means using some form of cold storage or archiving for data that is less frequently accessed or used while keeping more frequently used […]