AWS Storage Blog

Category: Analytics

Amazon S3 featured image 2023

Siemens builds Datalake2Go on AWS to analyze disparate data globally

Siemens is a technology company focused on industry, infrastructure, transport, and healthcare. From resource-efficient factories, resilient supply chains, and smart buildings and grids, to cleaner and more comfortable transportation and advanced healthcare, the company creates technology with purpose, adding real value for its customers. Siemens technology is everywhere, supporting the critical infrastructure and vital industries […]

Amazon S3 Object Lock

Maintaining object immutability by automatically extending Amazon S3 Object Lock retention periods

Protecting against accidental or malicious deletion is a key element of data protection. Immutability protects data in-place, preventing unintended changes or deletions. However, sometimes it isn’t clear for how long data should be made immutable. Users in this situation are looking for a solution that maintains short-term immutability, indefinitely. They want to make sure their […]

Amazon S3 featured image 2023

Understand Amazon S3 data transfer costs by classifying requests with Amazon Athena

Cost is top of mind for many enterprises, and building awareness of different cost contributors is the first step toward managing costs and improving efficiency. Costs for transferring data may segregate into common but low cost and less frequent but higher cost groups. Data about these two groups is mixed together, and separating them enables […]

Amazon S3 featured image 2023

Identifying potential duplicate objects in Amazon S3

Update 6/6/2025: Added “Important considerations” section that calls out reliance on MD5 and updated title from “Managing duplicate…” to “Identifying potential…” for accuracy. When managing a large volume of data in a storage system, it is common for data duplication to happen. Data duplication in data management refers to the presence of multiple copies of […]

Amazon S3 featured image 2023

Automatic monitoring of actions taken on objects in Amazon S3

Administrators may need to monitor and audit actions, like uploads, updates, and deletes, taken on files and other data to comply with regulations or company policies. A scalable and reliable method of tracking and saving actions taken on files can reduce manual work and operational overhead while helping to ensure compliance. An event-based fanout architectures […]

Amazon S3 Object Lambda

Automatically modify data you are querying with Amazon Athena using Amazon S3 Object Lambda

Enterprises may want to customize their data sets for different requesting applications. For example, if you run an e-commerce website, you may want to mask Personally Identifiable Information (PII) when querying your data for analytics. Although you can create and store multiple customized copies of your data, that can increase your storage cost. You can […]

Amazon S3 featured image - new

How to enforce Amazon S3 Access Grants with Immuta

Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today evolved to adopt a lake house architecture that combines the scalability and cost effectiveness of data lakes with the performance and ease-of-use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]

Amazon S3 Archive Storage Classes

Simplify querying your archive data in Amazon S3 with Amazon Athena

Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]

Amazon S3 featured image - new

Getting visibility into storage usage in multi-tenant Amazon S3 buckets

SaaS providers with multi-tenant environments use cloud solutions to dynamically scale their workloads as customer demand increases. As their cloud footprint grows, having visibility into each end-customer’s storage consumption becomes important to distribute resources accordingly. An organization can use storage usage data per customer (tenant) to adjust its pricing model or better plan its budget. […]

Amazon S3 featured image - new

Consolidate and query Amazon S3 Inventory reports for Region-wide object-level visibility

Organizations around the world store billions of objects and files representing terabytes to petabytes of data. Data is often owned by different teams, departments, or business units, spanning multiple locations. As the amount of datastores, locations, and owners grow, you need a way to cost-effectively maintain visibility on important characteristics of your data, including based […]