AWS Storage Blog
Category: Intermediate (200)
Identifying potential duplicate objects in Amazon S3
Update 6/6/2025: Added “Important considerations” section that calls out reliance on MD5 and updated title from “Managing duplicate…” to “Identifying potential…” for accuracy. When managing a large volume of data in a storage system, it is common for data duplication to happen. Data duplication in data management refers to the presence of multiple copies of […]
Automatic monitoring of actions taken on objects in Amazon S3
Administrators may need to monitor and audit actions, like uploads, updates, and deletes, taken on files and other data to comply with regulations or company policies. A scalable and reliable method of tracking and saving actions taken on files can reduce manual work and operational overhead while helping to ensure compliance. An event-based fanout architecture […]
Amazon S3 Express One Zone delivers cost and performance gains for ChaosSearch customers
ChaosSearch is an Amazon S3-native database built on a serverless, stateless compute architecture within AWS that delivers live search, SQL, and Generative AI analytics. At ChaosSearch, the speed and performance of our architecture are important to us and our customers because time to results is the difference between success and failure, and we rely on […]
Akridata accelerates processing of unstructured data with Amazon S3 Express One Zone
Deep learning processes often need to read full datasets, which are usually hundreds of gigabytes in size, before they can perform intelligent data processing. High data retrieval speed and low latency from storage are crucial for enterprises running these performance-critical workloads. Akridata, an AWS independent software vendor (ISV) partner, helps make artificial intelligence (AI)-assisted unstructured-data […]
lakeFS and Amazon S3 Express One Zone: Highly performant data version control for ML/AI
Machine learning presents a number of new challenges to data teams, calling for technology solutions that can support training and fine-tuning for performance-critical workloads. Data version control is one of the facets of high-performing ML pipelines, as it allows efficient experimentation and full ML pipeline reproducibility at scale. lakeFS by Treeverse, an AWS […]
ClickHouse Cloud & Amazon S3 Express One Zone: Making a blazing fast analytical database even faster
ClickHouse is a columnar database management system (DBMS) designed for blazing-fast real-time analytics. It was built to address the needs of interactive analytical applications requiring up-to-the-second analytics. To do that, it must support real-time data ingestion at the rate of hundreds of millions of events per second and run complex analytical queries, such as filtering, […]
Streamline data sharing and access control with Informatica Cloud Data Marketplace and Amazon S3 Access Grants
Organizations are modernizing their data lakes on Amazon Simple Storage Service (Amazon S3) to handle the ever-growing data volume and speed while meeting the demands of analytics, machine learning (ML), artificial intelligence (AI), and generative AI applications. To enable a data-driven culture and remain innovative, the data platform must allow for data-centric collaboration across business […]
How to enforce Amazon S3 Access Grants with Immuta
Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today have evolved to adopt a lake house architecture that combines the scalability and cost effectiveness of data lakes with the performance and ease of use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]
Simplify querying your archive data in Amazon S3 with Amazon Athena
Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]
Migrating Wasabi Object Storage to Amazon S3 using AWS DataSync
Update (5/29/2025): On May 29, 2025, AWS DataSync launched Enhanced mode support for cross-cloud transfers. Enhanced mode simplifies data transfers between AWS and other clouds by removing the need for a DataSync agent. It also provides higher performance and scalability when compared to Basic mode. For more details, see the What’s New announcement or review the documentation for guidance […]



