AWS Storage Blog

Category: Amazon Simple Storage Service (S3)

Building self-managed RAG applications with Amazon EKS and Amazon S3 Vectors

Retrieval-Augmented Generation (RAG) is a technique that optimizes large language model (LLM) outputs by referencing authoritative knowledge bases outside of the model’s training data before generating responses. This addresses common limitations of traditional LLMs, such as outdated knowledge, hallucinated facts, and misinterpreted terminology. Organizations can implement RAG to enhance their generative AI applications with current, […]

Amazon S3 Tables

How Zeta Global scales multi-tenant data ingestion with Amazon S3 Tables

Zeta Global is a data-driven marketing technology company that uses consumer insights to empower brands in customer acquisition, growth, and retention. At the core of its operations is the Zeta Marketing Platform, an advanced system that applies sophisticated AI and machine learning (ML) capabilities on proprietary data from over 245 million U.S. consumer profiles. This […]

Enforcing organization-wide Amazon S3 bucket-tagging policies

In today’s complex cloud environments, maintaining consistent resource tagging is a critical challenge faced by organizations of all sizes. Proper resource tagging is essential for cost allocation, security compliance, operational management, and maintaining governance at scale. However, enforcing tagging standards across distributed teams and numerous resources can be difficult, especially when dealing with rapid deployment […]

Amazon S3 Batch Operations featured image

Efficiently verify Amazon S3 data at scale with compute checksum operation

Organizations across industries must regularly verify the integrity of their stored datasets to protect valuable information, satisfy compliance requirements, and preserve trust. Media and entertainment customers validate assets to make sure that content remains intact, financial institutions run integrity checks to meet regulatory obligations, and research institutions confirm the reproducibility of scientific results. These verifications […]

Resilience by design: Building an effective ransomware recovery strategy

Ransomware events have become a board room priority for modern organizations. The data shows a clear trend: ransomware events have more than doubled since the pandemic began, with the financial services sector experiencing particularly high targeting rates. At AWS, our cross-field collaboration with global financial services customers, regulators, governing bodies and industry partners has resulted […]

Amazon S3 Express One Zone thumbnail

Simplify log rotation with Amazon S3 Express One Zone

Log rotation is a standard operational practice for maintaining system health and performance while managing storage costs effectively. This practice involves systematically archiving log files to prevent them from consuming excessive storage. When a log file reaches a certain size or age, it’s rotated—meaning the current file is archived with a new name and a […]

Mountpoint for Amazon S3 CSI driver v2: Accelerated performance and improved resource usage for Kubernetes workloads

Amazon S3 is the best place to build data lakes because of its durability, availability, scalability, and security. In 2023, we introduced Mountpoint for Amazon S3, an open source file client that allows Linux-based applications to access S3 objects through a file API. Shortly after, we took this one step further with the Mountpoint for […]

Amazon S3 Tables

Faster threat detection at scale: Real-time cybersecurity graph analytics with PuppyGraph and Amazon S3 Tables

Modern cybersecurity teams are facing unprecedented challenges in data analysis by the scale, complexity, and velocity of data. Cloud environments continuously generate massive amounts information in form of access logs, configuration changes, alerts, and telemetry. Traditional analysis methods of looking at these data points in isolation can’t effectively detect threats such as lateral movement and […]

Amazon S3 Batch Operations featured image

Copy objects between any Amazon S3 storage classes using S3 Batch Operations

When storing data, choosing the storage class that is best suited for your particular needs allows you to optimize your storage costs, performance, and object availability. However, over time, the access patterns for your objects can change, which means you may need to migrate your objects to a different storage class to continue optimize for […]

Amazon S3 Glacier Storage Classes

Cost-efficient backup archiving with Veeam Direct to Amazon S3 Glacier storage classes

If you work with data storage and data protection, then you’re aware of the “3-2-1 rule.” System administrators consider the 3-2-1 rule a best practice for backup and disaster recovery (DR), and it is recommended by US-CERT. The 3-2-1 rule states that you should have three copies of your data (your production data and two […]