AWS Storage Blog
Building persistent memory for multi-agent AI systems with Amazon S3 Vectors
The most capable multi-agent AI systems share a common trait: they give agents the right context at the right time. When agents lack access to shared history, including what other agents discovered, what tasks are already complete, and what decisions were made in previous sessions, they might duplicate work, contradict each other, and burn through […]
Building self-managed RAG applications with Amazon EKS and Amazon S3 Vectors
Retrieval-Augmented Generation (RAG) is a technique that optimizes large language model (LLM) outputs by referencing authoritative knowledge bases outside of the model’s training data before generating responses. This addresses common limitations of traditional LLMs, such as outdated knowledge, hallucinated facts, and misinterpreted terminology. Organizations can implement RAG to enhance their generative AI applications with current, […]
Architecting scalable checkpoint storage for large-scale ML training on AWS
The exponential growth in size and complexity of foundation models (FMs) has created unprecedented infrastructure demands across compute, networking, and storage resources. Storage systems in particular face intense requirements for throughput, latency, and capacity. In machine learning (ML) model training, these demands are most evident in checkpointing, a critical reliability mechanism that periodically saves and […]
