AWS Database Blog

Category: Amazon Titan

Running pgvector in production on Amazon Aurora PostgreSQL

Running pgvector on Amazon Aurora PostgreSQL gives you a production-grade vector store on a database you already know, backed by the operational tooling, high availability, and scaling behaviour of Amazon Aurora. Production traffic does introduce a predictable set of operational considerations: query latency as the corpus grows, recall on filtered vector searches, memory headroom during index builds, and connection behaviour under load. This post is scoped to the database operations that keep the RAG retrieval layer healthy. In this post, we cover the operational practices that keep a pgvector workload healthy once you depend on it: choosing the right index and distance function, scaling with quantization and partitioning, managing Hierarchical Navigable Small World (HNSW) churn, sizing for memory-resident operation, and the observability signals that catch problems early.