AWS Big Data Blog

Category: Amazon OpenSearch Service

Use DeepSeek with Amazon OpenSearch Service vector database and Amazon SageMaker

OpenSearch Service provides rich capabilities for retrieval-augmented generation (RAG) use cases, as well as vector embedding-powered semantic search. You can use the flexible connector framework and search flow pipelines in OpenSearch to connect to models hosted by DeepSeek, Cohere, and OpenAI, as well as models hosted on Amazon Bedrock and SageMaker. In this post, we build a connection to DeepSeek’s text generation model, supporting a RAG workflow to generate text responses to user queries.

OpenSearch Vector Engine is now disk-optimized for low cost, accurate vector search

OpenSearch Vector Engine can now run vector search at a third of the cost on OpenSearch 2.17+ domains. You can now configure k-NN (vector) indexes to run in disk mode, which optimizes them for memory-constrained environments and enables low-cost, accurate vector search that responds in the low hundreds of milliseconds. Disk mode provides an economical alternative to memory mode when you don’t need near single-digit millisecond latency. In this post, you’ll learn about the benefits of this new feature, its underlying mechanics, customer success stories, and how to get started.
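
As a rough illustration of the disk mode configuration described above, the following minimal sketch builds a request body for creating a k-NN index whose vector field is set to `on_disk` mode. The helper name `disk_mode_index_body`, the field name, and the dimension are hypothetical choices made for this example. The body is only constructed, not sent to a domain, and the exact settings you need may differ from what the post itself uses.

```python
import json


def disk_mode_index_body(field_name: str, dimension: int) -> dict:
    """Build a create-index body for a k-NN index with an on-disk vector field.

    The field name and dimension are illustrative; adjust them to your data.
    """
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return {
        # Enable k-NN search on the index.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field_name: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    # Request disk mode instead of the default in-memory mode.
                    "mode": "on_disk",
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON you would send as the body of an index creation request.
    print(json.dumps(disk_mode_index_body("embedding", 768), indent=2))
```

You would send the resulting JSON as the body of an index creation request against an OpenSearch 2.17+ domain, using whichever client you already have in place.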

Generate vector embeddings for your data using AWS Lambda as a processor for Amazon OpenSearch Ingestion

In this post, we demonstrate how to use the OpenSearch Ingestion Lambda processor to generate embeddings for your source data and ingest them into an OpenSearch Serverless vector collection. This solution uses the flexibility of OpenSearch Ingestion pipelines with a Lambda processor to dynamically generate embeddings.

Juicebox recruits Amazon OpenSearch Service’s vector database for improved talent search

Juicebox is an AI-powered talent sourcing search engine, using advanced natural language models to help recruiters identify the best candidates from a vast dataset of over 800 million profiles. At the core of this functionality is Amazon OpenSearch Service, which provides the backbone for Juicebox’s powerful search infrastructure, enabling a seamless combination of traditional full-text search methods with modern, cutting-edge semantic search capabilities. In this post, we share how Juicebox uses OpenSearch Service for improved search.

Cost Optimized Vector Database: Introduction to Amazon OpenSearch Service quantization techniques

This blog post introduces a new disk-based vector search approach that allows efficient querying of vectors stored on disk without loading them entirely into memory. By applying the quantization methods it describes, organizations can achieve compression ratios of up to 64x, enabling cost-effective scaling of vector databases for large-scale AI and machine learning applications.
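
To put the compression ratios above in perspective, here is a back-of-the-envelope sketch of how much space raw vectors occupy at a given ratio. It assumes uncompressed vectors are stored as 4-byte floats and ignores index overhead such as graph structures, so the numbers are only indicative. The function name and the example corpus size are hypothetical.

```python
def vector_memory_bytes(
    num_vectors: int,
    dimension: int,
    compression_ratio: float = 1.0,
    bytes_per_dim: int = 4,
) -> float:
    """Estimate bytes needed for raw vector data at a given compression ratio.

    Assumes 4-byte floats before compression and ignores index overhead.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return num_vectors * dimension * bytes_per_dim / compression_ratio


if __name__ == "__main__":
    # Hypothetical corpus: one million 768-dimensional vectors.
    for ratio in (1, 32, 64):
        size_mb = vector_memory_bytes(1_000_000, 768, ratio) / 1_000_000
        print(f"{ratio}x compression: about {size_mb:,.0f} MB")
```

Under these assumptions, one million 768-dimensional vectors take about 3,072 MB uncompressed, 96 MB at 32x, and 48 MB at 64x.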

Use CI/CD best practices to automate Amazon OpenSearch Service cluster management operations

This post explores how to automate Amazon OpenSearch Service cluster management using CI/CD best practices. It presents two options: the Terraform OpenSearch provider and the Evolution library. The solution demonstrates how to use AWS CDK, Lambda, and CodeBuild to implement automated index template creation and management. By applying these techniques, organizations can improve the consistency, reliability, and efficiency of their OpenSearch operations.

Enhancing Search Relevancy with Cohere Rerank 3.5 and Amazon OpenSearch Service

In this blog post, we’ll dive into scenarios in which Cohere Rerank 3.5 improves search results for both Best Matching 25 (BM25), a keyword-based algorithm that performs lexical search, and semantic search. We will also cover how businesses can significantly improve user experience, increase engagement, and ultimately drive better search outcomes by implementing a reranking pipeline.
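
The general shape of a reranking pipeline can be sketched as a second stage that rescores first-stage candidates and keeps the best ones. The example below is a minimal illustration, not the post's implementation: `rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is a simple stand-in for a real reranking model such as Cohere Rerank 3.5, which this sketch does not call.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[Tuple[str, float]]:
    """Rescore first-stage candidates and return the top_n by descending score."""
    scored = [(doc, score_fn(query, doc)) for doc in candidates]
    # Python's sort is stable, so ties keep their first-stage order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, doc: str) -> float:
    """Stand-in relevance score: fraction of query words present in the document."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


if __name__ == "__main__":
    # Pretend these came back from a first-stage BM25 or semantic query.
    first_stage = ["blue dress shoes", "red wine", "red running shoes"]
    for doc, score in rerank("red shoes", first_stage, toy_overlap_score, top_n=2):
        print(f"{score:.2f}  {doc}")
```

In a real deployment, the scoring function would be replaced by a call to the reranking model, and the candidates would come from the first-stage lexical or semantic query.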

Intel Accelerators on Amazon OpenSearch Service improve price-performance on vector search by up to 51%

OpenSearch Service is a managed service for the OpenSearch search and analytics suite, which includes support for vector search. By running your OpenSearch 2.17+ domains on C/M/R 7i instances, you can achieve up to a 51% price-performance gain compared to previous-generation R5 instances on OpenSearch Service. As we discuss in this post, this launch can lower your infrastructure total cost of ownership (TCO).