AWS Big Data Blog
Category: Amazon OpenSearch Service
Amazon OpenSearch Service H1 2023 in review
Since its release in January 2021, the OpenSearch project has released 14 versions through June 2023. Amazon OpenSearch Service supports the latest versions of OpenSearch up to version 2.7. OpenSearch Service provides two configuration options to deploy and operate OpenSearch at scale in the cloud. With OpenSearch Service managed domains, you specify a hardware configuration […]
Optimizing Amazon OpenSearch Service performance: Fine-tuning shard size with Amazon CloudWatch storage and shard skew health
In this post, we explore how to deploy Amazon CloudWatch metrics using an AWS CloudFormation template to monitor an OpenSearch Service domain’s storage and shard skew, as well as shard sizes. This solution uses an AWS Lambda function to extract storage and shard distribution metadata from your OpenSearch Service domain, calculates the level of skew and shard sizes, and then pushes this information to CloudWatch metrics so that you can easily monitor, alert, and respond. This information will help you to meet the recommended settings for read and write throughput, performance, and fault tolerance.
Try semantic search with the Amazon OpenSearch Service vector engine
Amazon OpenSearch Service has long supported both lexical and vector search, since the introduction of its kNN plugin in 2020. With recent developments in generative AI, including AWS’s launch of Amazon Bedrock earlier in 2023, you can now use Amazon Bedrock-hosted models in conjunction with the vector database capabilities of OpenSearch Service, allowing you to implement semantic search, retrieval augmented generation (RAG), recommendation engines, and rich media search based on high-quality vector search. The recent launch of the vector engine for Amazon OpenSearch Serverless makes it even easier to deploy such solutions.
Amazon OpenSearch Serverless expands support for larger workloads and collections
We recently announced new enhancements to Amazon OpenSearch Serverless that can scan and search source data sizes of up to 6 TB. At launch, OpenSearch Serverless supported searching one or more indexes within a collection, with the total combined size of up to 1 TB. With the support for 6 TB source data, you can now scale up your log analytics, machine learning applications, and ecommerce data more effectively. With OpenSearch Serverless, you can enjoy the benefits of these expanded limits without having to worry about sizing, monitoring your usage, or manually scaling an OpenSearch domain.
Configure SAML federation for Amazon OpenSearch Serverless with Okta
Modern applications apply security controls across many systems and their subsystems. Keeping all of these systems in sync would be a major undertaking if you tried to implement it separately. Centralized identity management is the way to maintain a single identity provider (IdP) that can authenticate actors and manage and distribute their rights. OpenSearch is […]
How FIS ingests and searches vector data for quick ticket resolution with Amazon OpenSearch Service
This post was co-written by Sheel Saket, Senior Data Science Manager at FIS, and Rupesh Tiwari, Senior Architect at Amazon Web Services. Do you ever find yourself grappling with multiple defect logging mechanisms, scattered project management tools, and fragmented software development platforms? Have you experienced the frustration of lacking a unified view, hindering your ability […]
Introducing the vector engine for Amazon OpenSearch Serverless, now in preview
We are pleased to announce the preview release of the vector engine for Amazon OpenSearch Serverless. The vector engine provides a simple, scalable, and high-performing similarity search capability in Amazon OpenSearch Serverless that makes it easy for you to build modern machine learning (ML) augmented search experiences and generative artificial intelligence (AI) applications without having […]
Alcion supports their multi-tenant platform with Amazon OpenSearch Serverless
This is a guest blog post co-written with Zack Rossman from Alcion. Alcion, a security-first, AI-driven backup-as-a-service (BaaS) platform, helps Microsoft 365 administrators quickly and intuitively protect data from cyber threats and accidental data loss. In the event of data loss, Alcion customers need to search metadata for the backed-up items (files, emails, contacts, events, […]
Build a serverless log analytics pipeline using Amazon OpenSearch Ingestion with managed Amazon OpenSearch Service
In this post, we show how to build a log ingestion pipeline using the new Amazon OpenSearch Ingestion, a fully managed data collector that delivers real-time log and trace data to Amazon OpenSearch Service domains. OpenSearch Ingestion is powered by the open-source data collector Data Prepper. Data Prepper is part of the open-source OpenSearch project. […]
Access Amazon OpenSearch Serverless collections using a VPC endpoint
Amazon OpenSearch Serverless helps you index, analyze, and search your logs and data using OpenSearch APIs and dashboards. The OpenSearch Serverless collection is a group of indexes. API and dashboard clients can access the collections from public networks or one or more VPCs. For VPC access to collections and dashboards, you can create VPC endpoints. […]