AWS Big Data Blog
Category: Analytics
Build conversational AI search with Amazon OpenSearch Service
Amazon OpenSearch Service is a versatile search and analytics tool. In this post, we explore conversational search, its architecture, and various ways to implement it.
Enhance stability with dedicated cluster manager nodes using Amazon OpenSearch Service
In this post, we show how deploying dedicated cluster manager nodes improves the stability and reliability of your OpenSearch Service domain.
Kaltura reduces observability operational costs by 60% with Amazon OpenSearch Service
In this post, we share how Kaltura transformed its observability strategy and technological stack by migrating from a software as a service (SaaS) logging solution to Amazon OpenSearch Service—achieving higher log retention, a 60% reduction in cost, and a centralized platform that empowers multiple teams with real-time insights.
Introducing GenAI-powered business description recommendations for custom assets in Amazon SageMaker Catalog
Amazon SageMaker Catalog now supports generative AI-powered recommendations for business descriptions, including table summaries, use cases, and column-level descriptions for custom structured assets registered programmatically. In this post, we demonstrate how to generate AI recommendations for business descriptions for custom structured assets in SageMaker Catalog.
Amazon Redshift Python user-defined functions will reach end of support after June 30, 2026
The Amazon Redshift integration with AWS Lambda provides the capability to create Amazon Redshift Lambda user-defined functions (UDFs). Because Lambda UDFs provide significant advantages in integration, flexibility, scalability, and security, we will be ending support for Python UDFs in Amazon Redshift. In this post, we walk you through how to migrate your existing Python UDFs to Lambda UDFs, set up monitoring and cost evaluations, and review key considerations for a smooth transition.
Enforce table level access control on data lake tables using AWS Glue 5.0 with AWS Lake Formation
In this post, we show you how to enforce full-table access (FTA) control on AWS Glue 5.0 through Lake Formation permissions.
Building serverless event streaming applications with Amazon MSK and AWS Lambda
In this post, we describe how you can simplify your event-driven application architecture using AWS Lambda with Amazon MSK. We demonstrate how to configure Lambda as a consumer for Kafka topics, including a cross-account setup and how to optimize price and performance for these applications.
Enhance data ingestion performance in Amazon Redshift with concurrent inserts
Amazon Redshift employs columnar storage for database tables, reducing overall disk I/O requirements. This storage method significantly improves analytic query performance by minimizing the amount of data read during queries. This post showcases the key improvements in Amazon Redshift concurrent data ingestion operations.
Introducing AWS Glue Data Catalog usage metrics for API usage
We’re excited to announce AWS Glue Data Catalog usage metrics, a new feature that provides native integration with Amazon CloudWatch. In this post, we provide a step-by-step walkthrough that demonstrates how to access these metrics and set up meaningful alarms.
Amazon OpenSearch Service 101: Create your first search application with OpenSearch
In this post, we walk you through the process of building a search application using Amazon OpenSearch Service. Whether you’re a developer new to search or looking to understand OpenSearch fundamentals, this hands-on post shows you how to build a search application from scratch: starting with the initial setup, diving into core components such as indexing, querying, and result presentation, and culminating in the execution of your first search query.