AWS Big Data Blog

Category: Intermediate (200)

Create an OpenSearch dashboard with Amazon OpenSearch Service

This post demonstrates how to harness OpenSearch Dashboards to analyze logs visually and interactively. With this solution, IT administrators, developers, and DevOps engineers can create custom dashboards to monitor system behavior, detect anomalies early, and troubleshoot issues faster through interactive charts and graphs.

Develop and deploy a generative AI application using Amazon SageMaker Unified Studio

In this post, we demonstrate how to use Amazon Bedrock Flows in SageMaker Unified Studio to build a sophisticated generative AI application for financial analysis and investment decision-making.

Build an analytics pipeline that is resilient to Avro schema changes using Amazon Athena

This post demonstrates how to build a solution by combining Amazon Simple Storage Service (Amazon S3) for data storage, AWS Glue Data Catalog for schema management, and Amazon Athena for one-time querying. We’ll focus specifically on handling Avro-formatted data in partitioned S3 buckets, where schemas can change frequently while providing consistent query capabilities across all data regardless of schema versions.

Integrating Amazon OpenSearch Ingestion with Amazon RDS and Amazon Aurora

We are happy to announce the general availability of the integration of Amazon OpenSearch Service with Amazon Relational Database Service (Amazon RDS) and Amazon Aurora. This new integration eliminates complex data pipelines and enables near real-time data synchronization between Amazon Aurora (including Amazon Aurora MySQL-Compatible Edition and Amazon Aurora PostgreSQL-Compatible Edition) and Amazon RDS databases (including Amazon RDS for MySQL and Amazon RDS for PostgreSQL), and Amazon OpenSearch Service, unlocking advanced search capabilities such as hybrid search, ranked results, and faceted search on transactional databases.

Orchestrate data processing jobs, querybooks, and notebooks using visual workflow experience in Amazon SageMaker

Today, we are excited to launch a new visual workflows builder in SageMaker Unified Studio. With the new visual workflow experience, you don’t need to code the Python DAGs manually. Instead, you can visually define the orchestration workflow in SageMaker Unified Studio, and the visual definition is automatically converted to a Python DAG definition that is supported in Airflow.This post demonstrates the new visual workflow experience in SageMaker Unified Studio.

Near real-time baggage operational insights for airlines using Amazon Kinesis Data Streams

This post explores a framework developed by IBM to modernize baggage analytics using AWS managed services like Amazon Kinesis Data Streams, DynamoDB Streams, and other AWS services within a serverless architecture. The solution enables near real-time baggage operational insights for airlines, delivering cost savings, enhanced scalability, and improved performance while providing better security and operational efficiency to meet evolving airline needs.

Overcome your Kafka Connect challenges with Amazon Data Firehose

We’re happy to announce a new feature in the Amazon Data Firehose integration with Amazon MSK. You can now specify the Firehose stream to either read from the earliest position on the Kafka topic or from a custom timestamp to begin reading from your MSK topic. In this post of this series, we focus on managed data delivery from Kafka to your data lake.