AWS Database Blog

Category: Analytics

Export and analyze Amazon DynamoDB data in an Amazon S3 data lake in Apache Parquet format

January 2023: Please refer to Accelerate Amazon DynamoDB data access in AWS Glue jobs using the new AWS Glue DynamoDB Export connector for more recent updates on using AWS Glue to extract data from Amazon DynamoDB. Amazon DynamoDB is a key-value and document database that delivers single-digit millisecond performance at any scale. It’s a fully […]

Creating Amazon Timestream interpolated views using Amazon Kinesis Data Analytics for Apache Flink

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Many organizations have accelerated their adoption of stream data processing technologies in an effort to more quickly derive actionable insights from their data. Frequently, it is required […]

Cross-account replication with Amazon DynamoDB

Update: For loading data into new DynamoDB tables, use the Import from S3 feature (announced in August 2022). Hundreds of thousands of customers use Amazon DynamoDB for mission-critical workloads. In some situations, you may want to migrate your DynamoDB tables into a different AWS account, for example, in the eventuality of a company being acquired […]

Streaming data to Amazon Managed Streaming for Apache Kafka using AWS DMS

AWS Database Migration Service (AWS DMS) announced support for Amazon Managed Streaming for Apache Kafka (Amazon MSK) and self-managed Apache Kafka clusters as targets. With AWS DMS, you can replicate ongoing changes from any DMS-supported source, such as Amazon Aurora (MySQL and PostgreSQL-compatible), Oracle, and SQL Server, to Amazon MSK and self-managed Apache Kafka clusters.
In this post, we use an e-commerce use case and set up the entire pipeline with the order data being persisted in an Aurora MySQL database. We use AWS DMS to load and replicate this data to Amazon MSK. We then use the data to generate a live graph on our dashboard application.

Run full text search queries on Amazon DocumentDB (with MongoDB compatibility) data with Amazon OpenSearch Service

In this post, we show you how to integrate Amazon DocumentDB with Amazon Elasticsearch Service (Amazon ES, since renamed Amazon OpenSearch Service) so you can run full text search queries over your Amazon DocumentDB data. Specifically, we show you how to use an AWS Lambda function to stream events from your Amazon DocumentDB cluster’s change stream to an Amazon ES domain so you can run full text search queries on the data.

Performing analytics on Amazon Managed Blockchain

Managed Blockchain follows an event-driven architecture. We can open up a wide range of analytic approaches by streaming events to Amazon Kinesis. For instance, we could analyze events in near-real time with Kinesis Data Analytics, perform petabyte-scale data warehousing with Amazon Redshift, or use the Hadoop ecosystem with Amazon EMR. This allows us to use the right approach for every blockchain analytics use case.
In this post, we show you one approach that uses Amazon Kinesis Data Firehose to capture, monitor, and aggregate events into a dataset, and analyze it with Amazon Athena using standard SQL.

Accelerating Nylas’s feature development with AWS Data Lab

This is a guest post by David Ting, VP of Engineering at Nylas. In their own words, Nylas is a pioneer and leading provider of universal communications APIs that allow developers to quickly connect their applications to every email, calendar, or contacts provider in the world. Over 26,000 developers around the globe use the Nylas […]

How caresyntax uses managed database services for better surgical outcomes

This is a guest post from Ken Wu, Chief Technology Officer, and Steve Gordon, Director of Engineering at caresyntax. caresyntax provides the needed tools to make surgery smarter and safer. Our solutions use IoT, analytics, and AI technologies to automate clinical and operational decision support for surgical teams and support all outcome contributors. We help […]

Change data capture from Neo4j to Amazon Neptune using Amazon Managed Streaming for Apache Kafka

After you perform a point-in-time data migration from Neo4j to Amazon Neptune, you may want to capture and replicate ongoing updates in real time. For more information about automating point-in-time graph data migration from Neo4j to Neptune, see Migrating a Neo4j graph database to Amazon Neptune with a fully automated utility. This post walks you […]

Building a customer 360 knowledge repository with Amazon Neptune and Amazon Redshift

Organizations build and deploy large-scale data platforms like data lakes, data warehouses, and lakehouses to capture and analyze a holistic view of their customers’ journeys. The objective of such a data platform is to understand customer behavior patterns that influence satisfaction and drive more engagement. Applications today capture each point of contact with a customer, […]