AWS Database Blog

Category: Analytics

Streaming data to Amazon Managed Streaming for Apache Kafka using AWS DMS

AWS Database Migration Service (AWS DMS) announced support for Amazon Managed Streaming for Apache Kafka (Amazon MSK) and self-managed Apache Kafka clusters as targets. With AWS DMS, you can replicate ongoing changes from any supported source, such as Amazon Aurora (MySQL- and PostgreSQL-compatible), Oracle, or SQL Server, to Amazon MSK or a self-managed Apache Kafka cluster.
In this post, we use an e-commerce use case and set up the entire pipeline, with order data persisted in an Aurora MySQL database. We use AWS DMS to load and replicate this data to Amazon MSK. We then use the data to generate a live graph on our dashboard application.

Run full text search queries on Amazon DocumentDB (with MongoDB compatibility) data with Amazon Elasticsearch Service

In this post, we show you how to integrate Amazon DocumentDB with Amazon ES so you can run full-text search queries over your Amazon DocumentDB data. Specifically, we use an AWS Lambda function to stream events from your Amazon DocumentDB cluster’s change stream to an Amazon ES domain.
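
As a rough illustration of the change-stream-to-search idea, the sketch below turns a single change stream event into lines for the Elasticsearch bulk API. It is not the code from the post: the function names and the `products` index are hypothetical, and it assumes events follow the MongoDB-style change event shape (`operationType`, `documentKey`, `fullDocument`). Reading the change stream and sending the request to the Amazon ES domain are left out.

```python
import json


def change_event_to_bulk_lines(event, index_name):
    """Translate one change stream event into Elasticsearch bulk API lines.

    Hypothetical helper: assumes a MongoDB-style change event dictionary.
    Returns a list of JSON strings (empty if the event is not handled).
    """
    op = event.get("operationType")
    doc_id = str(event["documentKey"]["_id"])

    if op in ("insert", "update", "replace"):
        full_doc = event.get("fullDocument")
        if full_doc is None:
            # Update events carry the full document only when the stream
            # is opened with full-document lookup; skip otherwise.
            return []
        body = dict(full_doc)
        # _id is passed in the action line, not in the document body.
        body.pop("_id", None)
        action = {"index": {"_index": index_name, "_id": doc_id}}
        return [json.dumps(action), json.dumps(body)]

    if op == "delete":
        action = {"delete": {"_index": index_name, "_id": doc_id}}
        return [json.dumps(action)]

    # Other event types (for example, invalidate) are ignored in this sketch.
    return []


def build_bulk_body(events, index_name):
    """Join the lines for many events into one newline-delimited bulk body."""
    lines = []
    for event in events:
        lines.extend(change_event_to_bulk_lines(event, index_name))
    # The bulk API expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    sample = {
        "operationType": "insert",
        "documentKey": {"_id": 42},
        "fullDocument": {"_id": 42, "name": "lamp"},
    }
    print(build_bulk_body([sample], "products"))
```

In a Lambda function, a body like this would typically be posted to the domain's `_bulk` endpoint in batches, which keeps the number of requests low when the change stream is busy.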

Performing analytics on Amazon Managed Blockchain

Managed Blockchain follows an event-driven architecture. We can open up a wide range of analytic approaches by streaming events to Amazon Kinesis. For instance, we could analyze events in near-real time with Kinesis Data Analytics, perform petabyte-scale data warehousing with Amazon Redshift, or use the Hadoop ecosystem with Amazon EMR. This allows us to use the right approach for every blockchain analytics use case.
In this post, we show you one approach that uses Amazon Kinesis Data Firehose to capture, monitor, and aggregate events into a dataset, and analyze it with Amazon Athena using standard SQL.

Accelerating Nylas’s feature development with AWS Data Lab

This is a guest post by David Ting, VP of Engineering at Nylas. In their own words, Nylas is a pioneer and leading provider of universal communications APIs that allow developers to quickly connect their applications to every email, calendar, or contacts provider in the world. Over 26,000 developers around the globe use the Nylas […]

How caresyntax uses managed database services for better surgical outcomes

This is a guest post from Ken Wu, Chief Technology Officer, and Steve Gordon, Director of Engineering at caresyntax. caresyntax provides the needed tools to make surgery smarter and safer. Our solutions use IoT, analytics, and AI technologies to automate clinical and operational decision support for surgical teams and support all outcome contributors. We help […]

Change data capture from Neo4j to Amazon Neptune using Amazon Managed Streaming for Apache Kafka

After you perform a point-in-time data migration from Neo4j to Amazon Neptune, you may want to capture and replicate ongoing updates in real time. For more information about automating point-in-time graph data migration from Neo4j to Neptune, see Migrating a Neo4j graph database to Amazon Neptune with a fully automated utility. This post walks you […]

Building a customer 360 knowledge repository with Amazon Neptune and Amazon Redshift

Organizations build and deploy large-scale data platforms like data lakes, data warehouses, and lakehouses to capture and analyze a holistic view of their customer’s journey. The objective of such a data platform is to understand customer behavior patterns that influence satisfaction and drive more engagement. Applications today capture each point of contact with a customer, […]

Building data lakes and implementing data retention policies with Amazon RDS snapshot export to Amazon S3

Amazon Relational Database Service (RDS) helps you easily create, operate, and scale a relational database in the cloud. In January 2020, AWS announced the ability to export snapshots from Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for MariaDB, Amazon Aurora PostgreSQL, and Amazon Aurora MySQL into Amazon S3 in Apache Parquet format. […]

Backfilling an Amazon DynamoDB Time to Live (TTL) attribute with Amazon EMR

Bulk updates to a database can be disruptive and potentially cause downtime, performance impacts to your business processes, or overprovisioning of compute and storage resources. When performing bulk updates, you want to choose a process that runs quickly, enables you to operate your business uninterrupted, and minimizes your cost. Let’s take a look at how […]

Reducing cost for small Amazon Elasticsearch Service domains

When you deploy your Amazon Elasticsearch Service (Amazon ES) domain to support a production workload, you must choose the type and number of data instances to use, the number of Availability Zones, and whether to use dedicated master instances. To follow all the best practice recommendations, you must configure the following: Three dedicated […]
