AWS Database Blog

Category: Analytics

Run full text search queries on Amazon DocumentDB (with MongoDB compatibility) data with Amazon OpenSearch Service

In this post, we show you how to integrate Amazon DocumentDB with Amazon OpenSearch Service (formerly Amazon Elasticsearch Service, or Amazon ES) so you can run full-text search queries over your Amazon DocumentDB data. Specifically, we show you how to use an AWS Lambda function to stream events from your Amazon DocumentDB cluster’s change stream to an Amazon ES domain, where the data can then be searched.
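
As a rough illustration of the streaming step described above, the following sketch converts change stream events into a newline-delimited bulk indexing payload of the kind a search domain's `_bulk` endpoint accepts. It is not the code from the post: the function names, the `movies` index name, and the sample events are hypothetical, and the event shape assumed here is the MongoDB-style change event (`operationType`, `documentKey`, `fullDocument`). Reading the change stream and sending the payload to the domain are left out.

```python
import json
from typing import Any, Dict, Iterable, List


def change_event_to_bulk_lines(event: Dict[str, Any], index: str) -> List[str]:
    """Translate one change stream event into bulk action lines.

    Assumes a MongoDB-style change event. Inserts, updates, and replaces
    become an "index" action followed by the document body; deletes become
    a "delete" action. Any other event type is skipped.
    """
    op = event.get("operationType")
    doc_id = str(event["documentKey"]["_id"])

    if op in ("insert", "update", "replace"):
        doc = event.get("fullDocument")
        if doc is None:
            # Without the full document there is nothing to index.
            return []
        # "_id" is a metadata field in the search index, so it goes in the
        # action line rather than in the document body.
        body = {k: v for k, v in doc.items() if k != "_id"}
        action = {"index": {"_index": index, "_id": doc_id}}
        return [json.dumps(action), json.dumps(body, default=str)]

    if op == "delete":
        return [json.dumps({"delete": {"_index": index, "_id": doc_id}})]

    return []


def build_bulk_payload(events: Iterable[Dict[str, Any]], index: str) -> str:
    """Join the action lines for all events into one newline-delimited payload."""
    lines: List[str] = []
    for event in events:
        lines.extend(change_event_to_bulk_lines(event, index))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    # Hypothetical sample events, for illustration only.
    sample_events = [
        {
            "operationType": "insert",
            "documentKey": {"_id": "1"},
            "fullDocument": {"_id": "1", "title": "Example title"},
        },
        {"operationType": "delete", "documentKey": {"_id": "2"}},
    ]
    print(build_bulk_payload(sample_events, "movies"), end="")
```

In a Lambda function, a payload like this would typically be sent to the domain in a single signed HTTP request per batch, which keeps the number of calls low when the change stream is busy.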

Performing analytics on Amazon Managed Blockchain

Managed Blockchain follows an event-driven architecture. We can open up a wide range of analytics approaches by streaming events to Amazon Kinesis. For instance, we could analyze events in near real time with Kinesis Data Analytics, perform petabyte-scale data warehousing with Amazon Redshift, or use the Hadoop ecosystem with Amazon EMR. This allows us to use the right approach for every blockchain analytics use case.
In this post, we show you one approach that uses Amazon Kinesis Data Firehose to capture, monitor, and aggregate events into a dataset, and analyze it with Amazon Athena using standard SQL.

Accelerating Nylas’s feature development with AWS Data Lab

This is a guest post by David Ting, VP of Engineering at Nylas. In their own words, Nylas is a pioneer and leading provider of universal communications APIs that allow developers to quickly connect their applications to every email, calendar, or contacts provider in the world. Over 26,000 developers around the globe use the Nylas […]

How caresyntax uses managed database services for better surgical outcomes

This is a guest post from Ken Wu, Chief Technology Officer, and Steve Gordon, Director of Engineering at caresyntax. caresyntax provides the needed tools to make surgery smarter and safer. Our solutions use IoT, analytics, and AI technologies to automate clinical and operational decision support for surgical teams and support all outcome contributors. We help […]

Change data capture from Neo4j to Amazon Neptune using Amazon Managed Streaming for Apache Kafka

After you perform a point-in-time data migration from Neo4j to Amazon Neptune, you may want to capture and replicate ongoing updates in real time. For more information about automating point-in-time graph data migration from Neo4j to Neptune, see Migrating a Neo4j graph database to Amazon Neptune with a fully automated utility. This post walks you […]

Building a customer 360 knowledge repository with Amazon Neptune and Amazon Redshift

Organizations build and deploy large-scale data platforms like data lakes, data warehouses, and lakehouses to capture and analyze a holistic view of their customer’s journey. The objective of such a data platform is to understand customer behavior patterns that influence satisfaction and drive more engagement. Applications today capture each point of contact with a customer, […]

Building data lakes and implementing data retention policies with Amazon RDS snapshot export to Amazon S3

Amazon Relational Database Service (RDS) helps you easily create, operate, and scale a relational database in the cloud. In January 2020, AWS announced the ability to export snapshots from Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for MariaDB, Amazon Aurora PostgreSQL, and Amazon Aurora MySQL into Amazon S3 in Apache Parquet format. […]

Backfilling an Amazon DynamoDB Time to Live (TTL) attribute with Amazon EMR

If you have complex data types such as maps and lists in your Amazon DynamoDB data, refer to Part 2 of this series. Bulk updates to a database can be disruptive and potentially cause downtime, performance impacts to your business processes, or overprovisioning of compute and storage resources. When performing bulk updates, you want to […]

Reducing cost for small Amazon Elasticsearch Service domains

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. When you deploy your Amazon Elasticsearch Service (Amazon ES) domain to support a production workload, you must choose the type and number of data instances to use, the number of Availability Zones, and whether to use dedicated master instances or […]

How Zendesk tripled performance by moving a legacy system onto Amazon Aurora and Amazon Redshift

This is a guest post by James Byrne, Engineering Leader at Zendesk, focusing on data pipeline development and operations for the Zendesk Explore analytics product, and Giedrius Praspaliauskas, AWS Solutions Architect. Zendesk is a CRM company that builds support, sales, and customer engagement software designed to foster better customer relationships. From large enterprises to startups, […]