AWS Database Blog

Category: Analytics

Export and analyze Amazon QLDB journal data using AWS Glue and Amazon Athena

Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that maintains a complete, immutable record of every change committed to the database. As transactions are committed to the database, they are appended to a transaction log called a journal and are cryptographically hash-chained to the previous transaction. Once committed, the record of […]

Stream data with Amazon DocumentDB and Amazon MSK using a Kafka connector

A common trend in modern application development and data processing is the use of Apache Kafka as the standard delivery mechanism for data pipelines and fan-out approaches. Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed, highly available, and secure service that makes it simple for developers and DevOps managers to […]

Access Bitcoin and Ethereum open datasets for cross-chain analytics

In this post, we share an open-source solution for running cross-chain analytics on public blockchain data, along with public datasets for Bitcoin and Ethereum available through AWS Open Data. These datasets are still experimental and are not recommended for production workloads. You can find the open-source project on GitHub and the public blockchain datasets […]

Modernize legacy databases using event sourcing and CQRS with AWS DMS

When moving from monoliths to microservices, you often need to propagate the same data from the monolith into multiple downstream data stores. These include purpose-built databases serving microservices as part of a decomposition project, Amazon Simple Storage Service (Amazon S3) for hydrating a data lake, or as part of a long-running command query responsibility segregation […]

Migrate from Azure Cosmos DB to Amazon DynamoDB using AWS Glue

To take advantage of the performance, security, and scale of Amazon DynamoDB, customers want to migrate their data from their existing NoSQL databases in a way that is cost-optimized and performant. In this post, we show you how to migrate data from Azure Cosmos DB to Amazon DynamoDB through an offline migration approach using AWS […]

Archive data from Amazon DynamoDB to Amazon S3 using TTL and Amazon Kinesis integration

In this post, we share how you can use Amazon Kinesis integration and the Amazon DynamoDB Time to Live (TTL) feature to design data archiving. Archiving old data helps reduce costs and meet regulatory requirements governing data retention or deletion policies. Amazon Kinesis Data Streams for DynamoDB captures item-level modifications in a DynamoDB table and […]

Combine Amazon Neptune and Amazon OpenSearch Service for geospatial queries

Many AWS customers are looking to solve their business problems by storing and integrating data across a combination of purpose-built databases. Purpose-built databases provide innovative ways to build data access patterns that would otherwise be challenging or inefficient to implement. For example, we can model highly connected geospatial data as […]

Store and stream sports data feeds using Amazon DynamoDB and Amazon Kinesis Data Streams

Online bookmakers are innovating to offer their clients continuously updated sports data feeds that allow betting throughout the duration of matches. In this post, we walk through a solution to ingest, store, and stream sports data feeds in near real-time using Amazon API Gateway, Amazon DynamoDB, and Amazon Kinesis Data Streams. In betting, odds represent […]

Build interactive graph data analytics and visualizations using Amazon Neptune, Amazon Athena Federated Query, and Amazon QuickSight

Customers have asked for a way to interact with graph datasets in Amazon Neptune using business intelligence (BI) tools such as Amazon QuickSight. Although some BI tools offer generic HTTP connectors that allow you to define a set of REST API calls to extract data from REST endpoints, you have to predefine either Gremlin or […]

Build a fault-tolerant, serverless data aggregation pipeline with exactly-once processing

Customers in industries such as manufacturing, retail, gaming, utilities, and financial services face the business problem of real-time data aggregation. In a previous post, we discussed an example from the banking industry: real-time trade risk aggregation. Typically, financial institutions associate every trade that is performed on the trading floor with a risk value […]