AWS Database Blog

Category: Analytics

Perform fuzzy full-text search and semantic search on Amazon DocumentDB using Amazon OpenSearch Service

In this post, we show you how to integrate Amazon DocumentDB (with MongoDB compatibility) with Amazon OpenSearch Service using AWS Lambda integration and run full-text search, fuzzy search, and synonym search on an artificially generated reviews dataset. Amazon DocumentDB is a fast, scalable, highly durable, and fully managed database service for operating mission-critical MongoDB API-compatible […]

Data consolidation for analytical applications using logical replication for Amazon RDS Multi-AZ clusters

Amazon Relational Database Service (Amazon RDS) Multi-AZ deployments provide enhanced availability and durability for your RDS database instances. You can deploy highly available, durable PostgreSQL databases in three Availability Zones using Amazon RDS Multi-AZ DB cluster deployments with two readable standby DB instances. With a Multi-AZ DB cluster, applications gain automatic failovers in typically under […]

The role of vector datastores in generative AI applications

Generative AI has captured our imagination and is transforming industries with its ability to answer questions, write stories, create art, and even generate code. AWS customers are increasingly asking us how they can best take advantage of generative AI in their own businesses. Most have accumulated a wealth of domain-specific data (financial records, health records, […]

Stream data from Amazon DocumentDB to Amazon Kinesis Data Firehose using AWS Lambda

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. In this post, we discuss how to create data pipelines from Amazon DocumentDB (with MongoDB compatibility) to Amazon Kinesis Data Firehose and publish changes to your destination store. Amazon DocumentDB (with […]

Migrate an Informix database to Amazon Aurora PostgreSQL using CData Connect Cloud from within AWS Glue Studio

Amazon Aurora PostgreSQL-Compatible Edition is a fully managed PostgreSQL-compatible database engine running in AWS and is a drop-in replacement for PostgreSQL. Aurora PostgreSQL is cost-effective to set up, operate, and scale, and can be deployed for new or existing applications. Informix is a relational database management system from IBM and supports OLTP and other workloads. […]

Stream data with Amazon DocumentDB, Amazon MSK Serverless, and Amazon MSK Connect

A common trend in modern application development and data processing is the use of Apache Kafka as a standard delivery mechanism for data pipelines and fan-out approaches. Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed, highly available, and secure service that makes it simple for developers and DevOps managers to run applications […]

Automate the migration of Microsoft SSIS packages to AWS Glue with AWS SCT

When you migrate Microsoft SQL Server workloads to AWS, you might want to automate migration and minimize changes to existing applications, but still use a cost-effective option without commercial licenses and reduce operational overhead. For example, SQL Server workloads often use SQL Server Integration Services (SSIS) to extract, transform, and load (ETL) data. In this […]

Migrate data from Apache HBase to Amazon DynamoDB

Over the last few years, organizations have started adopting a cloud-first strategy, and we are seeing enterprises migrate their mission-critical applications, along with their data platforms, to the cloud. Occasionally, organizations need guidance in selecting the right service and solution in the cloud, along with an approach to assist with the migration. In this […]

Joining historical data between Amazon Athena and Amazon RDS for PostgreSQL

While databases are used to store and retrieve data, there are situations where applications should archive or purge the data to reduce storage costs or improve performance. However, there are often business requirements where an application must query both active data and archived data simultaneously. Developers need a solution that lets them benefit from using […]

Security is time series: How VMware Carbon Black improves and scales security observability with Amazon Timestream

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. Amazon Timestream is a fast, serverless, and secure time series database and analytics service that can scale to process trillions of time series events per day. Organizations […]