AWS Database Blog

Category: AWS Glue

Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse – Part 2

Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse allows you to run analytics workloads on your DynamoDB data without having to set up and manage extract, transform, and load (ETL) pipelines. In this post, we cover setting up Amazon SageMaker Unified Studio, followed by running data analysis to showcase its capabilities. We illustrate our solution walkthrough with an example of a credit card company that wants to analyze its customer behavior and spending trends.

Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse – Part 1

Amazon DynamoDB zero-ETL integration with Amazon SageMaker Lakehouse allows you to run analytics workloads on your DynamoDB data without having to set up and manage extract, transform, and load (ETL) pipelines. In this two-part series, we first walk through the prerequisites and initial setup for the zero-ETL integration. In Part 2, we cover setting up Amazon SageMaker Unified Studio, followed by running data analysis to showcase its capabilities. We illustrate our solution walkthrough with an example of a credit card company that wants to analyze its customer behavior and spending trends.

Gather organization-wide Amazon RDS orphan snapshot insights using AWS Step Functions and Amazon QuickSight

In this post, we walk you through a solution to aggregate RDS orphan snapshots across accounts and AWS Regions, enabling automation and organization-wide visibility to optimize cloud spend based on data-driven insights. Cross-Region copied snapshots, Aurora cluster copied snapshots, and shared snapshots are out of scope for this solution. The solution uses AWS Step Functions orchestration together with AWS Lambda functions to generate orphan snapshot metadata across your organization. The generated metadata is stored in Amazon Simple Storage Service (Amazon S3) and transformed into an Amazon Athena table by AWS Glue. Amazon QuickSight uses the Athena table to generate orphan snapshot insights.
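
As a rough illustration of the kind of check a Lambda function in such a workflow might run, the following minimal sketch flags snapshots whose source instance is gone. It assumes that an "orphan" snapshot is a manual snapshot whose source DB instance no longer exists; the excerpt does not give this definition. The function name `find_orphan_snapshots` and the sample records are hypothetical. The dictionary keys follow the field names returned by the RDS `DescribeDBSnapshots` API.

```python
from typing import Dict, Iterable, List


def find_orphan_snapshots(
    snapshots: Iterable[Dict[str, str]],
    live_instance_ids: Iterable[str],
) -> List[Dict[str, str]]:
    """Return manual snapshots whose source DB instance is not live.

    Assumption: "orphan" means a manual snapshot whose source
    instance no longer exists. This is an illustrative definition,
    not one taken from the post.
    """
    live = set(live_instance_ids)
    return [
        snap
        for snap in snapshots
        if snap.get("SnapshotType") == "manual"
        and snap.get("DBInstanceIdentifier") not in live
    ]


# Hypothetical sample data shaped like DescribeDBSnapshots records.
snapshots = [
    {"DBSnapshotIdentifier": "snap-a", "DBInstanceIdentifier": "db-1", "SnapshotType": "manual"},
    {"DBSnapshotIdentifier": "snap-b", "DBInstanceIdentifier": "db-2", "SnapshotType": "manual"},
    {"DBSnapshotIdentifier": "snap-c", "DBInstanceIdentifier": "db-2", "SnapshotType": "automated"},
]

# Only db-1 still exists, so the manual snapshot of db-2 is flagged.
orphans = find_orphan_snapshots(snapshots, ["db-1"])
```

Automated snapshots are skipped in this sketch because RDS manages their retention itself; whether the actual solution treats them the same way is not stated in the excerpt.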

Query RDF graphs using SPARQL and property graphs using Gremlin with the Amazon Athena Neptune connector

To query a Neptune database in Athena, you can use the Amazon Athena Neptune connector, an AWS Lambda function that connects to the Neptune cluster and queries the graph on behalf of Athena. In this post, we provide a step-by-step implementation guide to integrate the new version of the Athena Neptune connector and query a Neptune cluster using Gremlin and SPARQL queries.

Create an AWS Glue Data Catalog with AWS DMS

Businesses need near real-time access to the latest data and metadata available from many silos to perform analytics. AWS Glue is a serverless data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources for analytics, machine learning (ML), and application development. AWS Glue Data Catalog is a centralized […]

Visualize Ethereum ERC20 token data using Amazon Managed Blockchain Query and Amazon QuickSight

Businesses such as Paxos that issue stablecoin USD tokens want to identify common token metrics such as top holders, daily active users, daily volume, total number of holders, latest transfers, top decentralized finance (DeFi) protocols the tokens have been used on, and more. With Amazon Managed Blockchain (AMB) Query and Amazon […]

Archival solutions for Oracle database workloads in AWS: Part 1

This is a two-part series. In this post, we explain three archival solutions that allow you to archive Oracle data into Amazon Simple Storage Service (Amazon S3). In Part 2 of this series, we explain three archival solutions using native Oracle products and utilities. All of these options allow you to join current Oracle data with archived data.

Migrate an Informix database to Amazon Aurora PostgreSQL using CData Connect Cloud from within AWS Glue Studio

Amazon Aurora PostgreSQL-Compatible Edition is a fully managed PostgreSQL-compatible database engine running in AWS and is a drop-in replacement for PostgreSQL. Aurora PostgreSQL is cost-effective to set up, operate, and scale, and can be deployed for new or existing applications. Informix is a relational database management system from IBM and supports OLTP and other workloads. […]

Automate the migration of Microsoft SSIS packages to AWS Glue with AWS SCT

When you migrate Microsoft SQL Server workloads to AWS, you might want to automate migration and minimize changes to existing applications, but still use a cost-effective option without commercial licenses and reduce operational overhead. For example, SQL Server workloads often use SQL Server Integration Services (SSIS) to extract, transform, and load (ETL) data. In this […]