AWS Big Data Blog

Category: Database

Amazon Redshift Spectrum Extends Data Warehousing Out to Exabytes—No Loading Required

When we first looked into the possibility of building a cloud-based data warehouse many years ago, we were struck by the fact that our customers were storing ever-increasing amounts of data, and yet only a small fraction of that data ever made it into a data warehouse or Hadoop system for analysis. We saw that […]

Read More

Analysis of Top-N DynamoDB Objects using Amazon Athena and Amazon QuickSight

If you run an operation that continuously generates a large amount of data, you may want to know what kind of data is being inserted by your application. The ability to analyze data intake quickly can be very valuable for business units, such as operations and marketing. For many operations, it’s important to see what […]

Read More

Twelve Best Practices for Amazon Redshift Spectrum

Amazon Redshift Spectrum enables you to run Amazon Redshift SQL queries on data that is stored in Amazon S3. With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond the data that is stored natively in Amazon Redshift. Redshift Spectrum offers several capabilities that widen your possible implementation strategies. For example, it expands […]

Read More

Build a Healthcare Data Warehouse Using Amazon EMR, Amazon Redshift, AWS Lambda, and OMOP

In the healthcare field, data comes in all shapes and sizes. Despite efforts to standardize terminology, some concepts (e.g., blood glucose) are still often depicted in different ways. This post demonstrates how to convert an openly available dataset called MIMIC-III, which consists of de-identified medical data for about 40,000 patients, into an open source data […]

Read More

Near Zero Downtime Migration from MySQL to DynamoDB

Many companies consider migrating from relational databases like MySQL to Amazon DynamoDB, a fully managed, fast, highly scalable, and flexible NoSQL database service. For example, DynamoDB can increase or decrease capacity based on traffic, in accordance with business needs. The total cost of servicing can be optimized more easily than for the typical media-based RDBMS. […]

Read More

Manage Query Workloads with Query Monitoring Rules in Amazon Redshift

This blog post has been translated into Japanese and Chinese. Data warehousing workloads are known for high variability due to seasonality, potentially expensive exploratory queries, and the varying skill levels of SQL developers. To obtain high performance in the face of highly variable workloads, Amazon Redshift workload management (WLM) enables you to flexibly manage priorities and resource […]

Read More

Amazon Redshift Monitoring Now Supports End User Queries and Canaries

Ian Meyers is a Solutions Architecture Senior Manager with AWS The serverless Amazon Redshift Monitoring utility lets you gather important performance metrics from your Redshift cluster’s system tables and persists the results in Amazon CloudWatch. This serverless solution leverages AWS Lambda to schedule custom SQL queries and process the results. With this utility, you can use […]

Read More

Run Mixed Workloads with Amazon Redshift Workload Management

This blog post has been translated into Japanese.  Mixed workloads run batch and interactive workloads (short-running and long-running queries or reports) concurrently to support business needs or demand. Typically, managing and configuring mixed workloads requires a thorough understanding of access patterns, how the system resources are being used and performance requirements. It’s common for mixed […]

Read More

Converging Data Silos to Amazon Redshift Using AWS DMS

Organizations often grow organically—and so does their data in individual silos. Such systems are often powered by traditional RDBMS systems and they grow orthogonally in size and features. To gain intelligence across heterogeneous data sources, you have to join the data sets. However, this imposes new challenges, as joining data over dblinks or into a […]

Read More