AWS Big Data Blog

Category: Amazon Redshift

Enhance your security posture by storing Amazon Redshift admin credentials without human intervention using AWS Secrets Manager integration

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more. Today, tens of thousands of AWS customers—from Fortune 500 companies, startups, and everything in between—use Amazon Redshift to run mission-critical business intelligence (BI) dashboards, […]

Migrate Microsoft Azure Synapse Analytics to Amazon Redshift using AWS SCT

In this post, we show how to migrate a data warehouse from Microsoft Azure Synapse to Redshift Serverless using AWS Schema Conversion Tool (AWS SCT) and AWS SCT data extraction agents. AWS SCT makes heterogeneous database migrations predictable by automatically converting the source database code and storage objects to a format compatible with the target database.

Accelerate your data warehouse migration to Amazon Redshift – Part 7

In this post, we describe at a high-level how CDC tasks work in AWS SCT. Then we deep dive into an example of how to configure, start, and manage a CDC migration task. We look briefly at performance and how you can tune a CDC migration, and then conclude with some information about how you can get started on your own migration.

Non-JSON ingestion using Amazon Kinesis Data Streams, Amazon MSK, and Amazon Redshift Streaming Ingestion

Organizations are grappling with the ever-expanding spectrum of data formats in today’s data-driven landscape. From Avro’s binary serialization to the efficient and compact structure of Protobuf, the landscape of data formats has expanded far beyond the traditional realms of CSV and JSON. As organizations strive to derive insights from these diverse data streams, the challenge […]

Use the new SQL commands MERGE and QUALIFY to implement and validate change data capture in Amazon Redshift

Amazon Redshift has added many features to enhance analytical processing like ROLLUP, CUBE and GROUPING SETS, which were demonstrated in the post Simplify Online Analytical Processing (OLAP) queries in Amazon Redshift using new SQL constructs such as ROLLUP, CUBE, and GROUPING SETS. Amazon Redshift has recently added many SQL commands and expressions. In this post, we talk about two new SQL features, the MERGE command and QUALIFY clause, which simplify data ingestion and data filtering.

Accelerate Amazon Redshift secure data use with Satori – Part 1

This post is co-written by Lisa Levy, Content Specialist at Satori. Data democratization enables users to discover and gain access to data faster, improving informed data-driven decisions and using data to generate business impact. It also increases collaboration across teams and organizations, breaking down data silos and enabling cross-functional teams to work together more effectively. […]

Query your Iceberg tables in data lake using Amazon Redshift

Amazon Redshift supports querying a wide variety of data formats, such as CSV, JSON, Parquet, and ORC, and table formats like Apache Hudi and Delta. Amazon Redshift also supports querying nested data with complex data types such as struct, array, and map. With this capability, Amazon Redshift extends your petabyte-scale data warehouse to an exabyte-scale data lake on Amazon S3 in a cost-effective manner. Apache Iceberg is the latest table format that is supported by Amazon Redshift. In this post, we show you how to query Iceberg tables using Amazon Redshift, and explore Iceberg support and options.

Build an ETL process for Amazon Redshift using Amazon S3 Event Notifications and AWS Step Functions

In this post we discuss how we can build and orchestrate in a few steps an ETL process for Amazon Redshift using Amazon S3 Event Notifications for automatic verification of source data upon arrival and notification in specific cases. And we show how to use AWS Step Functions for the orchestration of the data pipeline. It can be considered as a starting point for teams within organizations willing to create and build an event driven data pipeline from data source to data warehouse that will help in tracking each phase and in responding to failures quickly. Alternatively, you can also use Amazon Redshift auto-copy from Amazon S3 to simplify data loading from Amazon S3 into Amazon Redshift.

Perform time series forecasting using Amazon Redshift ML and Amazon Forecast

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. Many businesses use different software tools to analyze historical data and past patterns to forecast future demand and trends to make more […]