AWS Big Data Blog

Category: Amazon Redshift

How Swisscom automated Amazon Redshift as part of their One Data Platform solution using AWS CDK – Part 2

In this series, we talk about Swisscom’s journey of automating Amazon Redshift provisioning as part of the Swisscom One Data Platform (ODP) solution using the AWS Cloud Development Kit (AWS CDK), and we provide code snippets and other useful references. In Part 1, we did a deep dive on provisioning a secure and compliant […]

How Swisscom automated Amazon Redshift as part of their One Data Platform solution using AWS CDK – Part 1

In this post, we deep dive into provisioning a secure and compliant Redshift cluster using the AWS CDK and discuss the best practices of secret rotation. We also explain how Swisscom used AWS CDK custom resources to automate the creation of dynamic user groups that are relevant for the AWS Identity and Access Management (IAM) roles matching different job functions.

Integrate Tableau and Okta with Amazon Redshift using AWS IAM Identity Center

This blog post is co-written with Sid Wray and Jake Koskela from Salesforce, and Adiascar Cisneros from Tableau. Amazon Redshift is a fast, scalable cloud data warehouse built to serve workloads at any scale. With Amazon Redshift as your data warehouse, you can run complex queries using sophisticated query optimization to quickly deliver results to […]

Migrate a petabyte-scale data warehouse from Actian Vectorwise to Amazon Redshift

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. It also helps you securely access your data in operational databases, data lakes, or third-party datasets with minimal movement or copying of data. Tens of thousands […]

Build a decentralized semantic search engine on heterogeneous data stores using autonomous agents

In this post, we show how to build a Q&A bot with Retrieval Augmented Generation (RAG). RAG uses data sources like Amazon Redshift and Amazon OpenSearch Service to retrieve documents that augment the LLM prompt. To get data from Amazon Redshift, we use Anthropic Claude 2.0 on Amazon Bedrock, summarizing the final response based on pre-defined prompt template libraries from LangChain. To get data from Amazon OpenSearch Service, we chunk the source data and convert the chunks to vectors using the Amazon Titan Text Embeddings model.

Understanding Apache Iceberg on AWS with the new technical guide

We’re excited to announce the launch of the Apache Iceberg on AWS technical guide. Whether you are new to Apache Iceberg on AWS or already running production workloads on AWS, this comprehensive technical guide offers detailed guidance, from foundational concepts to advanced optimizations, to help you build your transactional data lake with Apache Iceberg on AWS.

Figure 1 – Map built with CARTO Builder and the native support to visualize H3 indexes

Breaking barriers in geospatial: Amazon Redshift, CARTO, and H3

In this post, we discuss how Amazon Redshift spatial index functions based on the Hexagonal hierarchical geospatial indexing system (H3) can be used to represent spatial data for fast spatial lookups at scale. Navigating the vast landscape of data-driven insights has always been an exciting endeavor. As technology continues to evolve, one specific facet of this journey is reaching unprecedented proportions: geospatial data.

Achieve peak performance and boost scalability using multiple Amazon Redshift serverless workgroups and Network Load Balancer

As data analytics use cases grow, factors of scalability and concurrency become crucial for businesses. Your analytic solution architecture should be able to handle large data volumes at high concurrency and without compromising speed, thereby delivering a scalable high-performance analytics environment. Amazon Redshift Serverless provides a fully managed, petabyte-scale, auto scaling cloud data warehouse to […]

Revolutionizing data querying: Amazon Redshift and Visual Studio Code integration

In today’s data-driven landscape, the efficiency and accessibility of querying tools play a crucial role in driving businesses forward. Amazon Redshift recently announced integration with Visual Studio Code, an action that transforms the way data practitioners engage with Amazon Redshift and reshapes your interactions and practices in data management. This innovation not only unlocks […]

Orchestrate an end-to-end ETL pipeline using Amazon S3, AWS Glue, and Amazon Redshift Serverless with Amazon MWAA

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that you can use to set up and operate data pipelines in the cloud at scale. Apache Airflow is an open source tool used to programmatically author, schedule, and monitor sequences of processes and tasks, referred to as workflows. […]