AWS Big Data Blog

Category: Intermediate (200)

Configure monitoring, limits, and alarms in Amazon Redshift Serverless to keep costs predictable

Amazon Redshift Serverless makes it simple to run and scale analytics in seconds. It automatically provisions and intelligently scales data warehouse compute capacity to deliver fast performance, and you pay only for what you use. Just load your data and start querying right away in the Amazon Redshift Query Editor or in your favorite business […]

Query your Apache Hive metastore with AWS Lake Formation permissions

Apache Hive is a SQL-based data warehouse system for processing highly distributed datasets on the Apache Hadoop platform. There are two key components to Apache Hive: the Hive SQL query engine and the Hive metastore (HMS). The Hive metastore is a repository of metadata about the SQL tables, such as database names, table names, schema, […]

Dimensional modeling in Amazon Redshift

Amazon Redshift is a fully managed and petabyte-scale cloud data warehouse that is used by tens of thousands of customers to process exabytes of data every day to power their analytics workload. You can structure your data, measure business processes, and get valuable insights quickly can be done by using a dimensional model. Amazon Redshift […]

Migrate data from Google Cloud Storage to Amazon S3 using AWS Glue

Today, we are pleased to announce a new AWS Glue connector for Google Cloud Storage that allows you to move data bi-directionally between Google Cloud Storage and Amazon Simple Storage Service (Amazon S3). In this post, we go over how the new connector works, introduce the connector’s functions, and provide you with key steps to set it up. We provide you with prerequisites, share how to subscribe to this connector in AWS Marketplace, and describe how to create and run AWS Glue for Apache Spark jobs with it.

Automate secure access to Amazon MWAA environments using existing OpenID Connect single-sign-on authentication and authorization

Customers use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run Apache Airflow at scale in the cloud. They want to use their existing login solutions developed using OpenID Connect (OIDC) providers with Amazon MWAA; this allows them to provide a uniform authentication and single sign-on (SSO) experience using their adopted identity providers (IdP) […]

Introducing field-based coloring experience for Amazon QuickSight

Color plays a crucial role in visualizations. It conveys meaning, captures attention, and enhances aesthetics. You can quickly grasp important information when key insights and data points pop with color. However, it’s important to use color judiciously to enhance readability and ensure correct interpretation. Color should also be accessible and consistent to enable users to […]

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

Amazon Finance Automation (FinAuto) is the tech organization of Amazon Finance Operations (FinOps). Its mission is to enable FinOps to support the growth and expansion of Amazon businesses. It works as a force multiplier through automation and self-service, while providing accurate and on-time payments and collections. FinAuto has a unique position to look across FinOps […]

Configure end-to-end data pipelines with Etleap, Amazon Redshift, and dbt

This blog post is co-written with Zygimantas Koncius from Etleap. Organizations use their data to extract valuable insights and drive informed business decisions. With a wide array of data sources, including transactional databases, log files, and event streams, you need a simple-to-use solution capable of efficiently ingesting and transforming large volumes of data in real […]

Level up your React app with Amazon QuickSight: How to embed your dashboard for anonymous access

Using embedded analytics from Amazon QuickSight can simplify the process of equipping your application with functional visualizations without any complex development. There are multiple ways to embed QuickSight dashboards into application. In this post, we look at how it can be done using React and the Amazon QuickSight Embedding SDK. Dashboard consumers often don’t have […]

Getting started guide for near-real time operational analytics using Amazon Aurora zero-ETL integration with Amazon Redshift

November 2023: This post was reviewed and updated to include the latest enhancements in Aurora MySQL zero-ETL integration with Amazon Redshift on general availability (GA). Amazon Aurora zero-ETL integration with Amazon Redshift was announced at AWS re:Invent 2022 and is now generally available (GA) for Aurora MySQL 3.05.0 (compatible with MySQL 8.0.32) and higher version […]