AWS Big Data Blog

Normalize data with Amazon Elasticsearch Service ingest pipelines

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. Amazon OpenSearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost-effectively at scale. Search and log analytics are the two most popular use cases for Amazon OpenSearch Service. In log analytics […]

Enabling Amazon QuickSight federation with Microsoft Entra ID (formerly Azure AD)

June 2025: This post was reviewed and updated for accuracy. As of August 2023, Amazon QuickSight is an AWS IAM Identity Center enabled application. This capability allows administrators who subscribe to QuickSight to use IAM Identity Center to enable their users to log in with Azure AD and other external identity providers. For more […]

Federating single sign-on access to your Amazon Redshift cluster with PingIdentity

Single sign-on (SSO) gives users a seamless experience while accessing various applications in the organization. If you’re responsible for setting up security and database access privileges for users and tasked with enabling SSO for Amazon Redshift, you can set up SSO authentication using ADFS, PingIdentity, Okta, Azure AD, or other SAML browser […]

How Cookpad scaled its Amazon Redshift cluster while controlling costs with usage limits

This is a guest post by Shimpei Kodama, data engineer at Cookpad Inc. Cookpad is a tech company that builds a community platform where people share recipe ideas and cooking tips. The company’s mission is to “make everyday cooking fun.” It’s one of the largest recipe-sharing platforms in Japan with over 50 million users per […]

Automating bucketing of streaming data using Amazon Athena and AWS Lambda

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. In today’s world, data plays a vital role in helping businesses understand and improve their processes and services to reduce cost. You can use several tools to […]

Best practices using AWS SCT and AWS Snowball to migrate from Teradata to Amazon Redshift

This is a guest post from ZS. In their own words, “ZS is a professional services firm that works closely with companies to help develop and deliver products and solutions that drive customer value and company results. ZS engagements involve a blend of technology, consulting, analytics, and operations, and are targeted toward improving the commercial […]

Bringing the power of embedded analytics to your apps and services with Amazon QuickSight

Today, companies need to react quickly to change, and to anticipate it. Customers tell us that their reliance on data has never been greater than it is today. To improve your decision-making, you have two types of data transformation needs: data agility, the speed at which data turns into […]

Building an AWS Glue ETL pipeline locally without an AWS account

This blog was last reviewed in May 2022. If you’re new to AWS Glue and looking to understand its transformation capabilities without incurring an added expense, or if you’re simply wondering if AWS Glue ETL is the right tool for your use case and want a holistic view of AWS Glue ETL functions, then please continue […]

How to delete user data in an AWS data lake

The General Data Protection Regulation (GDPR) is an important aspect of today’s technology world, and processing data in compliance with GDPR is a necessity for those who implement solutions within the AWS public cloud. One article of GDPR is the “right to erasure” or “right to be forgotten,” which may require you to implement a solution […]