AWS Big Data Blog

solution architecture and user flow.

Manage users and group memberships on Amazon QuickSight using SCIM events generated in IAM Identity Center with Azure AD

Amazon QuickSight is cloud-native, scalable business intelligence (BI) service that supports identity federation. AWS Identity and Access Management (IAM) allows organizations to use the identities managed in their enterprise identity provider (IdP) and federate single sign-on (SSO) to QuickSight. As more organizations are building centralized user identity stores with all their applications, including on-premises apps, […]

How AWS Payments migrated from Redash to Amazon Redshift Query Editor v2

AWS Payments is part of the AWS Commerce Platform (CP) organization that owns the customer experience of paying AWS invoices. It helps AWS customers manage their payment methods and payment preferences, and helps customers make self-service payments to AWS. The Machine Learning, Data and Analytics (MLDA) team at AWS Payments enables data-driven decision-making across payments […]

Accelerating revenue growth with real-time analytics: Poshmark’s journey

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. This post was co-written by Mahesh Pasupuleti and Gaurav Shah from Poshmark. Poshmark is a leading social marketplace for new and secondhand styles for women, men, kids, […]

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

In the first post of this series, we described how AWS Glue for Apache Spark works with Apache Hudi, Linux Foundation Delta Lake, and Apache Iceberg datasets tables using the native support of those data lake formats. This native support simplifies reading and writing your data for these data lake frameworks so you can more […]

Extend geospatial queries in Amazon Athena with UDFs and AWS Lambda

Amazon Athena is a serverless and interactive query service that allows you to easily analyze data in Amazon Simple Storage Service (Amazon S3) and 25-plus data sources, including on-premises data sources or other cloud systems using SQL or Python. Athena built-in capabilities include querying for geospatial data; for example, you can count the number of […]

Amazon QuickSight helps TalentReef empower its customers to make more informed hiring decisions

This post is co-written with Alexander Plumb, Product Manager at Mitratech. TalentReef, now part of Mitratech, is a talent management platform purpose-built for location-based, high-volume hiring. TalentReef was acquired by Mitratech in August 2022 with the goal to combine TalentReef’s best-in-class systems with Mitratech’s expertise, technology, and global platform to ensure their customers’ hiring needs […]

How SafetyCulture scales unpredictable dbt Cloud workloads in a cost-effective manner with Amazon Redshift

This post is co-written by Anish Moorjani, Data Engineer at SafetyCulture. SafetyCulture is a global technology company that puts the power of continuous improvement into everyone’s hands. Its operations platform unlocks the power of observation at scale, giving leaders visibility and workers a voice in driving quality, efficiency, and safety improvements. Amazon Redshift is a […]

How Infomedia built a serverless data pipeline with change data capture using AWS Glue and Apache Hudi

This is a guest post co-written with Gowtham Dandu from Infomedia. Infomedia Ltd (ASX:IFM) is a leading global provider of DaaS and SaaS solutions that empowers the data-driven automotive ecosystem. Infomedia’s solutions help OEMs, NSCs, dealerships and 3rd party partners manage the vehicle and customer lifecycle. They are used by over 250,000 industry professionals, across […]

Accelerate data insights with Elastic and Amazon Kinesis Data Firehose

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. This is a guest post co-written with Udayasimha Theepireddy from Elastic. Processing and analyzing log and Internet of Things (IoT) data can be challenging, especially when dealing with large volumes of real-time […]

Role-based access control in Amazon OpenSearch Service via SAML integration with AWS IAM Identity Center

Amazon OpenSearch Service is a managed service that makes it simple to secure, deploy, and operate OpenSearch clusters at scale in the AWS Cloud. AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. To build a […]