AWS Big Data Blog

Enable federation to Amazon QuickSight with automatic provisioning of users between AWS IAM Identity Center and Microsoft Azure AD

Organizations are working towards centralizing their identity and access strategy across all their applications, including on-premises, third-party, and applications on AWS. Many organizations use identity providers (IdPs) based on OIDC or SAML-based protocols like Microsoft Azure Active Directory (Azure AD) and manage user authentication along with authorization centrally. This authorizes users to access Amazon QuickSight […]

Perform multi-cloud analytics using Amazon QuickSight, Amazon Athena Federated Query, and Microsoft Azure Synapse

In this post, we show how to use Amazon QuickSight and Amazon Athena Federated Query to build dashboards and visualizations on data that is stored in Microsoft Azure Synapse databases. Organizations today use data stores that are best suited for the applications they build. Additionally, they may also continue to use some of their legacy […]

Explore your data lake using Amazon Athena for Apache Spark

Amazon Athena now enables data analysts and data engineers to enjoy the easy-to-use, interactive, serverless experience of Athena with Apache Spark in addition to SQL. You can now use the expressive power of Python and build interactive Apache Spark applications using a simplified notebook experience on the Athena console or through Athena APIs. For interactive […]

Introducing the Cloud Shuffle Storage Plugin for Apache Spark

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. In AWS Glue, you can use Apache Spark, an open-source, distributed processing system for your data integration tasks and big data workloads. Apache Spark utilizes in-memory caching and optimized […]

Centrally manage access and permissions for Amazon Redshift data sharing with AWS Lake Formation

Today’s global, data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. Amazon Redshift data sharing allows you to securely share live, transactionally consistent data in one Amazon Redshift data warehouse with another Amazon Redshift data warehouse within the same AWS […]

Data: The genesis for modern invention

It only takes one groundbreaking invention—one iconic idea that solves a widespread pain point for customers—to create or transform an industry forever. From the invention of the telegraph, to the discovery of GPS, to the earliest cloud computing services, history is filled with examples of these “eureka” moments that continue to have long-lasting impacts on […]

Log analytics the easy way with Amazon OpenSearch Serverless

We recently announced the preview release of Amazon OpenSearch Serverless, a new serverless option for Amazon OpenSearch Service, which makes it easy for you to run large-scale search and analytics workloads without having to configure, manage, or scale OpenSearch clusters. It automatically provisions and scales the underlying resources to deliver fast data ingestion and query […]

New analytical questions available in Amazon QuickSight Q: “Why” and “Forecast”

Amazon QuickSight Q uses machine learning (ML) to enable any user to ask questions about business data in natural language and receive accurate answers with relevant visualizations in seconds. Today, Amazon QuickSight announces support for two new question types that simplify and scale complex analytical tasks using natural language: “forecast” and “why.” In this post, […]

Simplify data loading on the Amazon Redshift console with Informatica Data Loader

Amazon Redshift is a fast, petabyte-scale cloud data warehouse delivering the best price–performance. Tens of thousands of customers use Amazon Redshift to process exabytes of data every day to power their analytics workloads. Data engineers, data analysts, and data scientists want to use this data to power analytics workloads such as business intelligence (BI), predictive […]

Run queries concurrently and see query history using Amazon Redshift Query Editor v2

Amazon Redshift is a fast, fully managed, petabyte-scale cloud data warehouse. You have the flexibility to choose from provisioned and serverless compute modes. You can start loading and querying large datasets conveniently in Amazon Redshift using Amazon Redshift Query Editor v2, a web-based SQL client application. Query Editor v2 empowers your technical and business teams […]