AWS Big Data Blog

Category: Amazon SageMaker Unified Studio

AWS analytics at re:Invent 2025: Unifying Data, AI, and governance at scale

re:Invent 2025 showcased the bold Amazon Web Services (AWS) vision for the future of analytics, one where data warehouses, data lakes, and AI development converge into a seamless, open, intelligent platform, with Apache Iceberg compatibility at its core. Across more than 18 major announcements spanning three weeks, AWS demonstrated how organizations can break down data silos, […]

Unifying governance and metadata across Amazon SageMaker Unified Studio and Atlan

In this post, we show you how to unify governance and metadata across Amazon SageMaker Unified Studio and Atlan through a comprehensive bidirectional integration. You’ll learn how to deploy the necessary AWS infrastructure, configure secure connections, and set up automated synchronization to maintain consistent metadata across both platforms.

How Bayer transforms Pharma R&D with a cloud-based data science ecosystem using Amazon SageMaker

In this post, we discuss how Bayer AG used the next generation of Amazon SageMaker to build a cloud-based Pharma R&D Data Science Ecosystem (DSE) that unified data ingestion, storage, analytics, and AI/ML workflows.

Orchestrating data processing tasks with a serverless visual workflow in Amazon SageMaker Unified Studio

In this post, we show how to use the new visual workflow experience in SageMaker Unified Studio IAM-based domains to orchestrate an end-to-end machine learning workflow. The workflow ingests weather data, applies transformations, and generates predictions—all through a single, intuitive interface, without writing any orchestration code.

Cross-account lakehouse governance with Amazon S3 Tables and SageMaker Catalog

In this post, we walk you through a practical solution for secure, efficient cross-account data sharing and analysis. You’ll learn how to set up cross-account access to S3 Tables using federated catalogs in Amazon SageMaker, perform unified queries across accounts with Amazon Athena in Amazon SageMaker Unified Studio, and implement fine-grained access controls at the column level using AWS Lake Formation.

Enhanced search with match highlights and explanations in Amazon SageMaker

Amazon SageMaker now enhances search results in Amazon SageMaker Unified Studio with additional context that improves transparency and interpretability. The capability introduces inline highlighting for matched terms and an explanation panel that details where and how each match occurred across metadata fields such as name, description, glossary, and schema. In this post, we demonstrate how to use enhanced search in Amazon SageMaker.

Use trusted identity propagation for Apache Spark interactive sessions in Amazon SageMaker Unified Studio

In this post, we provide step-by-step instructions to set up Amazon EMR on EC2, EMR Serverless, and AWS Glue within SageMaker Unified Studio, enabled with trusted identity propagation. We use the setup to illustrate how different IAM Identity Center users can run their Spark sessions, using each compute setup, within the same project in SageMaker Unified Studio. We show how each user sees only the tables, or parts of tables, that they’ve been granted access to in Lake Formation.

Accelerate data governance with custom subscription workflows in Amazon SageMaker

Organizations need to efficiently manage data assets while maintaining governance controls in their data marketplaces. Although manual approval workflows remain important for sensitive datasets and production systems, there’s an increasing need for automated approval processes for less sensitive datasets. In this post, we show you how to automate subscription request approvals within SageMaker, accelerating data access for data consumers.