AWS Big Data Blog

Category: Amazon SageMaker Unified Studio

How Swiss Life Germany automated data governance and collaboration with Amazon SageMaker

Swiss Life Germany, a leading provider of customized pension products with over 100 years of experience, recently transitioned from legacy on-premises infrastructure to a modern cloud architecture. To enable secure data sharing and cross-departmental collaboration in this regulated environment, they implemented Amazon SageMaker with a custom Terraform pattern. This post demonstrates how Swiss Life Germany aligned SageMaker’s agility with their rigorous infrastructure as code standards, providing a blueprint for platform engineers and data architects in highly regulated enterprises.

Implement a data mesh pattern in Amazon SageMaker Catalog without changing applications

In this post, we walk through simulating a scenario based on data producer and data consumer that exists before Amazon SageMaker Catalog adoption. We use a sample dataset to simulate existing data and an existing application using an AWS Lambda function, then implement a data mesh pattern using Amazon SageMaker Catalog while keeping your current data repositories and consumer applications unchanged.

Using Amazon SageMaker Unified Studio Identity center (IDC) and IAM-based domains together

In this post, we demonstrate how to access an Amazon SageMaker Unified Studio IDC-based domain with a new IAM-based domain using role reuse and attribute-based access control.

This architecture diagram illustrates a comprehensive, end-to-end data processing pipeline built on AWS services, orchestrated through Amazon SageMaker Unified Studio. The pipeline demonstrates best practices for data ingestion, transformation, quality validation, advanced processing, and analytics.

Orchestrate end-to-end scalable ETL pipeline with Amazon SageMaker workflows

This post explores how to build and manage a comprehensive extract, transform, and load (ETL) pipeline using SageMaker Unified Studio workflows through a code-based approach. We demonstrate how to use a single, integrated interface to handle all aspects of data processing, from preparation to orchestration, by using AWS services including Amazon EMR, AWS Glue, Amazon Redshift, and Amazon MWAA. This solution streamlines the data pipeline through a single UI.

Build a trusted foundation for data and AI using Alation and Amazon SageMaker Unified Studio

The Alation and SageMaker Unified Studio integration helps organizations bridge the gap between fast analytics and ML development and the governance requirements most enterprises face. By cataloging metadata from SageMaker Unified Studio in Alation, you gain a governed, discoverable view of how assets are created and used. In this post, we demonstrate who benefits from this integration, how it works, the specific metadata it synchronizes, and provide a complete deployment guide for your environment.

Get started faster with one-click onboarding, serverless notebooks, and AI agents in Amazon SageMaker Unified Studio

Using Amazon SageMaker Unified Studio serverless notebooks, AI-assisted development, and unified governance, you can speed up your data and AI workflows across data team functions while maintaining security and compliance. In this post, we walk you through how these new capabilities in SageMaker Unified Studio can help you consolidate your fragmented data tools, reduce time to insight, and collaborate across your data teams.

Power up your analytics with Amazon SageMaker Unified Studio integration with Tableau, Power BI, and more

In this post, we guide you through connecting various analytics tools to Amazon SageMaker Unified Studio using the Athena JDBC driver, enabling seamless access to your subscribed data within your Amazon SageMaker Unified Studio projects.

Navigating architectural choices for a lakehouse using Amazon SageMaker

Over time, several distinct lakehouse approaches have emerged. In this post, we show you how to evaluate and choose the right lakehouse pattern for your needs. A lakehouse architecture isn’t about choosing between a data lake and a data warehouse. Instead, it’s an approach to interoperability where both frameworks coexist and serve different purposes within a unified data architecture. By understanding fundamental storage patterns, implementing effective catalog strategies, and using native storage capabilities, you can build scalable, high-performance data architectures that support both your current analytics needs and future innovation.

Use Amazon SageMaker custom tags for project resource governance and cost tracking

Amazon SageMaker announced a new feature that you can use to add custom tags to resources created through an Amazon SageMaker Unified Studio project. This helps you enforce tagging standards that conform to your organization’s service control policies (SCPs) and helps enable cost tracking reporting practices on resources created across the organization. In this post, we look at use cases for custom tags and how to use the AWS Command Line Interface (AWS CLI) to add tags to project resources.