Automate a centralized deployment of Amazon SageMaker Studio with AWS Service Catalog

This post outlines the best practices for provisioning Amazon SageMaker Studio for data science teams and provides reference architectures and AWS CloudFormation templates to help you get started. We use AWS Service Catalog to provision a Studio domain and users. AWS Service Catalog lets you provision these resources centrally, so individual users don't need Amazon SageMaker access policies to provision Studio themselves.

SageMaker is a fully managed service that provides every machine learning (ML) developer and data scientist with the ability to build, train, and deploy ML models quickly. Studio is a web-based integrated development environment (IDE) for ML that lets you build, train, debug, deploy, and monitor your ML models. Studio provides all the tools that you need to take your models from experimentation to production while boosting your productivity. You can write code, track experiments, visualize data, and debug and monitor within a single, integrated visual interface.

Studio supports three authentication modes: AWS Identity and Access Management (IAM), federated single sign-on (SSO), and AWS Single Sign-On (AWS SSO). The steps outlined in this post apply to IAM and federated SSO modes only. To provision Studio using AWS SSO, see Onboard to Amazon SageMaker Studio Using AWS SSO.

Studio key components

Let’s start by looking at the key components within Studio that we need to consider while provisioning:

Domain – A primary component of Studio is a domain. The domain consists of a list of authorized users (called user profiles), and configurations such as Amazon Virtual Private Cloud (Amazon VPC) configurations and the default IAM execution role.
User profile – The user profile is a configuration for a user that exists in the SageMaker domain. The user profile defines various configuration settings for the user, such as the execution role and default app specifications.
Execution role – The IAM execution role is the primary role that is assumed by the users and the service on behalf of the user to perform certain actions and provision resources in Studio.
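
To make the relationship between these components concrete, the following Python sketch builds the request parameters that the SageMaker CreateDomain and CreateUserProfile APIs accept. The templates in this post create these resources through CloudFormation instead; the helper functions and all names, IDs, and ARNs here are hypothetical placeholders.

```python
def domain_request(domain_name, vpc_id, subnet_ids, execution_role_arn):
    """Build keyword arguments for the SageMaker CreateDomain API (IAM auth mode)."""
    return {
        "DomainName": domain_name,
        "AuthMode": "IAM",
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        # Default execution role for every user profile in the domain.
        "DefaultUserSettings": {"ExecutionRole": execution_role_arn},
    }


def user_profile_request(domain_id, user_profile_name, execution_role_arn=None):
    """Build keyword arguments for the SageMaker CreateUserProfile API."""
    kwargs = {"DomainId": domain_id, "UserProfileName": user_profile_name}
    if execution_role_arn:
        # A per-user execution role overrides the domain default.
        kwargs["UserSettings"] = {"ExecutionRole": execution_role_arn}
    return kwargs
```

With boto3 installed and credentials configured, these dictionaries could be passed as boto3.client("sagemaker").create_domain(**domain_request(...)) and create_user_profile(**user_profile_request(...)).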

Data science user personas

The following section describes two different personas that interact with Studio resources and the level of access they need to fulfill their duties. We use this as a high-level requirement to model IAM roles and policies to establish desired controls based on resource ownership at the team and user level.

Cloud admin – This team is responsible for building and maintaining the infrastructure for supporting ML services, such as provisioning notebooks for data scientists to use, creating Amazon Simple Storage Service (Amazon S3) buckets for storing data, managing costs for ML from various lines of business (LOBs), and more.
Data scientist – Data scientists within an AI center of excellence (COE) or embedded within the LOBs are responsible for building, training, and deploying models. In regulated industries, data scientists need to adhere to the organization’s security boundaries, such as using S3 buckets for data access, using private networking for accessing APIs, committing code to source control, ensuring all their experiments and trials are properly logged, enforcing encryption of data in transit, and monitoring deployed models.

Architecture overview

The following architecture diagram shows the necessary permissions and flow for the two user personas, cloud admin and data scientist.

The architecture has the following workflow:

The cloud admin signs in to the AWS Management Console with the cloud admin role. This role enables them to access AWS Service Catalog to launch products (for this post, a Studio domain and user profile).
The provisioning role is assumed by AWS Service Catalog and has the permissions to create a Studio domain and user profile.
The data scientist signs in to the console with the data scientist role. The role has permissions to create a pre-signed URL to enable the data scientist to log in to Studio.
Studio assumes a Studio execution role as defined in the data scientists’ user profile, which has the necessary permissions to create or launch a JupyterServer app to complete the user login process.
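
Two of the roles in this workflow are assumed by AWS services rather than by people, which is expressed through each role's trust policy. As a rough sketch (not the exact documents from the sample templates), a launch role for AWS Service Catalog products trusts servicecatalog.amazonaws.com, and a Studio execution role trusts sagemaker.amazonaws.com. The helper name below is hypothetical.

```python
import json


def service_trust_policy(service_principal):
    """Return an IAM trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Provisioning (launch) role: assumed when a Service Catalog product is launched.
PROVISIONING_ROLE_TRUST = service_trust_policy("servicecatalog.amazonaws.com")
# Studio execution role: assumed by SageMaker on behalf of the Studio user.
EXECUTION_ROLE_TRUST = service_trust_policy("sagemaker.amazonaws.com")

if __name__ == "__main__":
    print(json.dumps(PROVISIONING_ROLE_TRUST, indent=2))
```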

Prerequisites

Before you get started, complete the following prerequisites:

Create a VPC and subnet ahead of time.
Make sure a Studio domain isn’t already set up in the Region you’re working in, because currently you can have only one domain per Region. The templates in this post help you create a Studio domain from scratch. If you already have a domain and want to keep it, you can skip the domain provisioning step. To create a new domain instead, first delete the existing domain in that Region.

Create the cloud admin role

The cloud admin role needs full access to AWS Service Catalog, but not SageMaker. When you create the role, make sure that the AWSServiceCatalogAdminFullAccess managed policy is attached. When cloud admins initiate product provisioning from AWS Service Catalog, the provisioning role is assumed. You don’t need to create the provisioning role; it’s automatically provisioned by our sample templates along with the AWS Service Catalog products (which we discuss in the next section).
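
If you prefer to script this step, the following sketch shows one possible way to create the role with the IAM CreateRole and AttachRolePolicy APIs. It assumes the role should be assumable by principals in your own account (the account-root trust policy is our assumption, not something the post prescribes), and the function name and arguments are hypothetical. The iam argument is expected to be a boto3 IAM client.

```python
import json

# AWS managed policy that grants full admin access to AWS Service Catalog.
SERVICE_CATALOG_ADMIN_POLICY_ARN = (
    "arn:aws:iam::aws:policy/AWSServiceCatalogAdminFullAccess"
)


def create_cloud_admin_role(iam, role_name, account_id):
    """Create the cloud admin role and attach the Service Catalog admin policy.

    `iam` is a boto3 IAM client, for example boto3.client("iam").
    """
    # Assumption: any principal in this account that is allowed to call
    # sts:AssumeRole on this role may assume it. Tighten as needed.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    # No SageMaker policies are attached: the cloud admin only needs
    # AWS Service Catalog access.
    iam.attach_role_policy(
        RoleName=role_name,
        PolicyArn=SERVICE_CATALOG_ADMIN_POLICY_ARN,
    )
    return role["Role"]["Arn"]
```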

Populate AWS Service Catalog products

AWS Service Catalog allows you to create and manage a catalog of services to be provisioned under an AWS account. You can use a CloudFormation template to define how to provision a specific service and release it as an AWS Service Catalog product. Products can also be organized into portfolios.

After the product has been populated to AWS Service Catalog, users (with proper access rights) can provision these products self-service. The user doesn’t need access to the service being provisioned, because you can set up a separate execution role that AWS Service Catalog uses for provisioning. The user just needs access to the AWS Service Catalog products.

In this post, we use AWS Service Catalog to provision Studio and onboard Studio users. We provide the underlying CloudFormation templates for SageMaker products as well as a launch template to populate the SageMaker-related products into AWS Service Catalog. You can find these templates in the GitHub repo.
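
The actual launch template lives in the GitHub repo. To give a feel for what such a template contains, the following Python sketch assembles a minimal CloudFormation template with a portfolio, one product, a principal association, and a launch role constraint. The logical IDs, display names, and function arguments are hypothetical, and the real template may be structured differently.

```python
import json


def catalog_template(product_template_url, launch_principal_arn, launch_role_arn):
    """Build a minimal CloudFormation template that publishes one product."""
    portfolio = {"Ref": "Portfolio"}
    product = {"Ref": "StudioDomainProduct"}
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "Portfolio": {
                "Type": "AWS::ServiceCatalog::Portfolio",
                "Properties": {
                    "DisplayName": "SageMaker Studio",  # placeholder name
                    "ProviderName": "Cloud admin team",  # placeholder name
                },
            },
            "StudioDomainProduct": {
                "Type": "AWS::ServiceCatalog::CloudFormationProduct",
                "Properties": {
                    "Name": "SageMaker Studio domain",
                    "Owner": "Cloud admin team",
                    "ProvisioningArtifactParameters": [
                        {
                            "Name": "v1",
                            # URL of the product's own CloudFormation template.
                            "Info": {"LoadTemplateFromURL": product_template_url},
                        }
                    ],
                },
            },
            "ProductAssociation": {
                "Type": "AWS::ServiceCatalog::PortfolioProductAssociation",
                "Properties": {"PortfolioId": portfolio, "ProductId": product},
            },
            # Grants the launch principal (for example, the cloud admin role)
            # access to the portfolio.
            "PrincipalAssociation": {
                "Type": "AWS::ServiceCatalog::PortfolioPrincipalAssociation",
                "Properties": {
                    "PortfolioId": portfolio,
                    "PrincipalARN": launch_principal_arn,
                    "PrincipalType": "IAM",
                },
            },
            # Tells Service Catalog which role to assume when provisioning.
            "LaunchConstraint": {
                "Type": "AWS::ServiceCatalog::LaunchRoleConstraint",
                "DependsOn": "ProductAssociation",
                "Properties": {
                    "PortfolioId": portfolio,
                    "ProductId": product,
                    "RoleArn": launch_role_arn,
                },
            },
        },
    }


def to_json(template):
    """Serialize the template so it can be saved or uploaded."""
    return json.dumps(template, indent=2)
```

The launch role constraint is what allows a user with only AWS Service Catalog access to provision a product that creates SageMaker resources.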

Complete the following steps to run the launch template, which populates the AWS Service Catalog products:

Choose Launch Stack:

By default, the template launches in us-west-2, but you can switch to another Region before starting the template.

Choose Next.
For LaunchPrincipal, enter the ARN of the user, group, or role to whom you want to grant provisioning access to the AWS Service Catalog products. For this post, we enter the cloud admin role, but you can enter the ARN of the cloud admin user directly instead.

Users who assume this role have access to initiate product provisioning.

Choose Next.

Select the check box to acknowledge that the template will create IAM resources.
Choose Create stack.
On the AWS Service Catalog console, under Administration in the navigation pane, choose Products.
Confirm that the Studio products have been provisioned.

Provision a Studio domain

To provision your Studio domain, complete the following steps:

On the AWS Service Catalog console, under Administration in the navigation pane, choose Products.
Select your product and choose Launch product.

Enter a name for the product, your VPC, and subnet IDs used for SageMaker communication.
Choose Launch product.
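
The console steps above can also be scripted with the AWS Service Catalog ProvisionProduct API. The following is a minimal sketch, assuming a boto3 Service Catalog client is passed in. The parameter keys VPCId and SubnetIds are hypothetical; check the product's template for the real parameter names before using this.

```python
def launch_studio_domain(servicecatalog, product_name, artifact_name,
                         provisioned_name, vpc_id, subnet_ids):
    """Launch the Studio domain product and return the provisioning record ID.

    `servicecatalog` is a boto3 client, for example
    boto3.client("servicecatalog").
    """
    response = servicecatalog.provision_product(
        ProductName=product_name,
        ProvisioningArtifactName=artifact_name,  # product version, e.g. "v1"
        ProvisionedProductName=provisioned_name,
        ProvisioningParameters=[
            # Hypothetical parameter keys; the real template may differ.
            {"Key": "VPCId", "Value": vpc_id},
            {"Key": "SubnetIds", "Value": ",".join(subnet_ids)},
        ],
    )
    # Provisioning is asynchronous; the record ID can be polled with
    # describe_record to follow progress.
    return response["RecordDetail"]["RecordId"]
```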

Provision SageMaker user profiles

After you create your Studio domain, you can start provisioning your Studio users. To provision a new user profile via AWS Service Catalog, complete the following steps:

On the SageMaker console, choose Amazon SageMaker Studio.
On the Studio Control Panel, note the domain ID of your new Studio domain.

On the AWS Service Catalog console, launch the SageMaker user profile product.
Enter the Studio user profile name and SageMaker domain ID.
Choose Launch product.
On the Studio console, check that the new user profile has been created.
Choose Open Studio to launch Studio.
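
If you want to look up the domain ID and verify the new user profile without the console, the SageMaker ListDomains and DescribeUserProfile APIs return the same information. The following sketch assumes a boto3 SageMaker client is passed in; the function names are hypothetical.

```python
def find_domain_id(sagemaker):
    """Return the ID of the first Studio domain in the client's Region, or None.

    `sagemaker` is a boto3 client, for example boto3.client("sagemaker").
    """
    domains = sagemaker.list_domains()["Domains"]
    return domains[0]["DomainId"] if domains else None


def user_profile_ready(sagemaker, domain_id, user_profile_name):
    """Return True if the user profile exists and has finished creating."""
    profile = sagemaker.describe_user_profile(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
    )
    return profile["Status"] == "InService"
```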

For step-by-step instructions on provisioning the Studio domain and user profiles, check out the following videos:

Provide access to users

The data scientist role needs permission to create a pre-signed URL that enables users to log in to Studio. When you provision the domain, the template also creates the data scientist policy with the pre-signed URL access. This policy can be attached either to your data scientist role or directly to the IAM users.

You don’t need additional SageMaker access policies because Studio assumes the execution role on your behalf. In our example CloudFormation templates, we provision this role for you, but you can customize based on your needs.

You can also restrict access to just those user profiles that are assigned to specific users via tags. For more information, see Configuring Amazon SageMaker Studio for teams and groups with complete resource isolation.
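
As an illustration of what the pre-signed URL access involves, the following sketch builds an IAM policy that allows sagemaker:CreatePresignedDomainUrl on a single user profile and then requests a login URL through the CreatePresignedDomainUrl API. This is not the policy from the sample templates, which may be scoped differently; the function names are hypothetical.

```python
def presigned_url_policy(region, account_id, domain_id, user_profile_name):
    """Build an IAM policy that allows Studio login for one user profile."""
    resource = (
        f"arn:aws:sagemaker:{region}:{account_id}:"
        f"user-profile/{domain_id}/{user_profile_name}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:CreatePresignedDomainUrl",
                # Scoping to one user profile ARN keeps users out of
                # each other's Studio environments.
                "Resource": resource,
            }
        ],
    }


def studio_login_url(sagemaker, domain_id, user_profile_name):
    """Request a pre-signed Studio URL; `sagemaker` is a boto3 client."""
    response = sagemaker.create_presigned_domain_url(
        DomainId=domain_id,
        UserProfileName=user_profile_name,
    )
    return response["AuthorizedUrl"]
```

The returned URL is short-lived, so it is meant to be opened right away rather than stored.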

Clean up resources

You can use AWS Service Catalog to delete your Studio domain and user profiles.

On the AWS Service Catalog console, choose Provisioned products in the navigation pane.
Select the provisioned product to delete.
On the Actions menu, choose Terminate.
Choose Terminate provisioned product to confirm.
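
The same cleanup can be sketched with the AWS Service Catalog TerminateProvisionedProduct API. The function name is hypothetical, and a boto3 Service Catalog client is assumed to be passed in. Because a Studio domain generally can't be deleted while it still contains user profiles, it is safest to terminate the user profile products before the domain product.

```python
import uuid


def terminate_product(servicecatalog, provisioned_product_name):
    """Terminate one provisioned product and return the record status.

    `servicecatalog` is a boto3 client, for example
    boto3.client("servicecatalog").
    """
    response = servicecatalog.terminate_provisioned_product(
        ProvisionedProductName=provisioned_product_name,
        # Idempotency token: a retry with the same token does not start
        # a second termination.
        TerminateToken=str(uuid.uuid4()),
    )
    return response["RecordDetail"]["Status"]
```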

Conclusion

In this post, we demonstrated how you can provision Studio and onboard your Studio users via AWS Service Catalog, which provides better governance. We also demonstrated how to decouple the cloud admin role from the data scientist role. Cloud admins can provision new resources, but they don’t need any SageMaker access. Data scientists need SageMaker access, but they can’t provision new user profiles or Studio domains. This separation of roles leads to better separation of concerns, governance, and security. You can find these templates in the GitHub repo.

About the Authors

Andras Garzo is a Solutions Architect in the ML Migration team, helping customers adopt Amazon SageMaker, save costs, and make their ML workload more performant.

Sam Palani is an AI/ML Specialist Solutions Architect at AWS. He enjoys working with customers to help them architect machine learning solutions at scale. When not helping customers, he enjoys reading and exploring the outdoors.

Rama Thamman is a Software Development Manager with the AI Platforms team, leading the ML Migrations team.
