AWS Machine Learning Blog
Secure access to Amazon SageMaker Studio with AWS SSO and a SAML application
Cloud security at AWS is the highest priority. Amazon SageMaker Studio offers various mechanisms to protect your data and code using integration with AWS security services like AWS Identity and Access Management (IAM), AWS Key Management Service (AWS KMS), or network isolation with Amazon Virtual Private Cloud (Amazon VPC).
Customers in highly regulated industries, like financial services, can set up Studio in VPC only mode to enable network isolation and disable internet access from Studio notebooks. You can use IAM integration with Studio to control which users have access to resources like Studio notebooks, the Studio IDE, or Amazon SageMaker training jobs.
A popular use case is to restrict access to the Studio IDE to only users from inside a specified network CIDR range or a designated VPC. You can achieve this by implementing IAM identity-based SageMaker policies and attaching those policies to the IAM users or groups that require those permissions. However, the SageMaker domain must be configured with IAM authentication mode, because the IAM identity-based policies aren’t supported in AWS Single Sign-On (SSO) authentication mode.
Many customers use AWS SSO to enable centralized workforce identity control and provide a consistent user sign-in experience. This post shows how to implement this use case while keeping AWS SSO capabilities to access Studio.
Solution overview
When you set up a SageMaker domain in VPC-only mode and specify the subnets and security groups, SageMaker creates elastic network interfaces (ENIs) that are associated with your security groups in the specified subnets. ENIs allow your training containers to connect to resources in your VPC.
In this mode, the direct internet access from notebooks is completely disabled, and all the traffic is routed through an ENI in your private VPC. This also includes traffic from Studio UI widgets and interfaces—such as experiment management, autopilot, and model monitor—to their respective backend SageMaker APIs. AWS recommends using VPC only mode to exercise fine-grained control on network access of Studio.
The first challenge is that even though Studio is deployed with no internet connectivity, Studio IDE can still be accessed from anywhere, assuming access to the AWS Management Console and Studio is granted to an IAM principal. This situation isn’t acceptable if you want to fully isolate Studio from a public network and contain all communication within a tightly controlled private VPC.
To address this challenge and disable any access to Studio IDE except from a designated VPC or a CIDR range, you can use the CreatePresignedDomainUrl SageMaker API. The IAM role or user used to call this API defines the permissions to access Studio. Now you can use IAM identity-based policies to implement the desired access configuration. For example, to enable access only from a designated VPC, add the following condition to the IAM policy, associated with an IAM principal, which is used to generate a presigned domain URL:
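The policy fragment is not reproduced in this copy of the post, so the following Python sketch only illustrates what such a statement could look like. It assumes the global condition key aws:SourceVpc; the helper name and the VPC ID are hypothetical placeholders, not values from the solution.

```python
import json


def presigned_url_statement_for_vpc(vpc_id: str) -> dict:
    """Build an IAM statement that allows CreatePresignedDomainUrl from one VPC only."""
    return {
        "Effect": "Allow",
        "Action": "sagemaker:CreatePresignedDomainUrl",
        "Resource": "*",
        # The request must arrive through a VPC endpoint located in this VPC.
        "Condition": {"StringEquals": {"aws:SourceVpc": vpc_id}},
    }


# "vpc-0123456789abcdef0" is a placeholder, not a real VPC ID.
statement = presigned_url_statement_for_vpc("vpc-0123456789abcdef0")
print(json.dumps(statement, indent=2))
```

The printed JSON can be pasted into the Statement list of an identity-based policy.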
To enable access only from a designated VPC endpoint or endpoints, specify the following condition:
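As an illustration only, a condition of this kind might be assembled as follows. The sketch assumes the global condition key aws:SourceVpce; both endpoint IDs are hypothetical placeholders.

```python
import json

# Placeholder interface endpoint IDs (hypothetical); replace with your own.
allowed_endpoints = ["vpce-0aaaaaaaaaaaaaaaa", "vpce-0bbbbbbbbbbbbbbbb"]

vpce_condition = {
    "Condition": {
        # The request matches if its endpoint ID equals any value in the list.
        "StringEquals": {"aws:SourceVpce": allowed_endpoints}
    }
}

print(json.dumps(vpce_condition, indent=2))
```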
Use the following condition to restrict access from a designated CIDR range:
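A hedged sketch of a CIDR-based condition follows. It assumes the aws:VpcSourceIp condition key that this post names later for the Lambda execution role; the helper name and the CIDR range are hypothetical. The CIDR string is validated locally before it is placed in the policy.

```python
import ipaddress
import json


def cidr_condition(cidr: str) -> dict:
    """Return an IpAddress condition for requests made through a VPC endpoint."""
    # ip_network raises ValueError for a malformed CIDR or one with host bits set.
    network = ipaddress.ip_network(cidr)
    return {"Condition": {"IpAddress": {"aws:VpcSourceIp": str(network)}}}


# "10.0.0.0/16" is a placeholder range.
print(json.dumps(cidr_condition("10.0.0.0/16"), indent=2))
```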
The second challenge is that IAM-based access control works only when the SageMaker domain is configured in IAM authentication mode; you can’t use it when the SageMaker domain is deployed in AWS SSO mode. The next section shows how to address these challenges and implement IAM-based access control with AWS SSO access to Studio.
Architecture overview
Studio is published as a SAML application, which is assigned to a specific SageMaker Studio user profile. Users can conveniently access Studio directly from the AWS SSO portal, as shown in the following screenshot.
The solution integrates with a custom SAML 2.0 application as the mechanism to trigger the user authentication for Studio. It requires that the custom SAML application is configured with the Amazon API Gateway endpoint URL as its Assertion Consumer Service (ACS), and needs mapping attributes containing the AWS SSO user ID as well as the SageMaker domain ID.
The API Gateway endpoint calls an AWS Lambda function that parses the SAML response to extract the domain ID and user ID, and uses them to generate a Studio presigned URL. The Lambda function then redirects the user via an HTTP 302 response to sign in to Studio.
An IAM policy controls the network environment that Studio users are allowed to log in from, which includes restricting conditions as described in the previous section. This IAM policy is attached to the Lambda function’s execution role. The IAM policy contains a permission to call the sagemaker:CreatePresignedDomainUrl API for a specific user profile only:
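The policy document is not reproduced here, so the sketch below shows one plausible shape for it rather than the policy shipped with the solution. The Region, account ID, domain ID, profile name, CIDR range, and both helper names are hypothetical placeholders.

```python
import json


def user_profile_arn(region: str, account_id: str, domain_id: str, profile: str) -> str:
    """Compose the ARN of a single Studio user profile."""
    return f"arn:aws:sagemaker:{region}:{account_id}:user-profile/{domain_id}/{profile}"


def presigned_url_policy(profile_arn: str, allowed_cidr: str) -> dict:
    """Allow CreatePresignedDomainUrl for one user profile from one CIDR range."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:CreatePresignedDomainUrl",
                "Resource": profile_arn,
                "Condition": {"IpAddress": {"aws:VpcSourceIp": allowed_cidr}},
            }
        ],
    }


# All values below are placeholders.
arn = user_profile_arn("eu-west-1", "111122223333", "d-exampledomain", "example-user")
policy = presigned_url_policy(arn, "10.0.0.0/16")
print(json.dumps(policy, indent=2))
```

Scoping Resource to one user profile ARN means the role cannot generate a presigned URL for any other profile in the domain.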
The following diagram shows the solution architecture.
The solution deploys a SageMaker domain into your private VPC and VPC endpoints to access Studio, SageMaker runtime, and the SageMaker API via a private connection without need for an internet gateway. The VPC endpoints are configured with private DNS enabled (PrivateDnsEnabled=True) to associate a private hosted zone with your VPC. This enables Studio to access the SageMaker API using the default public DNS name api.sagemaker.<Region>.amazonaws.com resolved to the private IP address of the endpoint rather than using the VPC endpoint URL.
You need to add VPC endpoints to your VPC if you want to access any other AWS services like Amazon Simple Storage Service (Amazon S3), Amazon Elastic Container Registry (Amazon ECR), AWS Security Token Service (AWS STS), AWS CloudFormation, or AWS CodeCommit.
You can fully control permissions used to generate the presigned URL and any other API calls with IAM policies attached to the Lambda function execution role or control access to any used AWS service via VPC endpoint policies. For examples of using IAM policies to control access to Studio and SageMaker API, refer to Control Access to the SageMaker API by Using Identity-based Policies.
Although the solution requires the Studio domain to be deployed in IAM mode, it does allow for AWS SSO to be used as the mechanism for end users to log in to Studio.
The following subsections contain detailed descriptions of the main solution components.
API Gateway
The API Gateway endpoint acts as the target for the application ACS URL configured in the custom SAML 2.0 application. The endpoint is private, and has a resource called /saml and a POST method with integration request configured as Lambda proxy. The solution uses a VPC endpoint with a configured com.amazonaws.<region>.execute-api DNS name to call this API endpoint from within the VPC.
AWS SSO
A custom SAML 2.0 application is configured with the API Gateway endpoint URL https://{restapi-id}.execute-api.amazonaws.com/saml as its application ACS URL, and uses attribute mappings with the following requirements:
- User identifier:
  - User attribute in the application – user name
  - Maps user attribute in AWS SSO – ${user:AD_GUID}
- SageMaker domain ID identifier:
  - User attribute in the application – domain-id
  - Maps user attribute in AWS SSO – Domain ID for the Studio instance
The application implements the access control for an AWS SSO user by provisioning a Studio user profile with the name equal to the AWS SSO user ID.
Lambda function
The solution configures a Lambda function as an invocation point for the API Gateway /saml resource. The function parses the SAMLResponse sent by AWS SSO, extracts the domain-id as well as the user name, and calls the CreatePresignedDomainUrl SageMaker API to retrieve the Studio URL and token. It then redirects the user to log in using an HTTP 302 response. The Lambda function has a specific IAM policy attached to its execution role that allows the sagemaker:CreatePresignedDomainUrl action only when it’s requested from a specific network CIDR range using the VpcSourceIp condition.
The Lambda function doesn’t have any logic to validate the SAML response, for example to check a signature. However, because the API Gateway endpoint serving as the ACS is private (internal only), this validation isn’t mandatory for this proof-of-concept environment.
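The flow described in this section can be sketched in Python as follows. This is not the function from the GitHub repository. It assumes an API Gateway Lambda proxy event with a form-encoded body, and it assumes the SAML attributes are named username and domain-id. The optional sagemaker_client parameter is a hypothetical addition so the handler can be exercised without AWS credentials. Like the proof of concept, the sketch does not validate the SAML signature.

```python
import base64
import urllib.parse
import xml.etree.ElementTree as ET

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def extract_saml_attributes(event: dict) -> dict:
    """Return {attribute name: first value} from the SAMLResponse form field."""
    body = event["body"]
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    form = urllib.parse.parse_qs(body)
    # The SAMLResponse field carries the base64-encoded XML document.
    saml_xml = base64.b64decode(form["SAMLResponse"][0])
    root = ET.fromstring(saml_xml)
    attributes = {}
    for attr in root.iterfind(".//saml:Attribute", SAML_NS):
        value = attr.find("saml:AttributeValue", SAML_NS)
        if value is not None:
            attributes[attr.get("Name")] = (value.text or "").strip()
    return attributes


def handler(event, context, sagemaker_client=None):
    """Exchange the SAML attributes for a presigned Studio URL and redirect."""
    if sagemaker_client is None:
        import boto3  # imported lazily; provided by the Lambda Python runtime

        sagemaker_client = boto3.client("sagemaker")
    attributes = extract_saml_attributes(event)
    response = sagemaker_client.create_presigned_domain_url(
        DomainId=attributes["domain-id"],      # assumed attribute name
        UserProfileName=attributes["username"],  # assumed attribute name
    )
    # HTTP 302 sends the browser to the presigned Studio URL.
    return {"statusCode": 302, "headers": {"Location": response["AuthorizedUrl"]}}
```

A production version would verify the assertion signature and use a hardened XML parser before trusting any attribute value.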
Deploy the solution
The GitHub repository provides the full source code for the end-to-end solution.
To deploy the solution, you must have administrator (or power user) permissions for an AWS account. You also need the AWS Command Line Interface (AWS CLI), the AWS SAM CLI, and Python 3.8 or later installed.
The solution supports deployment to three AWS Regions: eu-west-1, eu-central-1, and us-east-1. Make sure you select one of these Regions for deployment.
To start testing the solution, you must complete the following deployment steps from the solution’s GitHub README file:
- Set up AWS SSO if you don’t have it configured.
- Deploy the solution using the SAM application.
- Create a new custom SAML 2.0 application.
After you complete the deployment steps, you can proceed with the solution test.
Test the solution
The solution simulates two use cases to demonstrate the usage of AWS SSO and SageMaker identity-based policies:
- Positive use case – A user accesses Studio from within a designated CIDR range through a VPC endpoint
- Negative use case – A user accesses Studio from a public IP address
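To make the two outcomes concrete, the following sketch mimics locally the CIDR match that an IpAddress condition performs. It is only an illustration: the real decision is made by IAM when the presigned URL is requested, and the addresses and the function name are hypothetical.

```python
import ipaddress


def is_authorized(source_ip: str, allowed_cidr: str) -> bool:
    """Return True when source_ip falls inside allowed_cidr."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(allowed_cidr)


# Positive use case: a private address inside the designated range.
print(is_authorized("10.0.1.25", "10.0.0.0/16"))     # True
# Negative use case: a public address outside the range.
print(is_authorized("203.0.113.10", "10.0.0.0/16"))  # False
```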
To test these use cases, the solution creates three Amazon Elastic Compute Cloud (Amazon EC2) instances:
- Private host – An EC2 Windows instance in a private subnet that is able to access Studio (your on-premises secured environment)
- Bastion host – An EC2 Linux instance in the public subnet used to establish an SSH tunnel into the private host on the private network
- Public host – An EC2 Windows instance in a public subnet to demonstrate that the user can’t access Studio from an unauthorized IP address
Test Studio access from an authorized network
Follow these steps to perform the test:
- To access the EC2 Windows instance on the private network, run the command provided as the value of the SAM output key TunnelCommand. Make sure that the private key of the key pair specified in the parameter is in the directory where the SSH tunnel command runs from. The command creates an SSH tunnel from the local computer on localhost:3389 to the EC2 Windows instance on the private network. See the following example code:
- On your local desktop or notebook, open a new RDP connection (for example using Microsoft Remote Desktop) using localhost as the target remote host. This connection is tunneled via the bastion host to the private EC2 Windows instance. Use the user name Administrator and password from the stack output SageMakerWindowsPassword.
- Open the Firefox web browser from the remote desktop.
- Navigate and log in to the AWS SSO portal using the credentials associated with the user name that you specified as the ssoUserName parameter.
- Choose the SageMaker Secure Demo AWS SSO application from the AWS SSO portal.
You’re redirected to the Studio IDE in a new browser window.
Test Studio access from an unauthorized network
Now follow these steps to simulate access from an unauthorized network:
- Open a new RDP connection to the IP address provided in the SageMakerWindowsPublicHost SAM output.
- Open the Firefox web browser from the remote desktop.
- Navigate and log in to the AWS SSO portal using the credentials associated with the user name that was specified as the ssoUserName parameter.
- Choose the SageMaker Secure Demo AWS SSO application from the AWS SSO portal.
This time you receive an unauthorized access message.
Clean up
To avoid charges, you must remove all solution-provisioned and manually created resources from your AWS account. Follow the instructions in the solution’s README file.
Conclusion
We demonstrated that by introducing a middleware authentication layer between the end user and Studio, we can control the environment from which a user is allowed to access Studio and explicitly block every other, unauthorized environment.
To further tighten security, you can add an IAM policy to a user role to prevent access to Studio from the console. If you use AWS Organizations, you can implement the following service control policy for the organizational units or accounts that need access to Studio:
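The service control policy is not included in this copy of the post. As a hedged sketch, a policy with this intent could deny sagemaker:CreatePresignedDomainUrl whenever the request does not come from an authorized private range. The Sid, the CIDR range, and the helper name are hypothetical.

```python
import json


def deny_presigned_url_outside_cidr(allowed_cidr: str) -> dict:
    """Build an SCP that denies presigned URL creation outside allowed_cidr."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPresignedUrlOutsideAuthorizedNetwork",
                "Effect": "Deny",
                "Action": "sagemaker:CreatePresignedDomainUrl",
                "Resource": "*",
                # Also matches requests that carry no VPC source IP at all,
                # such as requests made from the console over the internet.
                "Condition": {"NotIpAddress": {"aws:VpcSourceIp": allowed_cidr}},
            }
        ],
    }


# "10.0.0.0/16" is a placeholder for the authorized private subnet range.
scp = deny_presigned_url_outside_cidr("10.0.0.0/16")
print(json.dumps(scp, indent=2))
```

Because an SCP only filters permissions, the affected accounts still need an Allow for SageMaker actions from another SCP or from their identity-based policies.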
Although the solution described in this post uses API Gateway and Lambda, you can explore other ways such as an EC2 instance with an instance role using the same permission validation workflow as described or even an independent system to handle user authentication and authorization and generate a Studio presigned URL.
Further reading
Securing access to Studio is an active research topic, and there are other relevant posts on similar approaches. Refer to the following posts on the AWS Machine Learning Blog to learn more about other services and architectures you can use:
- Launch Amazon SageMaker Studio from external applications using presigned URLs
- Building secure Amazon SageMaker access URLs with AWS Service Catalog
- Mitigate data leakage through the use of AppStream 2.0 and end-to-end auditing
- Understanding Amazon SageMaker notebook instance networking configurations and advanced routing options
- Securing Amazon SageMaker Studio connectivity using a private VPC
About the Authors
Jerome Bachelet is a Solutions Architect at Amazon Web Services. He thrives on helping customers get the most value out of AWS to achieve their business objectives. Jerome has over 10 years of experience working with data protection and data security solutions. Besides being in the cloud, Jerome enjoys travels and quality time with his wife and 2 daughters in the Geneva, Switzerland area.
Yevgeniy Ilyin is a Solutions Architect at AWS. He has over 20 years of experience working at all levels of software development and solutions architecture and has used programming languages from COBOL and Assembler to .NET, Java, and Python. He develops and codes cloud native solutions with a focus on big data, analytics, and data engineering.