AWS Machine Learning Blog

Host code-server on Amazon SageMaker

Machine learning (ML) teams need the flexibility to choose their integrated development environment (IDE) when working on a project. This choice supports a productive developer experience and lets teams innovate at speed. You may even use multiple IDEs within a project. Amazon SageMaker lets ML teams work from fully managed, cloud-based environments within Amazon SageMaker Studio or SageMaker notebook instances, or from a local machine using local mode.

SageMaker provides one-click access to Jupyter and RStudio to build, train, debug, deploy, and monitor ML models. In this post, we also share a solution for hosting code-server on SageMaker.

With code-server, users can run VS Code on remote machines and access it in a web browser. For ML teams, hosting code-server on SageMaker keeps the experience close to local development, and lets you code from anywhere on scalable cloud compute. With VS Code, you can also use built-in Conda environments with AWS-optimized TensorFlow and PyTorch, managed Git repositories, local mode, and other features provided by SageMaker to speed up your delivery. For IT admins, it standardizes and expedites the provisioning of managed, secure IDEs in the cloud, so they can quickly onboard and enable ML teams in their projects.

Solution overview

In this post, we cover installation for both Studio environments (Option A), and notebook instances (Option B). For each option, we walk through a manual installation process that ML teams can run in their environment, and an automated installation that IT admins can set up for them via the AWS Command Line Interface (AWS CLI).

The following diagram illustrates the architecture overview for hosting code-server on SageMaker.

ml-10244-architecture-overview

Our solution speeds up the install and setup of code-server in your environment. It works with both JupyterLab 3 (recommended) and JupyterLab 1 running within Studio and SageMaker notebook instances. It consists of shell scripts that do the following, depending on the option.

Note: The solution requires internet access from the Amazon SageMaker Studio environment, to download and install the required artifacts. If internet access is not available or it is limited via a proxy/firewall, you can try modifying the install scripts to download and install the artifacts (code-server, JupyterLab extensions, code-server extensions, etc.) from Amazon S3 or the local EFS file system.
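As a rough illustration of the note above, the following sketch chooses between the public GitHub release and an S3 copy of the archive. The `ARTIFACT_BUCKET` variable and the bucket layout are assumptions made for this example and are not part of the published install scripts. The function only prints the command it would run, so you can review it before adapting the scripts.

```shell
#!/usr/bin/env bash
# Sketch: decide where to fetch the release archive from.
# ARTIFACT_BUCKET is a hypothetical variable naming an S3 bucket that an
# admin has pre-populated with the archive; the solution does not define it.

VERSION="0.1.5"
ARCHIVE="amazon-sagemaker-codeserver-${VERSION}.tar.gz"
GITHUB_URL="https://github.com/aws-samples/amazon-sagemaker-codeserver/releases/download/v${VERSION}/${ARCHIVE}"

# Print the download command without running it.
# If ARTIFACT_BUCKET is set and non-empty, prefer S3; otherwise use GitHub.
download_command() {
    if [ -n "${ARTIFACT_BUCKET:-}" ]; then
        printf 'aws s3 cp s3://%s/%s .\n' "$ARTIFACT_BUCKET" "$ARCHIVE"
    else
        printf 'curl -LO %s\n' "$GITHUB_URL"
    fi
}

download_command
```

Setting `ARTIFACT_BUCKET` before calling `download_command` switches the printed command to the S3 copy. The same idea would need to be applied to the other artifacts the note mentions, such as the JupyterLab and code-server extensions.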

For Studio (Option A), the shell script does the following:

For SageMaker notebook instances (Option B), the shell script does the following:

  • Installs code-server.
  • Adds a code-server shortcut on the Jupyter notebook file menu and JupyterLab launcher for fast access to the IDE.
  • Creates a dedicated Conda environment for managing dependencies.
  • Installs the Python and Docker extensions on the IDE.

In the following sections, we walk through the solution install process for Option A and Option B. Make sure you have access to Studio or a notebook instance.

Option A: Host code-server on Studio

To host code-server on Studio, complete the following steps:

  1. Choose System terminal in your Studio launcher.
    ml-10244-studio-terminal-click
  2. To install the code-server solution, run the following commands in your system terminal:
    curl -LO https://github.com/aws-samples/amazon-sagemaker-codeserver/releases/download/v0.1.5/amazon-sagemaker-codeserver-0.1.5.tar.gz
    tar -xvzf amazon-sagemaker-codeserver-0.1.5.tar.gz
    
    cd amazon-sagemaker-codeserver/install-scripts/studio
     
    chmod +x install-codeserver.sh
    ./install-codeserver.sh
    
    # Note: when installing on JupyterLab 1 (JL1), prepend nohup to the install command above:
    # nohup ./install-codeserver.sh

    The commands should take a few seconds to complete.

  3. Reload the browser page. A Code Server button now appears in your Studio launcher.
    ml-10244-code-server-button
  4. Choose Code Server to open a new browser tab, allowing you to access code-server from your browser.
    The Python extension is already installed, and you can get to work in your ML project.
    ml-10244-vscode

You can open your project folder in VS Code and select the pre-built Conda environment to run your Python scripts.

ml-10244-vscode-conda
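The JupyterLab 1 note in step 2 can be scripted. The following sketch picks the install invocation from a JupyterLab version string. The function name `install_command` is hypothetical, and the example passes fixed version strings. In a real terminal you might pass the output of `jupyter lab --version` instead, assuming that command is available on your PATH.

```shell
#!/usr/bin/env bash
# Sketch: choose the install invocation from a JupyterLab version string.
# install_command is a hypothetical helper, not part of the solution.

install_command() {
    version="$1"
    major="${version%%.*}"   # text before the first dot, e.g. "3" from "3.4.2"
    if [ "$major" = "1" ]; then
        # JupyterLab 1: prepend nohup, as the install note recommends
        echo "nohup ./install-codeserver.sh"
    else
        echo "./install-codeserver.sh"
    fi
}

# Fixed example inputs; replace with "$(jupyter lab --version)" if available.
install_command "1.2.21"
install_command "3.4.2"
```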

Automate the code-server install for users in a Studio domain

As an IT admin, you can automate the installation for Studio users by using a lifecycle configuration. You can apply it to all user profiles under a Studio domain or to specific ones. See Customize Amazon SageMaker Studio using Lifecycle Configurations for more details.

For this post, we create a lifecycle configuration from the install-codeserver script, and attach it to an existing Studio domain. The install is done for all the user profiles in the domain.

From a terminal configured with the AWS CLI and appropriate permissions, run the following commands:

curl -LO https://github.com/aws-samples/amazon-sagemaker-codeserver/releases/download/v0.1.5/amazon-sagemaker-codeserver-0.1.5.tar.gz
tar -xvzf amazon-sagemaker-codeserver-0.1.5.tar.gz

cd amazon-sagemaker-codeserver/install-scripts/studio

LCC_CONTENT=$(openssl base64 -A -in install-codeserver.sh)

aws sagemaker create-studio-lifecycle-config \
    --studio-lifecycle-config-name install-codeserver-on-jupyterserver \
    --studio-lifecycle-config-content "$LCC_CONTENT" \
    --studio-lifecycle-config-app-type JupyterServer \
    --query 'StudioLifecycleConfigArn'

aws sagemaker update-domain \
    --region <your_region> \
    --domain-id <your_domain_id> \
    --default-user-settings \
    '{
        "JupyterServerAppSettings": {
            "DefaultResourceSpec": {
                "LifecycleConfigArn": "arn:aws:sagemaker:<your_region>:<your_account_id>:studio-lifecycle-config/install-codeserver-on-jupyterserver",
                "InstanceType": "system"
            },
            "LifecycleConfigArns": [
                "arn:aws:sagemaker:<your_region>:<your_account_id>:studio-lifecycle-config/install-codeserver-on-jupyterserver"
            ]
        }
    }'

# Make sure to replace <your_domain_id>, <your_region> and <your_account_id> in the previous commands with
# the Studio domain ID, the AWS region and AWS Account ID you are using respectively.

After Jupyter Server restarts, the Code Server button appears in your Studio launcher.
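Instead of editing the placeholders by hand, you could build the lifecycle configuration ARN and the settings JSON from variables. The following is a minimal sketch. The helper names `lcc_arn` and `default_user_settings` are hypothetical, and the Region and account ID are placeholder values rather than real ones.

```shell
#!/usr/bin/env bash
# Sketch: build the lifecycle configuration ARN and the default user
# settings JSON from variables. Helper names are hypothetical.

# $1 = AWS Region, $2 = account ID, $3 = lifecycle configuration name
lcc_arn() {
    printf 'arn:aws:sagemaker:%s:%s:studio-lifecycle-config/%s\n' "$1" "$2" "$3"
}

# $1 = lifecycle configuration ARN; prints compact JSON on one line
default_user_settings() {
    printf '{"JupyterServerAppSettings":{"DefaultResourceSpec":{"LifecycleConfigArn":"%s","InstanceType":"system"},"LifecycleConfigArns":["%s"]}}\n' "$1" "$1"
}

# Placeholder values for illustration only
ARN=$(lcc_arn "us-east-1" "111122223333" "install-codeserver-on-jupyterserver")
default_user_settings "$ARN"
```

You could then pass the result as `--default-user-settings "$(default_user_settings "$ARN")"`. Alternatively, adding `--output text` to the `create-studio-lifecycle-config` call prints the ARN as plain text, which you can capture directly rather than reconstruct.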

Option B: Host code-server on a SageMaker notebook instance

To host code-server on a SageMaker notebook instance, complete the following steps:

  1. Launch a terminal via Jupyter or JupyterLab for your notebook instance.
    If you use Jupyter, choose Terminal on the New menu.
  2. To install the code-server solution, run the following commands in your terminal:
    curl -LO https://github.com/aws-samples/amazon-sagemaker-codeserver/releases/download/v0.1.5/amazon-sagemaker-codeserver-0.1.5.tar.gz
    tar -xvzf amazon-sagemaker-codeserver-0.1.5.tar.gz
    
    cd amazon-sagemaker-codeserver/install-scripts/notebook-instances
     
    chmod +x install-codeserver.sh
    chmod +x setup-codeserver.sh
    sudo ./install-codeserver.sh
    sudo ./setup-codeserver.sh

    The code-server and extensions installations are persistent on the notebook instance. However, if you stop or restart the instance, you need to run the following command to reconfigure code-server:

    sudo ./setup-codeserver.sh

    The commands should take a few seconds to run. You can close the terminal tab when you see the following.

    ml-10244-terminal-output

  3. Now reload the Jupyter page and check the New menu again.
    The Code Server option should now be available.

You can also launch code-server from JupyterLab using a dedicated launcher button, as shown in the following screenshot.

ml-10244-jupyterlab-code-server-button

Choosing Code Server will open a new browser tab, allowing you to access code-server from your browser. The Python and Docker extensions are already installed, and you can get to work in your ML project.

ml-10244-notebook-vscode

Automate the code-server install on a notebook instance

As an IT admin, you can automate the code-server install with a lifecycle configuration running on instance creation, and automate the setup with one running on instance start.

Here, we create an example notebook instance and lifecycle configuration using the AWS CLI. The on-create config runs install-codeserver, and on-start runs setup-codeserver.

From a terminal configured with the AWS CLI and appropriate permissions, run the following commands:

curl -LO https://github.com/aws-samples/amazon-sagemaker-codeserver/releases/download/v0.1.5/amazon-sagemaker-codeserver-0.1.5.tar.gz
tar -xvzf amazon-sagemaker-codeserver-0.1.5.tar.gz

cd amazon-sagemaker-codeserver/install-scripts/notebook-instances

aws sagemaker create-notebook-instance-lifecycle-config \
    --notebook-instance-lifecycle-config-name install-codeserver \
    --on-start Content="$(openssl base64 -A -in setup-codeserver.sh)" \
    --on-create Content="$(openssl base64 -A -in install-codeserver.sh)"

aws sagemaker create-notebook-instance \
    --notebook-instance-name <your_notebook_instance_name> \
    --instance-type <your_instance_type> \
    --role-arn <your_role_arn> \
    --lifecycle-config-name install-codeserver

# Make sure to replace <your_notebook_instance_name>, <your_instance_type>,
# and <your_role_arn> in the previous commands with the appropriate values.

The code-server install is now automated for the notebook instance.
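Before creating a lifecycle configuration, it can help to check the encoded script locally. The sketch below encodes a script as a single base64 line and compares its length with a 16384-character limit. That limit is our understanding of the API constraint on lifecycle configuration content, so confirm it against the current API reference. The helper names and the demo script are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: encode a script as one base64 line and check its length.
# MAX_LCC_CHARS is an assumed limit; verify it in the SageMaker API reference.
MAX_LCC_CHARS=16384

# $1 = path to a script; prints its base64 encoding on a single line.
# tr strips the line wraps that some base64 builds insert.
encode_script() {
    base64 < "$1" | tr -d '\n'
}

# $1 = path to a script; returns 1 if the encoded content exceeds the limit.
check_lcc_size() {
    encoded=$(encode_script "$1")
    length=${#encoded}
    if [ "$length" -gt "$MAX_LCC_CHARS" ]; then
        echo "too large: $length characters"
        return 1
    fi
    echo "ok: $length characters"
}

# Demo with a small throwaway script
demo=$(mktemp)
printf '#!/bin/bash\necho hello\n' > "$demo"
check_lcc_size "$demo"
rm -f "$demo"
```

Running `check_lcc_size install-codeserver.sh` and `check_lcc_size setup-codeserver.sh` from the `notebook-instances` folder would apply the same check to the scripts used above.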

Conclusion

With code-server hosted on SageMaker, ML teams can run VS Code on scalable cloud compute, code from anywhere, and speed up their ML project delivery. For IT admins, it allows them to standardize and expedite the provisioning of managed, secure IDEs in the cloud, to quickly onboard and enable ML teams in their projects.

In this post, we shared a solution you can use to quickly install code-server on both Studio and notebook instances. We shared a manual installation process that ML teams can run on their own, and an automated installation that IT admins can set up for them.

To learn more, visit AWSome SageMaker on GitHub, which collects relevant and up-to-date resources for working with SageMaker.


About the Authors

Giuseppe Angelo Porcelli is a Principal Machine Learning Specialist Solutions Architect for Amazon Web Services. With several years of software engineering and an ML background, he works with customers of any size to deeply understand their business and technical needs and design AI and Machine Learning solutions that make the best use of the AWS Cloud and the Amazon Machine Learning stack. He has worked on projects in different domains, including MLOps, Computer Vision, and NLP, involving a broad set of AWS services. In his free time, Giuseppe enjoys playing football.

Sofian Hamiti is an AI/ML specialist Solutions Architect at AWS. He helps customers across industries accelerate their AI/ML journey by helping them build and operationalize end-to-end machine learning solutions.

Eric Pena is a Senior Technical Product Manager in the AWS Artificial Intelligence Platforms team, working on Amazon SageMaker Interactive Machine Learning. He currently focuses on IDE integrations on SageMaker Studio. He holds an MBA degree from MIT Sloan and, outside of work, enjoys playing basketball and football.