Amazon SageMaker Studio
The fully integrated development environment (IDE) for machine learning
Perform all ML development steps, from preparing raw data to deploying and monitoring ML models, with access to the most comprehensive set of tools in a single web-based visual interface.
Quickly move between steps of the ML lifecycle to fine-tune your models. Replay training experiments, tune model features and other inputs, and compare results, without leaving SageMaker Studio.
Build ML models in minutes with access to over 150 popular open-source models and over 15 prebuilt solutions. Create ML models with your own data with just a few clicks.
Amazon SageMaker Studio is an integrated development environment (IDE) that provides a single web-based visual interface where you can access purpose-built tools for every machine learning (ML) development step, from preparing data to building, training, and deploying your ML models, which can improve data science team productivity by up to 10x. You can quickly upload data, create new notebooks, train and tune models, move back and forth between steps to adjust experiments, collaborate within your organization, and deploy models to production without leaving SageMaker Studio.
How it works
Prepare data in a few clicks, using little to no code
Connect to over 40 AWS and third-party data sources, import data, verify data quality, engineer model features using 300+ built-in data transformations, and save them to SageMaker Feature Store in a few clicks using SageMaker Data Wrangler. You can create or schedule Data Wrangler jobs to process data at scale, and automate data preparation steps in the ML workflow using SageMaker Pipelines.
Prepare data using SageMaker Studio notebooks
Simplify your data workflows with a unified notebook environment for data engineering, analytics, and ML. Create, browse, and connect to Amazon EMR clusters and AWS Glue Interactive Sessions directly from Studio notebooks. Monitor and debug Spark jobs using familiar tools such as Spark UI right from the notebooks. Use the built-in data preparation capability powered by SageMaker Data Wrangler directly from Studio notebooks to visualize data, identify data quality issues, and apply recommended solutions to improve data quality and model accuracy without writing a single line of code.
Data processing in a few clicks
Use SageMaker Processing to connect to data stores, spin up the resources to run your data processing job, save the output to persistent storage, and collect logs and metrics.
A central Feature Store
Store, share, and manage ML model features for training and inference to promote feature reuse across ML applications using SageMaker Feature Store, a fully managed, purpose-built repository in SageMaker Studio. You get the same features consistently both during training and during inference, saving months of development effort.
Quick start SageMaker Studio notebooks
Access fully managed Jupyter notebooks in SageMaker Studio in one click. SageMaker Studio notebooks come preconfigured with deep learning environments for TensorFlow and PyTorch (optimized by AWS) to help you quickly get started with model building. You can dial up or down the underlying compute resources without interrupting your work.
Streamlined notebook collaboration
Coedit the same notebook file, run notebook code simultaneously, and review the results together to streamline collaboration. All resources are automatically tagged, making it easier to monitor cost and usage of SageMaker Studio.
Use over 15 built-in algorithms available in prebuilt container images to quickly train and run inference, or bring your own custom images to SageMaker Studio.
Automatically build, train, and tune the best ML models based on your data while maintaining full control and visibility using SageMaker Autopilot. Then directly deploy models to production with just one click. You can also automatically generate a SageMaker Studio notebook for any model SageMaker Autopilot creates and dive into the details of how it was created, refine it as desired, and recreate it from the notebook.
Prebuilt solutions and open-source models
Quickly get started with ML using hundreds of prebuilt solutions that can be deployed with just a few clicks using SageMaker JumpStart.
Set up a distributed compute cluster, perform the training, output results to Amazon Simple Storage Service (S3), and tear down the cluster in a single click. Train models at scale using SageMaker data parallel and model parallel libraries, and accelerate training processes by up to 50% through graph- and kernel-level optimizations by using SageMaker Training Compiler. You can reduce costs by up to 90% by using managed spot instance training.
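As a rough illustration of how a spot-training saving can be expressed, the sketch below compares the seconds a job spent training with the seconds that were actually billed. It assumes both values are available for a finished job; the function name and the sample numbers are hypothetical and are not taken from this page.

```python
def spot_savings_percent(training_seconds: float, billable_seconds: float) -> float:
    """Return the share of training time that was not billed, as a percentage."""
    if training_seconds <= 0:
        raise ValueError("training_seconds must be positive")
    if not 0 <= billable_seconds <= training_seconds:
        raise ValueError("billable_seconds must lie between 0 and training_seconds")
    return (1 - billable_seconds / training_seconds) * 100


# Hypothetical job: 3,600 seconds of training, of which 900 were billed.
print(f"{spot_savings_percent(3600, 900):.1f}% saved")  # prints "75.0% saved"
```

A job billed for its full training time returns 0, so the "up to 90%" figure corresponds to a job billed for about a tenth of its training time.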
Experiment management and tracking
Track iterations to ML models by capturing the input parameters, configurations, and results, and storing them as experiments using SageMaker Experiments. You can browse active experiments, search and review previous experiments, and compare results across experiments.
Automatic model tuning
Automatically tune your model by adjusting thousands of combinations of algorithm parameters to arrive at the most accurate predictions the model is capable of producing, saving weeks of effort.
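To make the idea of searching over parameter combinations concrete, here is a minimal sketch of plain random search in pure Python. It is not the search strategy the service uses and it does not call SageMaker. The objective function, the parameter names, and their ranges are all hypothetical stand-ins for a real training-and-validation run.

```python
import random


def toy_objective(learning_rate: float, max_depth: int) -> float:
    """Stand-in validation score that peaks at learning_rate=0.1, max_depth=6."""
    return 1.0 - abs(learning_rate - 0.1) - 0.02 * abs(max_depth - 6)


def random_search(n_trials: int, seed: int = 0) -> dict:
    """Sample n_trials parameter combinations and keep the best-scoring one."""
    if n_trials < 1:
        raise ValueError("n_trials must be at least 1")
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    best = None
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),
            "max_depth": rng.randint(2, 10),
        }
        score = toy_objective(**params)
        if best is None or score > best["score"]:
            best = {"params": params, "score": score}
    return best


best = random_search(n_trials=50)
print(best["params"], round(best["score"], 3))
```

In a real tuning job each trial would be a full training run, which is why automating the search saves so much manual effort.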
Debug and profile training runs
Capture metrics and profile training jobs in real time so you can correct performance problems quickly, before the model is deployed to production, using SageMaker Debugger.
Deploy and manage
Deploy your trained model into production with a single click. Access SageMaker Model Deployment within SageMaker Studio for all your inference needs, from low latency (a few milliseconds) and high throughput (hundreds of thousands of requests per second) to long-running inference for use cases such as natural language processing and computer vision.
Deploy thousands of models on a single endpoint using SageMaker’s multi-model endpoints and multi-container endpoints, improving cost-effectiveness while providing the flexibility to use models as often as you need them.
Centrally track and manage model versions
Track model versions, their metadata, and performance using SageMaker Model Registry, making it easier to choose the right model for deployment based on your business requirements. In addition, you can automatically log approval workflows for audit and compliance.
Rapidly deliver new models for production applications
Bring continuous integration and delivery (CI/CD) practices to ML using SageMaker Projects, such as maintaining parity between development and production environments, source and version control, A/B testing, and automation.
Ongoing model monitoring
Maintain quality by detecting model drift and concept drift in real time using SageMaker Model Monitor within SageMaker Studio. All models trained in SageMaker automatically emit key metrics that can be collected and viewed in SageMaker Studio.
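The sketch below shows one generic way to flag drift: compare the mean of live values for a feature against a baseline captured at training time, measured in baseline standard deviations. It is an assumption-laden stand-in, not the method Model Monitor uses. The function names, the threshold of 2.0, and the sample values are hypothetical.

```python
from statistics import mean, stdev


def mean_shift_in_sigmas(baseline, live) -> float:
    """Distance between live and baseline means, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variance")
    return abs(mean(live) - mean(baseline)) / spread


def has_drifted(baseline, live, threshold: float = 2.0) -> bool:
    """Flag drift when the live mean moves more than `threshold` sigmas."""
    return mean_shift_in_sigmas(baseline, live) > threshold


# Hypothetical feature values: one baseline sample and two live samples.
baseline = [10.0, 11.0, 9.0, 10.0, 10.0]
live_stable = [10.0, 10.5, 9.5, 10.0]
live_shifted = [14.0, 15.0, 13.0, 14.0]

print(has_drifted(baseline, live_stable))   # False
print(has_drifted(baseline, live_shifted))  # True
```

A production check would normally compare whole distributions rather than means alone, but the structure (baseline, live sample, threshold) is the same.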
Automatic conversion of notebook code to production-ready jobs
Once you select a notebook, SageMaker Studio takes a snapshot of the entire notebook, packages its dependencies in a container, builds the infrastructure, runs the notebook as an automated job on a schedule that you set, and deprovisions the infrastructure when the job completes. This reduces the time it takes to move a notebook to production from weeks to hours.
Automate model building workflows
Automate the entire model build workflow, including data preparation, feature engineering, model training, model tuning, and model validation using SageMaker Pipelines. You can configure SageMaker Pipelines to run automatically at regular intervals or when certain events occur, or run them manually as needed.
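The workflow described above can be pictured as a dependency graph in which each step runs only after the steps it depends on. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to derive a valid execution order. It does not use the SageMaker Pipelines SDK, and the step names are hypothetical labels for the stages listed in the text.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps that must finish before it starts.
workflow = {
    "prepare_data": set(),
    "engineer_features": {"prepare_data"},
    "train_model": {"engineer_features"},
    "tune_model": {"train_model"},
    "validate_model": {"tune_model"},
}


def execution_order(dag: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


for position, step in enumerate(execution_order(workflow), start=1):
    print(position, step)
```

Declaring dependencies instead of a fixed sequence lets independent steps be added later without rewriting the order by hand.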
Detect bias in ML models
Detect and limit potential bias during data preparation, after model training, and in your deployed model by examining attributes you specify using SageMaker Clarify. SageMaker Clarify also provides model explainability reports, so stakeholders can see how and why models make predictions.
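As an illustration of what examining an attribute for bias can look like before training, the sketch below computes two simple dataset measures: class imbalance between two groups, and the difference in the proportion of positive labels between them. The definitions follow common usage and are assumed here rather than taken from this page. The group names "a" and "d" and the sample rows are hypothetical.

```python
from typing import Sequence, Tuple

Row = Tuple[str, int]  # (group, label), where label 1 is the positive outcome


def class_imbalance(rows: Sequence[Row], group_a: str = "a", group_d: str = "d") -> float:
    """(n_a - n_d) / (n_a + n_d): 0 means balanced, +1 or -1 means one group only."""
    n_a = sum(1 for group, _ in rows if group == group_a)
    n_d = sum(1 for group, _ in rows if group == group_d)
    if n_a + n_d == 0:
        raise ValueError("no rows belong to either group")
    return (n_a - n_d) / (n_a + n_d)


def positive_label_gap(rows: Sequence[Row], group_a: str = "a", group_d: str = "d") -> float:
    """Positive-label rate of group_a minus positive-label rate of group_d."""
    def positive_rate(target: str) -> float:
        labels = [label for group, label in rows if group == target]
        if not labels:
            raise ValueError(f"no rows for group {target!r}")
        return sum(labels) / len(labels)

    return positive_rate(group_a) - positive_rate(group_d)


# Hypothetical dataset: six rows in group "a", two rows in group "d".
rows = [("a", 1), ("a", 1), ("a", 1), ("a", 0), ("a", 1), ("a", 0), ("d", 1), ("d", 0)]

print(class_imbalance(rows))                # 0.5
print(round(positive_label_gap(rows), 3))   # 0.167
```

Values near zero suggest the two groups are similarly represented and labeled; larger magnitudes are a prompt to look more closely at the data.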
With SageMaker Studio, AstraZeneca was able to rapidly deploy a solution to analyze large amounts of data, accelerating insights while reducing the manual workload of its data scientists—crucial to AstraZeneca’s mission of discovering and developing life-changing medicines for people around the world.
“Rather than creating many manual processes, we can automate most of the ML development process simply within Amazon SageMaker Studio.”
Cherry Cabading, Global Senior Enterprise Architect – AstraZeneca
INVISTA used Amazon SageMaker Experiments within Studio for model tracking. With an easy interface to manage experiments, get a broader scope of projects, and add new models, metrics, and performance in a structured way, INVISTA accelerated data science value.
“With Amazon SageMaker Studio, we’re now able to co-locate data science tasks. This allows us to save time managing infrastructure and repositories and helps us reduce the time to deploy algorithms and analytics projects into production.”
Tanner Gonzalez, Analytics and Cloud Leader – INVISTA
With SageMaker Studio and Experiments, SyntheticGestalt can determine the best experiment settings 2x faster, which ultimately accelerates the ability to produce life-changing candidate molecules.
“SageMaker helps our researchers easily compare thousands of experiment settings; they are able to do with a single step what previously consumed hours of our researchers’ time.”
Kotaro Kamiya, CTO – SyntheticGestalt Ltd.
Onboard quickly to SageMaker Studio
SageMaker Studio technical deep dive
Using Apache Spark on Amazon EMR with SageMaker
Stay up to date with SageMaker Studio announcements
SageMaker Studio blog series
Scale data preparation using Amazon SageMaker Studio notebooks
Build ML models using Amazon SageMaker Studio notebooks
SageMaker Studio administration best practices
Follow this step-by-step tutorial to build and train a machine learning (ML) model locally within Amazon SageMaker Studio.
In this hands-on lab, learn how to use Amazon SageMaker to build, train, and deploy an ML model.
Get started building with Amazon SageMaker in the AWS Management Console.