Amazon SageMaker for IT Ops

ML Ops

Amazon SageMaker makes it easy for IT engineers to deploy ML models into production. You can create and automate workflows that support the development of thousands of models, using scalable infrastructure and continuous integration and continuous delivery (CI/CD) pipelines.

Collect and prepare training data

Amazon SageMaker has everything you need to aggregate data from disparate sources and prepare it for machine learning.

Easily connect to data sources

Using Amazon SageMaker Data Wrangler, you can connect to data sources, such as Amazon Athena, Amazon Redshift, AWS Lake Formation, and Amazon S3, and easily import data in various file formats, such as CSV files, unstructured JSON files, and database tables directly into SageMaker. You can also easily create a data pipeline in just a few clicks.

Secure

Amazon SageMaker allows you to operate in a fully secure ML environment from day one. You can use a comprehensive set of security features, including infrastructure security, access control, data protection, and up-to-date compliance certifications across a broad range of industry verticals.

Build models

As machine learning proliferates across business units, Amazon SageMaker makes sure your infrastructure can scale to keep up with building hundreds or thousands of models.

One-click Jupyter Notebooks

Amazon SageMaker Studio Notebooks are one-click Jupyter notebooks that can be spun up quickly. The underlying compute resources are fully elastic, so you can easily dial the available resources up or down, and the changes take place automatically in the background without interrupting your work. Notebooks can be shared with a single click, and your colleagues get the exact same notebook, saved in the same place.

Train and tune models

Amazon SageMaker helps you manage the exponential growth of training data, scaling to handle petabytes of data efficiently and cost-effectively.

One-click training

When you’re ready to train in Amazon SageMaker, simply specify the location of your data in Amazon S3, indicate the type and quantity of SageMaker ML instances you need, and get started with a single click. SageMaker sets up a distributed compute cluster, performs the training, outputs the result to Amazon S3, and tears down the cluster when complete. 
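The inputs named above (the S3 data location, the instance type, and the instance count) correspond to fields of the SageMaker CreateTrainingJob API request. The following is a minimal sketch that only assembles such a request. The helper name build_training_job_request, the bucket, the role ARN, the image URI, and the default instance type and volume size are placeholders chosen for illustration, not values from this page.

```python
def build_training_job_request(job_name, image_uri, role_arn, train_s3_uri,
                               output_s3_uri, instance_type="ml.m5.xlarge",
                               instance_count=1, max_runtime_s=3600):
    """Assemble keyword arguments for the SageMaker CreateTrainingJob API.

    Hypothetical helper: it builds the request only and does not call AWS.
    """
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # container holding the training code
            "TrainingInputMode": "File",     # copy the S3 data onto the instances
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,       # location of the training data
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},  # model artifacts land here
        "ResourceConfig": {
            "InstanceType": instance_type,   # type of ML instance
            "InstanceCount": instance_count, # quantity of ML instances
            "VolumeSizeInGB": 50,            # assumed storage size per instance
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_s},
    }


# Placeholder values for illustration only.
request = build_training_job_request(
    job_name="example-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-image:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    instance_count=2,
)

# With AWS credentials configured, the request could be submitted with boto3:
#   import boto3
#   boto3.client("sagemaker").create_training_job(**request)
```

Keeping request construction separate from submission makes the job definition easy to review and version alongside other infrastructure code.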

Managed spot training

Amazon SageMaker provides Managed Spot Training to help you reduce training costs by up to 90%. This capability uses Amazon EC2 Spot Instances, which run on spare AWS compute capacity. Training jobs run automatically when compute capacity becomes available and are made resilient to interruptions caused by changes in capacity, so you can save money when you have flexibility about when your training jobs run.
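In API terms, a training job opts in to Managed Spot Training through the EnableManagedSpotTraining flag, a maximum wait time, and (optionally) a checkpoint location used to resume after interruptions. The sketch below adds those fields to an existing CreateTrainingJob request and estimates savings from the TrainingTimeInSeconds and BillableTimeInSeconds values that DescribeTrainingJob reports. The helper names, the S3 URI, and the sample timings are assumptions made for illustration.

```python
import copy


def with_managed_spot(request, max_wait_s, checkpoint_s3_uri):
    """Return a copy of a CreateTrainingJob request with spot training enabled.

    Hypothetical helper; the input request is left unmodified.
    """
    spot = copy.deepcopy(request)
    # Fall back to one day when the request sets no runtime limit (assumed default).
    max_runtime_s = spot.get("StoppingCondition", {}).get("MaxRuntimeInSeconds", 86400)
    # The wait time covers both waiting for capacity and training, so it
    # must not be shorter than the runtime limit.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= MaxRuntimeInSeconds")
    spot["EnableManagedSpotTraining"] = True
    spot["StoppingCondition"] = {
        "MaxRuntimeInSeconds": max_runtime_s,
        "MaxWaitTimeInSeconds": max_wait_s,
    }
    # Checkpoints let an interrupted job resume instead of starting over.
    spot["CheckpointConfig"] = {"S3Uri": checkpoint_s3_uri}
    return spot


def spot_savings_percent(training_seconds, billable_seconds):
    """Percentage saved, computed from training time versus billable time."""
    return round((1 - billable_seconds / training_seconds) * 100, 1)


# Minimal placeholder request; a real one carries the full job definition.
base_request = {
    "TrainingJobName": "example-spot-job",
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
spot_request = with_managed_spot(
    base_request,
    max_wait_s=7200,
    checkpoint_s3_uri="s3://example-bucket/checkpoints/",
)

# Illustrative timings only: one hour of training billed as 15 minutes.
savings = spot_savings_percent(training_seconds=3600, billable_seconds=900)
```

Actual savings depend on Spot capacity and interruptions, so the computed figure varies from job to job.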

Deploy models to production

Amazon SageMaker has all the tools you need to create workflows that scale and are secure.

Automated workflows

Amazon SageMaker Pipelines helps you create, automate, and manage end-to-end ML workflows at scale using CI/CD practices. Once the workflows are created, they can be visualized and managed in SageMaker Studio. SageMaker Pipelines takes care of all the heavy lifting involved in managing dependencies between the steps of the ML workflow. You can re-run complete workflows at any time with updated data to keep your models accurate, and share workflows with other teams to collaborate on projects.
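To make "managing dependencies between steps" concrete, here is a conceptual sketch that uses only the Python standard library. It is not the SageMaker Pipelines API; the step names and the execution_order helper are hypothetical. It shows the underlying idea: each step declares what it depends on, and a valid execution order is derived from those declarations.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "prepare_data": set(),
    "train": {"prepare_data"},
    "evaluate": {"train"},
    "register_model": {"evaluate"},
}


def execution_order(steps):
    """Return the steps in an order that runs every dependency first.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(steps).static_order())


order = execution_order(workflow)
for position, step in enumerate(order, start=1):
    print(f"{position}. {step}")
```

Re-running the workflow with updated data amounts to walking the same order again, which is why declaring dependencies once is enough.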

Integration with Kubernetes

You can use the fully managed capabilities of Amazon SageMaker for machine learning while continuing to use Kubernetes for orchestration and pipeline management. SageMaker lets you train and deploy models using Kubernetes Operators for SageMaker. In addition, Amazon SageMaker Components for Kubeflow Pipelines let you take advantage of powerful SageMaker features such as data labeling, fully managed large-scale hyperparameter tuning, distributed training jobs, and one-click secure and scalable model deployment, without needing to configure and manage Kubernetes clusters specifically to run the machine learning jobs.

Additional compute for inference

Amazon Elastic Inference allows you to attach just the right amount of GPU-powered inference acceleration to any Amazon SageMaker instance type with no code changes. You can choose the instance type that is best suited to the overall CPU and memory needs of your application, and then separately configure the amount of inference acceleration that you need to use resources efficiently and to reduce the cost of running inference. 

One-click deployment

Amazon SageMaker makes it easy to deploy your trained model into production with a single click so that you can start generating predictions for real-time or batch data. You can deploy your model with one click onto auto-scaling SageMaker ML instances across multiple Availability Zones for high redundancy. SageMaker launches the instances, deploys your model, and sets up the secure HTTPS endpoint for your application.
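Behind the single click, a deployment of this kind can be described by two requests: an endpoint configuration for the SageMaker CreateEndpointConfig API, and a scalable target for Application Auto Scaling. The sketch below builds both. The helper names, the model and endpoint names, and the instance settings are placeholders assumed for illustration.

```python
def build_endpoint_config(config_name, model_name,
                          instance_type="ml.m5.large", instance_count=2):
    """Assemble keyword arguments for the SageMaker CreateEndpointConfig API."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "InstanceType": instance_type,
            # Two or more instances give SageMaker room to spread them
            # across Availability Zones.
            "InitialInstanceCount": instance_count,
            "InitialVariantWeight": 1.0,
        }],
    }


def build_scalable_target(endpoint_name, variant_name, min_instances, max_instances):
    """Assemble keyword arguments for Application Auto Scaling's RegisterScalableTarget."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


# Placeholder names for illustration only.
config = build_endpoint_config("example-config", "example-model")
target = build_scalable_target("example-endpoint", "AllTraffic",
                               min_instances=2, max_instances=6)

# With AWS credentials configured, these could be submitted with boto3:
#   import boto3
#   sm = boto3.client("sagemaker")
#   sm.create_endpoint_config(**config)
#   sm.create_endpoint(EndpointName="example-endpoint",
#                      EndpointConfigName="example-config")
#   boto3.client("application-autoscaling").register_scalable_target(**target)
```

A scaling policy (for example, target tracking on invocations per instance) would still need to be attached to the registered target before the endpoint scales on its own.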

Multi-model endpoints

Amazon SageMaker provides a scalable and cost-effective way to deploy large numbers of custom machine learning models. SageMaker multi-model endpoints enable you to deploy multiple models with a single click on a single endpoint and serve them using a single serving container.
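As a rough sketch of how one container can serve many models: in the CreateModel request, the container is put in MultiModel mode and pointed at an S3 prefix that holds the model archives, and each InvokeEndpoint call names the archive to use through TargetModel. The helper names, S3 prefix, archive names, and payload below are hypothetical.

```python
def build_multi_model_request(model_name, image_uri, role_arn, models_s3_prefix):
    """Assemble keyword arguments for the SageMaker CreateModel API in multi-model mode."""
    if not models_s3_prefix.endswith("/"):
        models_s3_prefix += "/"  # the URL is a prefix holding many archives
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,
            "Mode": "MultiModel",              # one container, many models
            "ModelDataUrl": models_s3_prefix,  # e.g. holds model-a.tar.gz, model-b.tar.gz
        },
    }


def build_invocation(endpoint_name, target_model, payload, content_type="text/csv"):
    """Assemble keyword arguments for the runtime InvokeEndpoint API."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,  # archive path relative to the S3 prefix
        "ContentType": content_type,
        "Body": payload,
    }


# Placeholder values for illustration only.
model_request = build_multi_model_request(
    model_name="example-multi-model",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-serving:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    models_s3_prefix="s3://example-bucket/models",
)
invocation = build_invocation("example-mme-endpoint", "model-a.tar.gz", b"1.0,2.0,3.0")

# With AWS credentials configured, these could be submitted with boto3:
#   import boto3
#   boto3.client("sagemaker").create_model(**model_request)
#   boto3.client("sagemaker-runtime").invoke_endpoint(**invocation)
```

Adding another model then means uploading one more archive under the same prefix; the endpoint itself does not need to be redeployed.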
