Amazon SageMaker Announces Several Enhancements for Orchestration and Management

Posted on: Nov 28, 2018

Amazon SageMaker now supports new capabilities for better orchestration, experimentation, and collaboration in machine learning (ML) workflows. AWS Step Functions is now integrated with Amazon SageMaker and AWS Glue, making it easier to build, deploy, monitor, and iterate on ML workflows. Using AWS Step Functions, you can automate ML workflows by connecting multiple Amazon SageMaker jobs in minutes, with less code. Amazon SageMaker Search, a new capability that helps you organize, track, and evaluate your ML training experiments, is available in beta starting today. Lastly, you can now associate GitHub, AWS CodeCommit, and any self-hosted Git repository with Amazon SageMaker notebook instances to collaborate easily and securely and keep Jupyter notebooks under version control. Visit the AWS Step Functions documentation for more details.

Typically, automating ML workflows involves writing and maintaining code to define the workflow logic, monitor the completion of each job, and handle any errors. ML models and the large datasets behind them need to be managed before the models are deployed into production environments. Re-deployment is required every time a model changes, and multiple teams are needed to ensure that the model performs as expected. This entire process is complex and can slow down the delivery of applications. With the integration of AWS Step Functions and Amazon SageMaker, you can automate publishing large, diverse datasets into an Amazon S3 data lake, training ML models, and deploying those models into production. AWS Step Functions can sequence jobs, run them in parallel, and automatically retry any failed jobs. The integration includes built-in error handling, parameter passing, and state management. This accelerates the delivery of secure, resilient ML applications while reducing the amount of code you have to write and maintain.
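
As a rough sketch of what this integration looks like in practice, the Python snippet below uses boto3 to register a minimal Step Functions state machine whose single state starts a SageMaker training job through the built-in service integration and retries on failure. The state machine name, IAM role ARNs, container image, and S3 paths are placeholders for illustration, not values from this announcement.

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Minimal Amazon States Language definition: one state that starts a SageMaker
# training job via the built-in service integration (".sync" waits for the job
# to finish) and retries if SageMaker raises a service exception.
definition = {
    "StartAt": "TrainModel",
    "States": {
        "TrainModel": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                "TrainingJobName.$": "$.TrainingJobName",
                "RoleArn": "arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
                "AlgorithmSpecification": {
                    "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:latest",  # placeholder
                    "TrainingInputMode": "File",
                },
                "InputDataConfig": [{
                    "ChannelName": "train",
                    "DataSource": {"S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": "s3://my-data-lake/train/",  # placeholder
                        "S3DataDistributionType": "FullyReplicated",
                    }},
                }],
                "OutputDataConfig": {"S3OutputPath": "s3://my-data-lake/models/"},  # placeholder
                "ResourceConfig": {
                    "InstanceType": "ml.m5.xlarge",
                    "InstanceCount": 1,
                    "VolumeSizeInGB": 30,
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
            },
            # Automatically retry failed training jobs with exponential backoff.
            "Retry": [{
                "ErrorEquals": ["SageMaker.AmazonSageMakerException"],
                "IntervalSeconds": 30,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }],
            "End": True,
        }
    },
}

sfn.create_state_machine(
    name="ml-training-workflow",  # placeholder name
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::111122223333:role/StepFunctionsSageMakerRole",  # placeholder
)
```

In a fuller workflow you would chain additional states after TrainModel (for example, creating a model and an endpoint), with Step Functions handling the sequencing, parameter passing, and error handling between them.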

Developing a successful ML model requires continuous experimentation: trying new algorithms and hyperparameters while observing the impact on performance and accuracy. This makes it difficult to track the unique combination of datasets, algorithms, and parameters that produced the winning model. You can now organize, track, and evaluate your model training experiments with Amazon SageMaker Search. SageMaker Search helps you quickly find and evaluate the most relevant training runs out of potentially thousands, right from the AWS Management Console and through the AWS SDK for Amazon SageMaker. Search is available in beta in the 13 AWS Regions where Amazon SageMaker is currently available. See the blog post for more information.
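
As a hedged sketch of the SDK side, the snippet below calls the SageMaker Search API through boto3 to find completed training jobs whose objective metric exceeds a threshold, sorted best-first. The project tag, metric name, and threshold are illustrative assumptions; substitute whatever your training jobs actually emit.

```python
import boto3

sm = boto3.client("sagemaker")

# Find completed training jobs tagged for a given project (hypothetical tag)
# whose final validation accuracy exceeds 0.90, best runs first. The metric
# name "validation:accuracy" is an example; use the metric your algorithm emits.
response = sm.search(
    Resource="TrainingJob",
    SearchExpression={
        "Filters": [
            {"Name": "TrainingJobStatus", "Operator": "Equals", "Value": "Completed"},
            {"Name": "Tags.Project", "Operator": "Equals", "Value": "churn-model"},
            {"Name": "Metrics.validation:accuracy", "Operator": "GreaterThan", "Value": "0.90"},
        ]
    },
    SortBy="Metrics.validation:accuracy",
    SortOrder="Descending",
    MaxResults=10,
)

# Each result carries the full training job description, including the
# hyperparameters and input data that produced it.
for result in response["Results"]:
    job = result["TrainingJob"]
    print(job["TrainingJobName"], job["HyperParameters"])
```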

Machine learning work often requires sharing ideas and tasks and collaborating with others to make progress. Version control has been the de facto standard for collaboration in traditional software development, and it plays an important role in machine learning as well. You can now associate GitHub, AWS CodeCommit, and any self-hosted Git repository with Amazon SageMaker notebook instances to collaborate easily and securely and keep Jupyter notebooks under version control. Using Git repositories with Jupyter notebooks makes it easy to co-author projects, track code changes, and combine software engineering and data science practices for production-ready code management. You can also easily discover, run, and share machine learning and deep learning techniques that are provided as Jupyter notebooks hosted on GitHub. See the blog post for more information.
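
To make the Git association concrete, here is a small boto3 sketch that registers a CodeCommit repository with SageMaker and attaches it as the default repository of a new notebook instance, along with a public GitHub repository as an additional one. The repository URLs, instance name, and IAM role are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

# Register a private Git repository with SageMaker (here an AWS CodeCommit repo;
# for self-hosted repos that need username/password, supply a Secrets Manager
# secret via SecretArn).
sm.create_code_repository(
    CodeRepositoryName="my-team-repo",  # placeholder
    GitConfig={
        "RepositoryUrl": "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/my-team-repo",  # placeholder
        "Branch": "master",
        # "SecretArn": "arn:aws:secretsmanager:...",  # only needed for credential-based repos
    },
)

# Create a notebook instance that clones the registered repo into its working
# directory, plus a public GitHub repository as an additional repository.
sm.create_notebook_instance(
    NotebookInstanceName="collab-notebook",  # placeholder
    InstanceType="ml.t2.medium",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # placeholder
    DefaultCodeRepository="my-team-repo",
    AdditionalCodeRepositories=["https://github.com/aws/amazon-sagemaker-examples.git"],
)
```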
