Posted On: Nov 28, 2018

Amazon SageMaker now supports deploying Inference Pipelines so you can pass raw input data and execute pre-processing, predictions, and post-processing on real-time and batch inference requests. SageMaker also supports two new machine learning frameworks: scikit-learn and Spark ML. This makes it easy to build and deploy feature pre-processing pipelines with a suite of feature transformers available in the new Spark ML and scikit-learn framework containers in Amazon SageMaker. These new capabilities also enable you to write Spark ML and scikit-learn code once and reuse it for training and inference, which keeps pre-processing steps consistent and makes your machine learning workflows easier to manage.
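As an illustration of the "write once, reuse for training and inference" idea, here is a minimal scikit-learn sketch (the data and model choice are invented for the example): the same Pipeline object applies its feature scaling both when it is fit and when it serves predictions, so pre-processing cannot drift between the two.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Toy training data; in practice this comes from your dataset.
X = [[1.0, 200.0], [2.0, 300.0], [3.0, 400.0], [4.0, 500.0]]
y = [0, 0, 1, 1]

# One Pipeline object holds both the pre-processing step and the model.
# fit() learns the scaler's statistics and the classifier together;
# predict() re-applies the identical scaling before scoring.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
model.fit(X, y)

prediction = model.predict([[3.5, 450.0]])
print(prediction)
```

Serializing this fitted Pipeline is what lets the same pre-processing code run in the scikit-learn container at inference time.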

Typically, a lot of time is spent cleaning and preparing data before training machine learning models, and the same steps must be applied during inference. Previously, the data processing and feature engineering for inference requests had to be executed in the client application before the data was sent to Amazon SageMaker for predictions, or be baked into the inference container. With the new Inference Pipelines, you can bundle and export the pre-processing and post-processing steps used in training and deploy them as part of an Inference Pipeline. An Inference Pipeline can be composed of any machine learning framework, built-in algorithm, or custom container usable on Amazon SageMaker.
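Conceptually, an Inference Pipeline chains containers so that each step's output becomes the next step's input. The framework-free sketch below illustrates that flow; the step functions and their logic are invented stand-ins, not SageMaker APIs.

```python
# Hypothetical stand-ins for the containers in an Inference Pipeline.
def preprocess(record):
    # e.g., parse a raw CSV request body into numeric features
    return [float(v) for v in record.split(",")]

def predict(features):
    # placeholder "model": a fixed linear score over the features
    return sum(features)

def postprocess(score):
    # e.g., map the raw model score to a business label
    return "high" if score > 10 else "low"

def inference_pipeline(raw_input, steps):
    # Each step receives the previous step's output, mirroring how
    # SageMaker passes a request through the pipeline's containers.
    result = raw_input
    for step in steps:
        result = step(result)
    return result

label = inference_pipeline("3.0,4.5,5.0", [preprocess, predict, postprocess])
print(label)  # 3.0 + 4.5 + 5.0 = 12.5 > 10, so "high"
```

In the SageMaker Python SDK, this chaining is expressed by deploying an ordered list of models as a single pipeline behind one endpoint, so the client sends raw data and receives the post-processed result.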

These enhancements are available today in all AWS Regions where Amazon SageMaker is offered. Visit the documentation for additional information.