Posted On: Dec 3, 2019
Amazon SageMaker Processing is a new capability of Amazon SageMaker for running pre-processing, post-processing, and model evaluation workloads with a fully managed experience.
Data pre- or post-processing and model evaluation steps are an important part of the typical machine learning (ML) workflow. These tasks usually run on separate infrastructure, which is challenging and expensive to manage and scale across multiple users. Stitching together various tools for this involves considerable heavy lifting, and developers and data scientists end up spending significant time tuning the infrastructure for performance and scale.
Amazon SageMaker Processing lets customers run analytics jobs for data engineering and model evaluation on Amazon SageMaker easily and at scale. Because it sits alongside other critical ML tasks such as training and hosting, SageMaker Processing gives customers the benefits of a fully managed environment with all the security and compliance guarantees built into Amazon SageMaker. Customers have the flexibility of using the built-in data processing containers or bringing their own containers and submitting custom jobs to run on managed infrastructure. Once a job is submitted, Amazon SageMaker launches the compute instances, processes and analyzes the input data, and releases the resources upon completion.
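As an illustration of submitting a custom job, the following is a minimal sketch that assembles a request for the CreateProcessingJob API and passes it to the boto3 SageMaker client. The helper names `build_processing_request` and `submit`, the input and output names, the instance type, volume size, and timeout are assumptions chosen for illustration and do not come from this announcement. The container image, IAM role, and S3 locations must be supplied by the caller.

```python
from typing import Any, Dict


def build_processing_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    input_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.m5.xlarge",  # assumed default, adjust as needed
    instance_count: int = 1,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the CreateProcessingJob API call."""
    return {
        "ProcessingJobName": job_name,
        "RoleArn": role_arn,
        # Built-in or bring-your-own container image that runs the job.
        "AppSpecification": {"ImageUri": image_uri},
        # Data copied from S3 into the container before the job starts.
        "ProcessingInputs": [
            {
                "InputName": "input-1",  # hypothetical name
                "S3Input": {
                    "S3Uri": input_s3_uri,
                    "LocalPath": "/opt/ml/processing/input",
                    "S3DataType": "S3Prefix",
                    "S3InputMode": "File",
                },
            }
        ],
        # Results uploaded from the container back to S3 when the job ends.
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "output-1",  # hypothetical name
                    "S3Output": {
                        "S3Uri": output_s3_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ]
        },
        # Managed compute that SageMaker launches and later releases.
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceCount": instance_count,
                "InstanceType": instance_type,
                "VolumeSizeInGB": 30,  # assumed size
            }
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed limit
    }


def submit(request: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to SageMaker. Requires boto3 and AWS credentials."""
    import boto3  # imported here so building a request needs no AWS setup

    client = boto3.client("sagemaker")
    return client.create_processing_job(**request)
```

Keeping request construction separate from submission makes it possible to inspect or validate the job definition before any compute is launched. Calling `submit(build_processing_request(...))` with real values would start the job.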
Amazon SageMaker Processing is now available in all AWS Regions where Amazon SageMaker is available. Visit the documentation for more information and for sample notebooks. To learn how to use the feature, visit the blog post.