Posted On: May 1, 2019
The Amazon EKS Deep Learning Benchmark Utility is a new automated tool for machine learning benchmarking on Kubernetes clusters. The tool is built and open sourced by the Amazon Elastic Container Service for Kubernetes (EKS) team.
Kubernetes is open source software that makes it easy to quickly scale machine learning models for training and inference and run them close to your data sources on AWS. Because there are many variables and infrastructure choices involved in running machine learning jobs on Kubernetes, finding the right configuration for your workload requires ongoing benchmarking. Previously, benchmarking machine learning performance on Kubernetes required multiple manual steps for each performance optimization, which added significant time and effort to setting up cost-effective, performant machine learning jobs.
The Amazon EKS Deep Learning Benchmark Utility simplifies benchmarking the performance of your Kubernetes cluster running on AWS for deep learning training and other machine learning workloads. The utility provides an automated end-to-end benchmarking workflow, from cluster creation to cluster tear-down. It supports configurable cluster setups, different backend storage systems, and multiple frameworks, including TensorFlow, Horovod, OpenMPI, PyTorch, and MXNet.
To learn more, visit the project on GitHub.
Read our blog to learn more about optimizing distributed deep learning performance with Amazon EKS.