Artificial Intelligence
Scalable multi-node training with TensorFlow
We’ve heard from customers that scaling TensorFlow training jobs to multiple nodes and GPUs successfully is hard. TensorFlow has distributed training built-in, but it can be difficult to use. Recently, we made optimizations to TensorFlow and Horovod to help AWS customers scale TensorFlow training jobs to multiple nodes and GPUs. With these improvements, any AWS customer […]
