Posted On: Nov 20, 2018
AWS Batch now supports multi-node parallel jobs, which let you run a single job that spans multiple EC2 instances. With multi-node parallel jobs, customers with tightly coupled, distributed computing workloads can take advantage of AWS Batch’s fully managed batch computing capabilities, avoiding the complexity of provisioning, managing, monitoring, and scaling compute clusters while reducing cost and operational overhead.
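As a rough illustration, the sketch below registers a multi-node parallel job definition through the RegisterJobDefinition API using boto3. The job definition name, container image, command, and node sizing are placeholder values chosen for the example, not part of this announcement.

```python
# Minimal sketch (boto3): registering a multi-node parallel job definition.
# All names, the image URI, and the sizing values are hypothetical placeholders.
import boto3

batch = boto3.client("batch")

response = batch.register_job_definition(
    jobDefinitionName="example-mnp-job",   # hypothetical name
    type="multinode",                      # marks this as a multi-node parallel job
    nodeProperties={
        "numNodes": 4,                     # total number of nodes (EC2 instances) for the job
        "mainNode": 0,                     # node index that acts as the main node
        "nodeRangeProperties": [
            {
                "targetNodes": "0:",       # apply this container spec to all nodes
                "container": {
                    "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
                    "vcpus": 8,
                    "memory": 16384,       # MiB per node
                    "command": ["/usr/local/bin/run_workload.sh"],
                },
            }
        ],
    },
)
print(response["jobDefinitionArn"])
```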
With support for multi-node parallel jobs, developers, data scientists, and engineers can now easily and efficiently run workloads such as larger-scale, tightly coupled High Performance Computing (HPC) applications and distributed GPU model training. You can bring your own Docker container with your preferred frameworks and libraries, such as Apache MXNet, TensorFlow, Caffe2, and the Message Passing Interface (MPI). AWS Batch handles job execution and compute resource management, allowing you to focus on analyzing results instead of setting up and managing infrastructure. Computational fluid dynamics, weather forecasting, climate simulation, deep learning model training, and structural stress analysis are examples of workloads that can now run on AWS Batch.
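For completeness, a job created from such a definition can be submitted like any other AWS Batch job; the sketch below uses the hypothetical names from the previous example, and the job queue is likewise a placeholder that would need to point at an EC2-based compute environment.

```python
# Minimal sketch (boto3): submitting the multi-node parallel job defined above.
# Job name, queue, and definition names are hypothetical placeholders.
import boto3

batch = boto3.client("batch")

job = batch.submit_job(
    jobName="example-mnp-run",        # hypothetical run name
    jobQueue="example-queue",         # hypothetical queue backed by an EC2 compute environment
    jobDefinition="example-mnp-job",  # the multi-node parallel job definition registered earlier
)
print(job["jobId"])
```

Inside each running container, AWS Batch exposes environment variables such as AWS_BATCH_JOB_NODE_INDEX, AWS_BATCH_JOB_NUM_NODES, and AWS_BATCH_JOB_MAIN_NODE_INDEX, which an MPI launcher or training script can use to distinguish the main node from worker nodes.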