Posted On: Jun 15, 2020
You can now use Amazon Elastic Kubernetes Service (EKS) to run containers on Amazon EC2 Inf1 instances. With EKS and the AWS Neuron Kubernetes device plugin, it's easy to combine multiple Inferentia devices in your cluster to run high-performance, cost-effective inference workloads at scale.
Amazon EC2 Inf1 instances deliver high performance and the lowest cost machine learning inference in the cloud. Inf1 instances feature up to 16 AWS Inferentia chips, high-performance machine learning inference chips designed and built by AWS. Using Inf1 instances, customers can run large-scale machine learning inference applications such as image recognition, speech recognition, natural language processing, personalization, and fraud detection. Once your machine learning model is trained to meet your requirements, you can deploy it using AWS Neuron, a specialized software development kit (SDK) consisting of a compiler, runtime, and profiling tools that optimizes machine learning inference performance on Inferentia chips. Neuron supports popular machine learning frameworks such as TensorFlow, PyTorch, and MXNet.
Amazon EKS has made it easy to run Inferentia-based containers by updating the EKS-Optimized Accelerated AMI with all the necessary AWS Neuron packages. After starting a cluster with worker nodes based on the latest Accelerated AMI, you can install the AWS Neuron Kubernetes device plugin, which advertises Inferentia devices as schedulable resources to each worker node's kubelet. This fine-grained scheduling capability allows EKS customers to achieve better utilization and greater cost savings compared to using standalone EC2 Inf1 instances.
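As a sketch of what this looks like in practice, once the device plugin is installed, a pod can request Inferentia devices like any other extended resource. The resource name `aws.amazon.com/neuron` is the one the Neuron device plugin advertises; the pod name, container image, and device count below are placeholders:

```yaml
# Hypothetical pod spec: requests one Inferentia device through the
# extended resource advertised by the AWS Neuron device plugin.
apiVersion: v1
kind: Pod
metadata:
  name: inference-server          # placeholder name
spec:
  containers:
    - name: inference
      image: my-registry/my-neuron-model:latest   # placeholder image
      resources:
        limits:
          aws.amazon.com/neuron: 1   # number of Inferentia devices requested
```

Because the kubelet only admits the pod onto a node with unallocated Inferentia devices, the scheduler packs inference pods across Inf1 nodes, which is what enables the utilization and cost gains described above.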
EC2 Inf1 instances can be used with all EKS clusters running version 1.14 and above, in regions where Inf1 is available. Today, only self-managed node groups are supported; they can be started using eksctl, CloudFormation, or the AWS CLI. Support for EKS managed node groups will be added in a future release. To get started, visit the Amazon EKS documentation. To learn more about Inf1 instances and Inferentia, check out the Amazon EC2 documentation.
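One way to start a self-managed Inf1 node group is with eksctl. The command below is a minimal sketch, not an exhaustive recipe: the cluster name, region, node group name, instance size, and node count are all placeholders, and your cluster must already exist:

```shell
# Add a self-managed node group of Inf1 instances to an existing
# EKS cluster (all names, region, and sizes are placeholders).
eksctl create nodegroup \
  --cluster my-cluster \
  --region us-west-2 \
  --name inf1-workers \
  --node-type inf1.xlarge \
  --nodes 2
```

Recent eksctl versions select an accelerated AMI for instance types that need one; verify the AMI and node group options against the Amazon EKS documentation for your eksctl version.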