Posted On: Mar 23, 2021
Amazon EC2 Inf1 instances deliver up to 30% higher throughput and up to 45% lower cost per inference than Amazon EC2 G4dn instances, which were already the lowest-cost instances in the cloud for machine learning inference. Inf1 instances are ideal for applications such as image recognition, natural language processing, personalization, and anomaly detection. Developers can manage their own machine learning application development platforms by either launching Inf1 instances with AWS Deep Learning AMIs, which include the Neuron SDK, or using Inf1 instances via Amazon Elastic Kubernetes Service (EKS) or Amazon Elastic Container Service (ECS) for containerized ML applications. EKS, ECS, and SageMaker support for Inf1 instances in these new regions will be available soon.
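As a minimal sketch of the self-managed path described above, the parameters for launching an Inf1 instance can be assembled in Python and then passed to boto3's EC2 `run_instances` call. The AMI ID and key-pair name below are placeholders, not real values, and the helper function is an illustrative assumption rather than an AWS API:

```python
# Sketch: building run_instances parameters for an Inf1 instance.
# The AMI ID and key-pair name are placeholders (assumptions, not real values).

def build_inf1_launch_params(ami_id: str, key_name: str,
                             instance_type: str = "inf1.xlarge") -> dict:
    """Return keyword arguments suitable for boto3's ec2.run_instances()."""
    valid_sizes = {"inf1.xlarge", "inf1.2xlarge", "inf1.6xlarge", "inf1.24xlarge"}
    if instance_type not in valid_sizes:
        raise ValueError(f"unknown Inf1 size: {instance_type}")
    return {
        "ImageId": ami_id,              # e.g. a Deep Learning AMI that bundles the Neuron SDK
        "InstanceType": instance_type,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
    }

# In practice the dict would be unpacked into a boto3 EC2 client call,
# e.g. ec2.run_instances(**params), in one of the supported regions.
params = build_inf1_launch_params("ami-0123456789abcdef0", "my-key-pair")
```

The same parameter set applies to any of the four Inf1 sizes; only `InstanceType` changes.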
Amazon EC2 Inf1 instances are available in 4 sizes, providing up to 16 Inferentia chips, 96 vCPUs, 192 GB of memory, 100 Gbps of networking bandwidth, and 19 Gbps of Amazon Elastic Block Store (EBS) bandwidth. These instances are purchasable On-Demand, as Reserved Instances, as Spot Instances, or as part of Savings Plans, and are now available in 21 regions globally, including US East (N. Virginia, Ohio), US West (Oregon, N. California), AWS GovCloud (US-East, US-West), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo), Middle East (Bahrain), and South America (São Paulo).