Posted On: Jun 25, 2021

Amazon EC2 Inf1 instances and AWS Neuron now support the YOLOv5 and ResNeXt deep learning models, as well as the latest open-source Hugging Face Transformers. We have also optimized the Neuron compiler to enhance performance, and you can now achieve out-of-the-box throughput up to 12X higher than comparable GPU-based instances for pre-trained BERT base models. These enhancements enable you to meet your high-performance inference requirements and deploy state-of-the-art deep learning models at low cost.

EC2 Inf1 instances are powered by AWS Inferentia, a custom chip built by AWS to accelerate machine learning inference. These instances deliver the lowest cost for deep learning inference in the cloud. You can train your machine learning models using popular frameworks such as TensorFlow, PyTorch, and MXNet, and deploy them on EC2 Inf1 instances using the Neuron SDK. Because Neuron is integrated with these frameworks, you can deploy your existing models to Inf1 instances with minimal code changes. This gives you the freedom to maintain hardware portability and take advantage of the latest technologies without being tied to a vendor-specific solution.

Inf1 instances have been broadly adopted by customers such as Snap, Autodesk, and Condé Nast, and by Amazon services such as Alexa and Rekognition, and are available in 23 AWS Regions across the globe. Our engineering investments, coupled with our scale and our time-tested ability to manage capacity, allow us to identify cost savings and pass them on to our customers. To help you further scale your deep learning applications in production on Amazon EC2 Inf1 instances, we are announcing a 38% reduction of our On-Demand (OD) prices effective June 1st, 2021. For customers who want to take advantage of Savings Plans or Reserved Instances (RI) to further lower their costs, we are reducing our 1-year Savings Plans and RI prices by 38% and our 3-year Savings Plans and RI prices by 31%. These lower prices also apply to customers who use EC2 Inf1 instances via container orchestration services such as Amazon ECS or Amazon EKS.

For customers who prefer a fully managed machine learning service, we are also reducing the price of ml.Inf1 instances in Amazon SageMaker. Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy machine learning models. Effective June 1st, 2021, Amazon SageMaker customers can take advantage of 38% lower prices on On-Demand instances. Starting today, we are also reducing Amazon SageMaker 1-year Savings Plans prices by up to 38% and 3-year Savings Plans prices by up to 25%. These price reductions further increase the price-to-performance benefits of Inf1 instances for your real-time inference needs. For pricing of ml.Inf1 instances in Amazon SageMaker, please visit the Amazon SageMaker pricing page.

Amazon EC2 Inf1 instances are available in 23 AWS Regions, including US East (N. Virginia, Ohio), US West (Oregon, N. California), AWS GovCloud (US-East, US-West), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Asia Pacific (Hong Kong, Mumbai, Seoul, Singapore, Sydney, Tokyo), Middle East (Bahrain), South America (São Paulo), and China (Beijing, Ningxia). You can use Amazon EC2 Inf1 instances in the Region that best meets your real-time latency requirements for machine learning inference, now with further optimized performance and lower costs.

To learn more, visit the Amazon EC2 Inf1 instance page.