AWS News Blog
Category: AWS Inferentia
Scaling Ad Verification with Machine Learning and AWS Inferentia
Amazon Advertising helps companies build their brand and connect with shoppers through ads shown both within and beyond Amazon’s store, including websites, apps, and streaming TV content in more than 15 countries. Businesses or brands of all sizes including registered sellers, vendors, book vendors, Kindle Direct Publishing (KDP) authors, app developers, and agencies on Amazon […]
Majority of Alexa Now Running on Faster, More Cost-Effective Amazon EC2 Inf1 Instances
Today, we are announcing that the Amazon Alexa team has migrated the vast majority of their GPU-based machine learning inference workloads to Amazon Elastic Compute Cloud (Amazon EC2) Inf1 instances, powered by AWS Inferentia. This resulted in 25% lower end-to-end latency, and 30% lower cost compared to GPU-based instances for Alexa’s text-to-speech workloads. The lower […]
Amazon ECS Now Supports EC2 Inf1 Instances
As machine learning and deep learning models become more sophisticated, hardware acceleration is increasingly required to deliver fast predictions at high throughput. Today, we’re very happy to announce that AWS customers can now use the Amazon EC2 Inf1 instances on Amazon Elastic Container Service (Amazon ECS), for high performance and the lowest prediction cost in […]
Amazon EKS Now Supports EC2 Inf1 Instances
Amazon Elastic Kubernetes Service (EKS) has quickly become a leading choice for machine learning workloads. It combines the developer agility and the scalability of Kubernetes, with the wide selection of Amazon Elastic Compute Cloud (Amazon EC2) instance types available on AWS, such as the C5, P3, and G4 families. As models become more sophisticated, […]