AWS Machine Learning Infrastructure
Benefits

High Performance
AWS offers the highest performing ML compute infrastructure in the cloud. For training, Amazon EC2 P4 instances deliver up to 2.5x better performance than previous-generation instances, along with the fastest networking in the cloud at up to 400 Gbps. For inference, Amazon EC2 Inf1 instances deliver up to 2.3x higher throughput than comparable current-generation GPU-based instances.

Optimized for Machine Learning
AWS computing instances support major machine learning frameworks such as TensorFlow and PyTorch. They also support models and toolkits such as Hugging Face for a broad range of machine learning use cases. The AWS Deep Learning AMIs and Deep Learning Containers come pre-installed with optimizations for ML frameworks and toolkits to accelerate deep learning in the cloud.

Easy to Use
Amazon SageMaker, a fully managed ML service, is the fastest and easiest way to get started with AWS ML infrastructure. It offers purpose-built tools for every step of the workflow, including data labeling, data preparation, feature engineering, statistical bias detection, AutoML, training, tuning, hosting, explainability, monitoring, and workflows. SageMaker is built on decades of Amazon ML experience.
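To make this concrete, the sketch below shows the kind of configuration you would hand to a SageMaker training job (for example via the SageMaker Python SDK's framework estimators). Every value here is a hypothetical placeholder, and a real job would also need an IAM execution role and an S3 dataset location; this only illustrates the shape of the setup, not a runnable job.

```python
# Sketch of a SageMaker-style training job configuration. All values are
# hypothetical placeholders; a real job also requires an IAM execution role
# and input data in S3.
training_job = {
    "entry_point": "train.py",           # hypothetical training script
    "framework": "pytorch",              # SageMaker supports TensorFlow, PyTorch, etc.
    "instance_type": "ml.p4d.24xlarge",  # GPU training instance
    "instance_count": 1,                 # raise for distributed training
    "hyperparameters": {"epochs": 10, "batch-size": 256},
}

def describe(job):
    """Render a one-line summary of the job configuration."""
    return (f"{job['framework']} job running {job['entry_point']} "
            f"on {job['instance_count']}x {job['instance_type']}")

print(describe(training_job))
```

In the SDK, the same parameters map onto an estimator object whose fit() call launches the managed training job, so you never provision or patch the underlying instances yourself.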

Scale
AWS customers have access to virtually unlimited compute, network, and storage so they can scale. You can scale up or down as needed, from a single GPU to thousands, and from terabytes to petabytes of storage. Using the cloud, you don't need to invest up front in all the infrastructure you might ever need; instead, take advantage of elastic compute, storage, and networking.

Cost Effective
With a broad choice of infrastructure services, you can match infrastructure to your budget. Pick a CPU-, GPU-, or accelerator-based instance and pay only for what you use, so you never pay for idle capacity. Amazon EC2 Inf1 instances powered by AWS Inferentia deliver up to 70% lower cost per inference than comparable current-generation GPU-based instances.
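To see what that pricing claim means for a monthly bill, here is a toy cost-per-inference calculation. The 70% reduction is the figure quoted above; the baseline price and request volume are made-up numbers for illustration only.

```python
# Toy cost-per-inference comparison. The 70% reduction is the figure quoted
# in the text; the baseline price and request volume are hypothetical.
gpu_cost_per_1k_inferences = 0.10   # hypothetical GPU-based baseline, USD
inf1_reduction = 0.70               # "up to 70% lower cost per inference"
requests_per_month = 50_000_000     # hypothetical workload

def monthly_cost(cost_per_1k, requests):
    """Total monthly cost given a price per 1,000 inferences."""
    return cost_per_1k * requests / 1_000

gpu_monthly = monthly_cost(gpu_cost_per_1k_inferences, requests_per_month)
inf1_monthly = gpu_monthly * (1 - inf1_reduction)
print(f"GPU baseline: ${gpu_monthly:,.2f}/month")
print(f"Inf1:         ${inf1_monthly:,.2f}/month")
```

At these assumed numbers the bill drops from $5,000 to $1,500 per month; the point is simply that a per-inference discount compounds linearly with request volume.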

Customers

Amazon Alexa’s AI- and ML-based intelligence is available on more than 100 million devices today. Alexa is always becoming smarter, more conversational, more proactive, and even more delightful. Alexa uses Amazon EC2 Inf1 instances to lower inference latency and cost per inference for Alexa text-to-speech.

Autodesk is advancing cognitive technology with an AI-powered virtual assistant, Autodesk Virtual Agent (AVA). AVA answers over 100,000 customer questions per month, applying natural language understanding (NLU) and deep learning techniques to extract the context, intent, and meaning behind each inquiry. In a pilot of AWS Inferentia, Autodesk achieved 4.9x higher throughput than on GPU-based instances.

Rad AI uses AI to automate radiology workflows and help streamline radiology reporting. With the new Amazon EC2 P4d instances, Rad AI sees faster inference and the ability to train models 2.4x faster and with higher accuracy.