Why Amazon EC2 P5 Instances?
Amazon Elastic Compute Cloud (Amazon EC2) P5 instances, powered by NVIDIA H100 Tensor Core GPUs, and P5e instances, powered by NVIDIA H200 Tensor Core GPUs, deliver the highest performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances and reduce the cost to train ML models by up to 40%. These instances help you iterate on your solutions at a faster pace and get to market more quickly. You can use P5 and P5e instances for training and deploying increasingly complex large language models (LLMs) and diffusion models that power the most demanding generative artificial intelligence (AI) applications, including question answering, code generation, video and image generation, and speech recognition. You can also use these instances to deploy demanding HPC applications at scale for pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.
To deliver these performance improvements and cost savings, P5 and P5e instances complement the NVIDIA H100 and H200 Tensor Core GPUs with 2x higher CPU performance, 2x more system memory, and 4x more local storage compared to previous-generation GPU-based instances. They provide market-leading scale-out capabilities for distributed training and tightly coupled HPC workloads, with up to 3,200 Gbps of networking using second-generation Elastic Fabric Adapter (EFAv2). To deliver large-scale compute at low latency, P5 and P5e instances are deployed in Amazon EC2 UltraClusters that enable scaling up to 20,000 H100 or H200 GPUs, interconnected with a petabit-scale nonblocking network. P5 and P5e instances in EC2 UltraClusters can deliver up to 20 exaflops of aggregate compute capability—performance equivalent to a supercomputer.
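As a back-of-the-envelope sanity check on the figures above, the 20-exaflop aggregate is consistent with 20,000 GPUs at roughly 1 petaflop each (the per-GPU throughput here is an assumption, corresponding approximately to H100 dense FP16/BF16 Tensor Core performance, not a number stated in this page):

```python
# Rough arithmetic sketch relating the UltraCluster figures above.
# Assumption (not from this page): ~1 PFLOPS of usable compute per GPU.
PFLOPS_PER_GPU = 1.0       # assumed per-GPU throughput, in petaflops
GPUS_PER_INSTANCE = 8      # each p5.48xlarge / p5e.48xlarge has 8 GPUs
MAX_GPUS = 20_000          # UltraCluster scale stated above

instances = MAX_GPUS // GPUS_PER_INSTANCE            # instances at full scale
aggregate_exaflops = MAX_GPUS * PFLOPS_PER_GPU / 1000.0

print(instances)           # 2500
print(aggregate_exaflops)  # 20.0
```

So reaching the full 20,000-GPU scale corresponds to about 2,500 instances, and at ~1 PFLOPS per GPU the aggregate lands at the quoted 20 exaflops.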
Amazon EC2 P5 Instances
Benefits
Features
Customer testimonials
Here are some examples of how customers and partners have achieved their business goals with Amazon EC2 P5 instances.
- Anthropic: Anthropic builds reliable, interpretable, and steerable AI systems that will have many opportunities to create value commercially and for public benefit.
- Cohere: Cohere, a leading pioneer in language AI, empowers every developer and enterprise to build incredible products with world-leading natural language processing (NLP) technology while keeping their data private and secure.
- Hugging Face: Hugging Face is on a mission to democratize good ML.
Product details
| Instance Size | vCPUs | Instance Memory (TiB) | GPU | GPU Memory | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer-to-Peer | Instance Storage (TB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| p5.48xlarge | 192 | 2 | 8 x H100 | 640 GB HBM3 | 3,200 (EFA) | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
| p5e.48xlarge | 192 | 2 | 8 x H200 | 1,128 GB HBM3e | 3,200 (EFA) | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
Getting started with ML use cases
Getting started with HPC use cases
P5 instances are an ideal platform to run engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other GPU-based HPC workloads. HPC applications often require high network performance, fast storage, large amounts of memory, high compute capability, or all of the above. P5 instances support EFAv2, which enables HPC applications that use the Message Passing Interface (MPI) to scale to thousands of GPUs. AWS Batch and AWS ParallelCluster help HPC developers quickly build and scale distributed HPC applications.
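To make the ParallelCluster path concrete, below is a minimal sketch of an AWS ParallelCluster 3 configuration that defines a Slurm queue of p5.48xlarge instances with EFA and a placement group enabled. The subnet IDs, queue names, and instance counts are placeholders for illustration only; consult the ParallelCluster documentation for the full schema and required fields for your account.

```yaml
# Hypothetical sketch of a ParallelCluster 3 config for a P5 Slurm queue.
# All IDs and names below are placeholders, not values from this page.
Region: us-east-1
Image:
  Os: alinux2
HeadNode:
  InstanceType: c5.4xlarge
  Networking:
    SubnetId: subnet-0123456789abcdef0   # placeholder subnet ID
Scheduling:
  Scheduler: slurm
  SlurmQueues:
    - Name: p5-queue
      ComputeResources:
        - Name: p5
          InstanceType: p5.48xlarge
          MinCount: 0
          MaxCount: 4                    # illustrative scale-out limit
          Efa:
            Enabled: true                # turn on EFA for MPI traffic
      Networking:
        SubnetIds:
          - subnet-0123456789abcdef0     # placeholder subnet ID
        PlacementGroup:
          Enabled: true                  # keep nodes tightly coupled
```

With a configuration along these lines, `pcluster create-cluster` provisions the head node and an elastic Slurm queue, and MPI jobs submitted through Slurm can take advantage of EFA across the P5 nodes.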
Learn more