Accelerated computing Amazon EC2 instance types
Boost performance with hardware accelerators
What are accelerated computing EC2 instance types?
Accelerated computing instances use hardware accelerators, or co-processors, to perform functions more efficiently. For example, they can perform floating point number calculations, graphics processing, or data pattern matching.
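The per-family specifications listed on this page can also be retrieved programmatically. Below is a minimal sketch using the AWS SDK for Python (boto3) and the DescribeInstanceTypes API; the region and instance type are arbitrary examples, not recommendations:

```python
import boto3

# Query EC2 for the hardware specifications of an accelerated instance type.
ec2 = boto3.client("ec2", region_name="us-east-1")  # illustrative region
resp = ec2.describe_instance_types(InstanceTypes=["g6e.xlarge"])

info = resp["InstanceTypes"][0]
gpu = info["GpuInfo"]["Gpus"][0]
print(f"vCPUs:      {info['VCpuInfo']['DefaultVCpus']}")
print(f"Memory:     {info['MemoryInfo']['SizeInMiB']} MiB")
print(f"GPU:        {gpu['Count']} x {gpu['Manufacturer']} {gpu['Name']}")
print(f"GPU memory: {info['GpuInfo']['TotalGpuMemoryInMiB']} MiB")
print(f"Network:    {info['NetworkInfo']['NetworkPerformance']}")
```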

Instance categories
Choose from a range of EC2 instance types, each offering unique combinations of compute, memory, and storage to power your specific workload needs.
Explore instance types
P6e - Instance
| Instance Type | GPUs | vCPUs | Instance Memory (GiB) | GPU Memory (GB) | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer (GB/s) | Instance Storage (TB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| P6e-gb200.36xlarge* | 4 | 144 | 960 | 740 | 1600 | Yes | 1800 | 22.5 | 60 |
*Single instance specifications are provided for information only. P6e-GB200 instances are only available in UltraServers, ranging in size from 36 to 72 GPUs.
Amazon EC2 P6e-GB200 UltraServers, accelerated by NVIDIA GB200 NVL72, offer the highest GPU AI training and inference performance in Amazon Elastic Compute Cloud (Amazon EC2).
Features:
- Grace Blackwell Superchips, powered by Arm-based Grace CPUs and up to 72 Blackwell GPUs within one NVLink domain, deliver up to 360 petaflops of FP8 compute (without sparsity)
- Up to 13.4 TB of high-bandwidth HBM3e GPU memory
- Up to 28.8 terabits per second of network bandwidth with support for Elastic Fabric Adapter (EFAv4) and NVIDIA GPUDirect Remote Direct Memory Access (RDMA)
- 1800 GB/s peer-to-peer GPU communication with NVIDIA NVSwitch
Use Cases
- P6e-GB200 UltraServers accelerate both the training and inference of frontier models, including mixture-of-experts models and reasoning models, at the trillion-parameter scale.
- Agentic and generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more.
P6e - UltraServers
| Instance Type | GPUs | vCPUs | Instance Memory (GiB) | GPU Memory (GB) | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer (GB/s) | Instance Storage (TB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| u-p6e-gb200x36 | 36 | 1296 | 8640 | 6660 | 14400 | Yes | 1800 | 202.5 | 540 |
| u-p6e-gb200x72 | 72 | 2592 | 17280 | 13320 | 28800 | Yes | 1800 | 405 | 1080 |
P6
| Instance | GPUs | vCPUs | Instance Memory (TiB) | GPU Memory (GB) | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer (GB/s) | Instance Storage (TB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| P6-b200.48xlarge | 8 | 192 | 2 | 1432 | 8 x 400 | Yes | 1800 | 8 x 3.84 | 100 |
Amazon EC2 P6-B200 instances, accelerated by NVIDIA Blackwell GPUs, offer up to 2x performance compared to P5en instances for AI training and inference.
Features:
- 5th Generation Intel Xeon Scalable processors (Emerald Rapids)
- 8 NVIDIA Blackwell GPUs
- Up to 1440 GB of HBM3e GPU memory
- Up to 3.2 terabits per second of network bandwidth with support for Elastic Fabric Adapter (EFAv4) and NVIDIA GPUDirect Remote Direct Memory Access (RDMA)
- 1800 GB/s peer-to-peer GPU communication with NVIDIA NVSwitch
Use Cases
- P6-B200 instances are a cost-effective option for training and deploying medium-to-large frontier foundation models, such as mixture-of-experts and reasoning models, with high performance.
- Agentic and generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more
- HPC applications at scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling
P5
| Instance | GPUs | vCPUs | Instance Memory (TiB) | GPU Memory | Network Bandwidth | GPUDirect RDMA | GPU Peer to Peer | Instance Storage (TB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| p5.4xlarge | 1 H100 | 16 | 256 GiB | 80 GB HBM3 | 100 Gbps EFA | No* | N/A* | 3.84 NVMe SSD | 10 |
| p5.48xlarge | 8 H100 | 192 | 2 | 640 GB HBM3 | 3200 Gbps EFAv2 | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
| p5e.48xlarge | 8 H200 | 192 | 2 | 1128 GB HBM3 | 3200 Gbps EFAv2 | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 80 |
| p5en.48xlarge | 8 H200 | 192 | 2 | 1128 GB HBM3 | 3200 Gbps EFAv3 | Yes | 900 GB/s NVSwitch | 8 x 3.84 NVMe SSD | 100 |
*GPUDirect RDMA is not supported on p5.4xlarge.
Amazon EC2 P5 instances are GPU-based instances that deliver the highest performance in Amazon EC2 for deep learning and high performance computing (HPC).
Features:
- Intel Sapphire Rapids CPU and PCIe Gen5 between the CPU and GPU in P5en instances; 3rd Gen AMD EPYC processors (AMD EPYC 7R13) and PCIe Gen4 between the CPU and GPU in P5 and P5e instances.
- Up to 8 NVIDIA H100 (in P5) or H200 (in P5e and P5en) Tensor Core GPUs
- Up to 3,200 Gbps network bandwidth with support for Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect RDMA (remote direct memory access)
- 900 GB/s peer-to-peer GPU communication with NVIDIA NVSwitch
Use Cases
Generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more.
HPC applications at scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling.
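Before provisioning P5 capacity, you can confirm EFA support and the maximum number of EFA interfaces for a given instance type through the same DescribeInstanceTypes API. A minimal boto3 sketch (the region is an arbitrary example):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # illustrative region
resp = ec2.describe_instance_types(InstanceTypes=["p5.48xlarge"])

net = resp["InstanceTypes"][0]["NetworkInfo"]
print("EFA supported:", net["EfaSupported"])
if net["EfaSupported"]:
    print("Max EFA interfaces:", net["EfaInfo"]["MaximumEfaInterfaces"])
print("Network performance:", net["NetworkPerformance"])
```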
P4
| Instance | GPUs | vCPUs | Instance Memory (GiB) | GPU Memory | Network Bandwidth (Gbps) | GPUDirect RDMA | GPU Peer to Peer | Instance Storage (GB) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|---|
| p4d.24xlarge | 8 | 96 | 1152 | 320 GB HBM2 | 400 ENA and EFA | Yes | 600 GB/s NVSwitch | 8 x 1000 NVMe SSD | 19 |
| p4de.24xlarge | 8 | 96 | 1152 | 640 GB HBM2e | 400 ENA and EFA | Yes | 600 GB/s NVSwitch | 8 x 1000 NVMe SSD | 19 |
Amazon EC2 P4 instances provide high performance for machine learning training and high performance computing in the cloud.
Features:
- 3.0 GHz 2nd Generation Intel Xeon Scalable processors (Cascade Lake P-8275CL)
- Up to 8 NVIDIA A100 Tensor Core GPUs
- 400 Gbps instance networking with support for Elastic Fabric Adapter (EFA) and NVIDIA GPUDirect RDMA (remote direct memory access)
- 600 GB/s peer-to-peer GPU communication with NVIDIA NVSwitch
- Deployed in Amazon EC2 UltraClusters consisting of more than 4,000 NVIDIA A100 Tensor Core GPUs, petabit-scale networking, and scalable low-latency storage with Amazon FSx for Lustre
P4d instances have the following specs:
- 3.0 GHz 2nd Generation Intel Xeon Scalable processors
- Intel AVX, Intel AVX2, Intel AVX-512, and Intel Turbo
- EBS Optimized
- Enhanced Networking†
- Elastic Fabric Adapter (EFA)
Use Cases
Machine learning, high performance computing, computational fluid dynamics, computational finance, seismic analysis, speech recognition, autonomous vehicles, and drug discovery.
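When running NCCL-based distributed training on P4 instances, EFA and GPUDirect RDMA are typically enabled through libfabric environment variables. The sketch below reflects the variable names in AWS's EFA documentation, but treat the exact set as an assumption to verify against your AMI, NCCL, and libfabric versions:

```python
import os

# Assumed settings for NCCL over EFA on p4d (verify against the AWS EFA
# documentation for your software versions before relying on them).
os.environ["FI_PROVIDER"] = "efa"           # select the EFA libfabric provider
os.environ["FI_EFA_USE_DEVICE_RDMA"] = "1"  # enable GPUDirect RDMA on p4d
# ...then initialize torch.distributed (NCCL backend) or your MPI job as usual.
```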
G6e
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L40S Tensor Core GPU | GPU Memory (GB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| g6e.xlarge | 4 | 32 | 1 | 48 | Up to 20 | Up to 5 |
| g6e.2xlarge | 8 | 64 | 1 | 48 | Up to 20 | Up to 5 |
| g6e.4xlarge | 16 | 128 | 1 | 48 | 20 | 8 |
| g6e.8xlarge | 32 | 256 | 1 | 48 | 25 | 16 |
| g6e.16xlarge | 64 | 512 | 1 | 48 | 35 | 20 |
| g6e.12xlarge | 48 | 384 | 4 | 192 | 100 | 20 |
| g6e.24xlarge | 96 | 768 | 4 | 192 | 200 | 30 |
| g6e.48xlarge | 192 | 1536 | 8 | 384 | 400 | 60 |
Amazon EC2 G6e instances are designed to accelerate deep learning inference and spatial computing workloads.
Features:
- 3rd generation AMD EPYC processors (AMD EPYC 7R13)
- Up to 8 NVIDIA L40S Tensor Core GPUs
- Up to 400 Gbps of network bandwidth
- Up to 7.6 TB of local NVMe storage
Use Cases
Inference workloads for large language models and diffusion models for image, audio, and video generation; single-node training of moderately complex generative AI models; 3D simulations, digital twins, and industrial digitization.
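Once a G6e instance is running with the NVIDIA driver installed, a quick PyTorch check confirms that the expected L40S GPUs are visible; a minimal sketch:

```python
import torch

# Enumerate the CUDA devices visible on the instance.
assert torch.cuda.is_available(), "NVIDIA driver / CUDA runtime not found"
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 2**30:.0f} GiB")
```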
G6 - Fractional-GPU Gr6 instances with 1:8 vCPU:RAM ratio
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L4 Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| gr6f.4xlarge | 16 | 128 | 1/2 | 12 | Up to 25 | 8 |
G6 - Fractional-GPU G6 instances
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L4 Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| g6f.large | 2 | 8 | 1/8 | 3 | Up to 10 | Up to 5 |
| g6f.xlarge | 4 | 16 | 1/8 | 3 | Up to 10 | Up to 5 |
| g6f.2xlarge | 8 | 32 | 1/4 | 6 | Up to 10 | Up to 5 |
| g6f.4xlarge | 16 | 64 | 1/2 | 12 | Up to 25 | 8 |
G6 - Single-GPU G6 instances
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L4 Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| g6.xlarge | 4 | 16 | 1 | 24 | Up to 10 | Up to 5 |
| g6.2xlarge | 8 | 32 | 1 | 24 | Up to 10 | Up to 5 |
| g6.4xlarge | 16 | 64 | 1 | 24 | Up to 25 | 8 |
| g6.8xlarge | 32 | 128 | 1 | 24 | 25 | 16 |
| g6.16xlarge | 64 | 256 | 1 | 24 | 25 | 20 |
G6 - Single-GPU Gr6 instances with 1:8 vCPU:RAM ratio
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L4 Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| gr6.4xlarge | 16 | 128 | 1 | 24 | Up to 25 | 8 |
| gr6.8xlarge | 32 | 256 | 1 | 24 | 25 | 16 |
G6 - Multi-GPU G6 instances
| Instance Name | vCPUs | Memory (GiB) | NVIDIA L4 Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| g6.12xlarge | 48 | 192 | 4 | 96 | 40 | 20 |
| g6.24xlarge | 96 | 384 | 4 | 96 | 50 | 30 |
| g6.48xlarge | 192 | 768 | 8 | 192 | 100 | 60 |
Amazon EC2 G6 instances are designed to accelerate graphics-intensive applications and machine learning inference.
Features:
- 3rd generation AMD EPYC processors (AMD EPYC 7R13)
- Up to 8 NVIDIA L4 Tensor Core GPUs
- Up to 100 Gbps of network bandwidth
- Up to 7.52 TB of local NVMe storage
Use Cases
Deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization as well as graphics workloads, such as creating and rendering real-time, cinematic-quality graphics and game streaming.
G5g
| Instance Name | vCPUs | Memory (GiB) | NVIDIA T4G Tensor Core GPU | GPU Memory (GiB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|
| g5g.xlarge | 4 | 8 | 1 | 16 | Up to 10 | Up to 3.5 |
| g5g.2xlarge | 8 | 16 | 1 | 16 | Up to 10 | Up to 3.5 |
| g5g.4xlarge | 16 | 32 | 1 | 16 | Up to 10 | Up to 3.5 |
| g5g.8xlarge | 32 | 64 | 1 | 16 | 12 | 9 |
| g5g.16xlarge | 64 | 128 | 2 | 32 | 25 | 19 |
| g5g.metal | 64 | 128 | 2 | 32 | 25 | 19 |
Amazon EC2 G5g instances are powered by AWS Graviton2 processors and feature NVIDIA T4G Tensor Core GPUs to provide the best price performance in Amazon EC2 for graphics workloads such as Android game streaming. They are the first Arm-based instances in a major cloud to feature GPU acceleration. Customers can also use G5g instances for cost-effective ML inference.
Features:
- Custom built AWS Graviton2 Processor with 64-bit Arm Neoverse cores
- Up to 2 NVIDIA T4G Tensor Core GPUs
- Up to 25 Gbps of networking bandwidth
- EBS-optimized by default
- Powered by the AWS Nitro System, a combination of dedicated hardware and lightweight hypervisor
Use Cases
Android game streaming, machine learning inference, graphics rendering, autonomous vehicle simulations
G5
| Instance Size | GPU | GPU Memory (GiB) | vCPUs | Memory (GiB) | Instance Storage (GB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g5.xlarge | 1 | 24 | 4 | 16 | 1 x 250 NVMe SSD | Up to 10 | Up to 3.5 |
| g5.2xlarge | 1 | 24 | 8 | 32 | 1 x 450 NVMe SSD | Up to 10 | Up to 3.5 |
| g5.4xlarge | 1 | 24 | 16 | 64 | 1 x 600 NVMe SSD | Up to 25 | 8 |
| g5.8xlarge | 1 | 24 | 32 | 128 | 1 x 900 NVMe SSD | 25 | 16 |
| g5.16xlarge | 1 | 24 | 64 | 256 | 1 x 1900 NVMe SSD | 25 | 16 |
| g5.12xlarge | 4 | 96 | 48 | 192 | 1 x 3800 NVMe SSD | 40 | 16 |
| g5.24xlarge | 4 | 96 | 96 | 384 | 1 x 3800 NVMe SSD | 50 | 19 |
| g5.48xlarge | 8 | 192 | 192 | 768 | 2 x 3800 NVMe SSD | 100 | 19 |
Amazon EC2 G5 instances are designed to accelerate graphics-intensive applications and machine learning inference. They can also be used to train simple to moderately complex machine learning models.
Features:
- 2nd generation AMD EPYC processors (AMD EPYC 7R32)
- Up to 8 NVIDIA A10G Tensor Core GPUs
- Up to 100 Gbps of network bandwidth
- Up to 7.6 TB of local NVMe storage
G5 instances have the following specs:
- 2nd Generation AMD EPYC processors
- EBS Optimized
- Enhanced Networking†
Use Cases
Graphics-intensive applications such as remote workstations, video rendering, and cloud gaming to produce high fidelity graphics in real time. Training and inference deep learning models for machine learning use cases such as natural language processing, computer vision, and recommender engine use cases.
G4dn
| Instance | GPUs | vCPU | Memory (GiB) | GPU Memory (GiB) | Instance Storage (GB) | Network Performance (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g4dn.xlarge | 1 | 4 | 16 | 16 | 1 x 125 NVMe SSD | Up to 25 | Up to 3.5 |
| g4dn.2xlarge | 1 | 8 | 32 | 16 | 1 x 225 NVMe SSD | Up to 25 | Up to 3.5 |
| g4dn.4xlarge | 1 | 16 | 64 | 16 | 1 x 225 NVMe SSD | Up to 25 | 4.75 |
| g4dn.8xlarge | 1 | 32 | 128 | 16 | 1 x 900 NVMe SSD | 50 | 9.5 |
| g4dn.16xlarge | 1 | 64 | 256 | 16 | 1 x 900 NVMe SSD | 50 | 9.5 |
| g4dn.12xlarge | 4 | 48 | 192 | 64 | 1 x 900 NVMe SSD | 50 | 9.5 |
| g4dn.metal | 8 | 96 | 384 | 128 | 2 x 900 NVMe SSD | 100 | 19 |
Amazon EC2 G4dn instances are designed to help accelerate machine learning inference and graphics-intensive workloads.
Features:
- 2nd Generation Intel Xeon Scalable Processors (Cascade Lake P-8259CL)
- Up to 8 NVIDIA T4 Tensor Core GPUs
- Up to 100 Gbps of networking throughput
- Up to 1.8 TB of local NVMe storage
All instances have the following specs:
- 2.5 GHz Cascade Lake 24C processors
- Intel AVX, Intel AVX2, Intel AVX-512, and Intel Turbo
- EBS Optimized
- Enhanced Networking†
Use Cases
Machine learning inference for applications like adding metadata to an image, object detection, recommender systems, automated speech recognition, and language translation. G4 instances also provide a very cost-effective platform for building and running graphics-intensive applications, such as remote graphics workstations, video transcoding, photo-realistic design, and game streaming in the cloud.
G4ad
| Instance | GPUs | vCPU | Memory (GiB) | GPU Memory (GiB) | Instance Storage (GB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g4ad.xlarge | 1 | 4 | 16 | 8 | 1 x 150 NVMe SSD | Up to 10 | Up to 3 |
| g4ad.2xlarge | 1 | 8 | 32 | 8 | 1 x 300 NVMe SSD | Up to 10 | Up to 3 |
| g4ad.4xlarge | 1 | 16 | 64 | 8 | 1 x 600 NVMe SSD | Up to 10 | Up to 3 |
| g4ad.8xlarge | 2 | 32 | 128 | 16 | 1 x 1200 NVMe SSD | 15 | 3 |
| g4ad.16xlarge | 4 | 64 | 256 | 32 | 1 x 2400 NVMe SSD | 25 | 6 |
Amazon EC2 G4ad instances provide the best price performance for graphics intensive applications in the cloud.
Features:
- 2nd Generation AMD EPYC Processors (AMD EPYC 7R32)
- AMD Radeon Pro V520 GPUs
- Up to 2.4 TB of local NVMe storage
All instances have the following specs:
- Second generation AMD EPYC processors
- EBS Optimized
- Enhanced Networking†
Use Cases
Graphics-intensive applications, such as remote graphics workstations, video transcoding, photo-realistic design, and game streaming in the cloud.
Trn2
| Instance Size | Available in EC2 UltraServers | Trainium2 Chips | Accelerator Memory (TB) | vCPUs | Memory (TB) | Instance Storage (TB) | Network Bandwidth (Tbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|
| trn2.48xlarge | No | 16 | 1.5 | 192 | 2 | 4 x 1.92 NVMe SSD | 3.2 | 80 |
| trn2u.48xlarge | Yes (Preview) | 16 | 1.5 | 192 | 2 | 4 x 1.92 NVMe SSD | 3.2 | 80 |
Amazon EC2 Trn2 instances, powered by AWS Trainium2 chips, are purpose built for high-performance generative AI training and inference of models with hundreds of billions to trillion+ parameters.
Features:
- 16 AWS Trainium2 chips
- Supported by AWS Neuron SDK
- 4th Generation Intel Xeon Scalable processor (Sapphire Rapids 8488C)
- Up to 12.8 Tbps third-generation Elastic Fabric Adapter (EFA) networking bandwidth
- Up to 8 TB local NVMe storage
- High-bandwidth, intra-instance, and inter-instance connectivity with NeuronLink
- Deployed in Amazon EC2 UltraClusters and available in EC2 UltraServers (available in preview)
- Amazon EBS-optimized
- Enhanced networking
Use Cases
Training and inference of the most demanding foundation models including large language models (LLMs), multi-modal models, diffusion transformers and more to build a broad set of next-generation generative AI applications.
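With the Neuron SDK installed, PyTorch reaches Trainium through torch-neuronx, which exposes NeuronCores as XLA devices. A minimal device-placement sketch; the model and tensors below are placeholders, not a training recipe:

```python
import torch
import torch_xla.core.xla_model as xm  # shipped with the AWS Neuron torch-neuronx stack

device = xm.xla_device()                        # a NeuronCore exposed as an XLA device
model = torch.nn.Linear(1024, 1024).to(device)  # placeholder model
x = torch.randn(8, 1024).to(device)             # placeholder batch

loss = model(x).sum()
loss.backward()
xm.mark_step()  # compile and execute the accumulated XLA graph on Trainium
```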
Trn1
| Instance Size | Trainium Chips | Accelerator Memory (GB) | vCPUs | Memory (GiB) | Instance Storage (GB) | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| trn1.2xlarge | 1 | 32 | 8 | 32 | 1 x 500 NVMe SSD | Up to 12.5 | Up to 20 |
| trn1.32xlarge | 16 | 512 | 128 | 512 | 4 x 2000 NVMe SSD | 800 | 80 |
| trn1n.32xlarge | 16 | 512 | 128 | 512 | 4 x 2000 NVMe SSD | 1600 | 80 |
Amazon EC2 Trn1 instances, powered by AWS Trainium chips, are purpose built for high-performance deep learning training while offering up to 50% cost-to-train savings over comparable Amazon EC2 instances.
Features:
- 16 AWS Trainium chips
- Supported by AWS Neuron SDK
- 3rd Generation Intel Xeon Scalable processor (Ice Lake SP)
- Up to 1600 Gbps second-generation Elastic Fabric Adapter (EFA) networking bandwidth
- Up to 8 TB local NVMe storage
- High-bandwidth, intra-instance connectivity with NeuronLink
- Deployed in EC2 UltraClusters that enable scaling up to 30,000 AWS Trainium accelerators, connected with a petabit-scale nonblocking network, and scalable low-latency storage with Amazon FSx for Lustre
- Amazon EBS-optimized
- Enhanced networking
Use Cases
Deep learning training for natural language processing (NLP), computer vision, search, recommendation, ranking, and more
Inf2
| Instance Size | Inferentia2 Chips | Accelerator Memory (GB) | vCPU | Memory (GiB) | Local Storage | Inter-accelerator Interconnect | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|
| inf2.xlarge | 1 | 32 | 4 | 16 | EBS Only | NA | Up to 15 | Up to 10 |
| inf2.8xlarge | 1 | 32 | 32 | 128 | EBS Only | NA | Up to 25 | 10 |
| inf2.24xlarge | 6 | 192 | 96 | 384 | EBS Only | Yes | 50 | 30 |
| inf2.48xlarge | 12 | 384 | 192 | 768 | EBS Only | Yes | 100 | 60 |
Amazon EC2 Inf2 instances are purpose built for deep learning inference. They deliver high performance at the lowest cost in Amazon EC2 for generative artificial intelligence models, including large language models and vision transformers. Inf2 instances are powered by AWS Inferentia2. They offer 3x higher compute performance, 4x higher accelerator memory, up to 4x higher throughput, and up to 10x lower latency compared to Inf1 instances.
Features:
- Up to 12 AWS Inferentia2 chips
- Supported by AWS Neuron SDK
- Dual AMD EPYC processors (AMD EPYC 7R13)
- Up to 384 GB of shared accelerator memory (32 GB HBM per accelerator)
- Up to 100 Gbps networking
Use Cases
Natural language understanding (advanced text analytics, document analysis, conversational agents), translation, image and video generation, speech recognition, personalization, fraud detection, and more.
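On Inf2, models are compiled ahead of time with the Neuron SDK's PyTorch tracing API. A minimal sketch; the model and input shape below are placeholders:

```python
import torch
import torch_neuronx  # part of the AWS Neuron SDK

# Placeholder model and example input; real workloads trace their own model.
model = torch.nn.Sequential(torch.nn.Linear(128, 64), torch.nn.ReLU()).eval()
example = torch.randn(1, 128)

# Compile for Inferentia2 NeuronCores and save the compiled artifact.
neuron_model = torch_neuronx.trace(model, example)
torch.jit.save(neuron_model, "model_neuron.pt")
print(neuron_model(example).shape)  # runs on the NeuronCore
```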
Inf1
| Instance Size | Inferentia Chips | vCPUs | Memory (GiB) | Instance Storage | Inter-accelerator Interconnect | Network Bandwidth (Gbps)*** | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| inf1.xlarge | 1 | 4 | 8 | EBS only | N/A | Up to 25 | Up to 4.75 |
| inf1.2xlarge | 1 | 8 | 16 | EBS only | N/A | Up to 25 | Up to 4.75 |
| inf1.6xlarge | 4 | 24 | 48 | EBS only | Yes | 25 | 4.75 |
| inf1.24xlarge | 16 | 96 | 192 | EBS only | Yes | 100 | 19 |
Amazon EC2 Inf1 instances are built from the ground up to support machine learning inference applications.
Features:
- Up to 16 AWS Inferentia Chips
- Supported by AWS Neuron SDK
- High frequency 2nd Generation Intel Xeon Scalable processors (Cascade Lake P-8259L)
- Up to 100 Gbps networking
Use Cases
Recommendation engines, forecasting, image and video analysis, advanced text analytics, document analysis, voice, conversational agents, translation, transcription, and fraud detection.
DL1
| Instance Size | vCPU | Gaudi Accelerators | Instance Memory (GiB) | Instance Storage (GB) | Accelerator Peer-to-Peer Bidirectional (Gbps) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| dl1.24xlarge | 96 | 8 | 768 | 4 x 1000 NVMe SSD | 100 | 400 | 19 |
Amazon EC2 DL1 instances are powered by Gaudi accelerators from Habana Labs (an Intel company). They deliver up to 40% better price performance for training deep learning models compared to current generation GPU-based EC2 instances.
Features:
- 2nd Generation Intel Xeon Scalable Processor (Cascade Lake P-8275CL)
- Up to 8 Gaudi accelerators with 32 GB of high bandwidth memory (HBM) per accelerator
- 400 Gbps of networking throughput
- 4 TB of local NVMe storage
DL1 instances have the following specs:
- 2nd Generation Intel Xeon Scalable Processor
- Intel AVX†, Intel AVX2†, Intel AVX-512, Intel Turbo
- EBS Optimized
- Enhanced Networking†
Use Cases
Deep learning training, object detection, image recognition, natural language processing, and recommendation engines.
DL2q
| Instance Size | Qualcomm AI 100 Accelerators | Accelerator Memory (GB) | vCPU | Memory (GiB) | Local Storage | Inter-accelerator Interconnect | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|---|
| dl2q.24xlarge | 8 | 128 | 96 | 768 | EBS Only | No | 100 | 19 |
Amazon EC2 DL2q instances, powered by Qualcomm AI 100 accelerators, can be used to cost-efficiently deploy deep learning (DL) workloads in the cloud or to validate the performance and accuracy of DL workloads that will be deployed on Qualcomm devices.
Features:
- 8 Qualcomm AI 100 accelerators
- Supported by Qualcomm Cloud AI Platform and Apps SDK
- 2nd Generation Intel Xeon Scalable Processors (Cascade Lake P-8259CL)
- Up to 128 GB of shared accelerator memory
- Up to 100 Gbps networking
Use Cases
Run popular DL and generative AI applications, such as content generation, image analysis, text summarization, and virtual assistants; validate AI workloads before deploying them across smartphones, automobiles, robotics, and extended reality headsets.
F2
| Instance Name | FPGAs | vCPU | FPGA Memory (HBM / DDR4) | Instance Memory (GiB) | Local Storage (GiB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| f2.6xlarge | 1 | 24 | 16 GiB / 64 GiB | 256 | 1 x 940 | 12.5 | 7.5 |
| f2.12xlarge | 2 | 48 | 32 GiB / 128 GiB | 512 | 2 x 940 | 25 | 15 |
| f2.48xlarge | 8 | 192 | 128 GiB / 512 GiB | 2,048 | 8 x 940 | 100 | 60 |
Amazon EC2 F2 instances offer customizable hardware acceleration with field programmable gate arrays (FPGAs).
Features:
- Up to 8 AMD Virtex UltraScale+ HBM VU47P FPGAs with 2.9 million logic cells and 9024 DSP slices
- 3rd generation AMD EPYC processor
- 64 GiB of DDR4 ECC-protected FPGA memory
- Dedicated FPGA PCI-Express x16 interface
- Up to 100 Gbps of networking bandwidth
- Supported by FPGA Developer AMI and FPGA Development Kit
Use Cases
Genomics research, financial analytics, real-time video processing, big data search and analysis, and security.
VT1
| Instance Size | U30 Accelerators | vCPU | Memory (GiB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps) | 1080p60 Streams | 4Kp60 Streams |
|---|---|---|---|---|---|---|---|
| vt1.3xlarge | 1 | 12 | 24 | 3.125 | Up to 4.75 | 8 | 2 |
| vt1.6xlarge | 2 | 24 | 48 | 6.25 | 4.75 | 16 | 4 |
| vt1.24xlarge | 8 | 96 | 192 | 25 | 19 | 64 | 16 |
Amazon EC2 VT1 instances are designed to deliver low cost real-time video transcoding with support for up to 4K UHD resolution.
Features:
- 2nd Generation Intel Xeon Scalable Processors (Cascade Lake P-8259CL)
- Up to 8 Xilinx U30 media accelerator cards with accelerated H.264/AVC and H.265/HEVC codecs
- Up to 25 Gbps of enhanced networking throughput
- Up to 19 Gbps of EBS bandwidth
All instances have the following specs:
- 2nd Generation Intel Xeon Scalable Processors
- Intel AVX†, Intel AVX2†, Intel AVX-512, Intel Turbo
- EBS Optimized
- Enhanced Networking†
Use Cases
Live event broadcast, video conferencing, and just-in-time transcoding.
Footnotes
Each vCPU is a thread of either an Intel Xeon core or an AMD EPYC core, except for T2 and m3.medium.
† AVX, AVX2, AVX-512, and Enhanced Networking are only available on instances launched with HVM AMIs.
* This is the default and maximum number of vCPUs available for this instance type. You can specify a custom number of vCPUs when launching this instance type. For more details on valid vCPU counts and how to start using this feature, see the Optimize CPUs documentation. A minimal launch sketch follows these footnotes.
*** Instances marked with "Up to" Network Bandwidth have a baseline bandwidth and can use a network I/O credit mechanism to burst beyond their baseline bandwidth on a best effort basis. For more information, see instance network bandwidth.
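As a companion to the vCPU footnote above, a custom vCPU count is applied at launch through the CpuOptions parameter. A minimal boto3 sketch; the region, AMI ID, instance type, and core count below are illustrative placeholders, not recommendations:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # illustrative region
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
    InstanceType="g5.12xlarge",       # illustrative instance type (48 vCPUs by default)
    MinCount=1,
    MaxCount=1,
    # Reduce the active vCPUs: 24 cores with one thread per core = 24 vCPUs.
    CpuOptions={"CoreCount": 24, "ThreadsPerCore": 1},
)
print(resp["Instances"][0]["InstanceId"])
```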