Amazon EC2 G7 instances
High performance GPU acceleration for running AI inference, graphics, and data analytics workloads
Why Amazon EC2 G7
Amazon EC2 G7 instances, powered by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, deliver up to 4.6x the AI inference performance of previous-generation G6 instances. Whether you are deploying conversational AI assistants, scaling video streaming, or running large-scale data processing pipelines, G7 instances scale from 1 to 8 GPUs to match the workload. With up to 8 GPUs and 700 Gbps of EFA-enabled network bandwidth, G7 instances help you run AI inference, graphics, and data analytics workloads more efficiently.
Features
NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs
Up to 8 NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs per instance with 32 GB memory per GPU, 5th Gen Tensor Cores, and 4th Gen RT Cores.
High performance networking and storage
G7 instances come with up to 700 Gbps of EFA-enabled networking throughput (7x that of G6), providing the low-latency, high-bandwidth connectivity that AI inference, graphics-intensive applications, and GPU-accelerated data analytics workloads need to perform at their best. G7 instances support up to 7.6 TB of local NVMe SSD storage, so you can keep large models and datasets close to compute, reduce data transfer overhead, and improve throughput.
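To put the bandwidth figures in perspective, the following back-of-the-envelope sketch computes an ideal lower bound on transfer time at line rate. The function name is illustrative, the 100 Gbps G6 figure is derived from the "7x" comparison above rather than stated directly, and the calculation ignores protocol overhead, storage read speed, and contention, so real transfers will take longer.

```python
def ideal_transfer_seconds(size_gb: float, bandwidth_gbps: float) -> float:
    """Lower bound on transfer time for size_gb decimal gigabytes
    over a link of bandwidth_gbps gigabits per second."""
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    # 1 gigabyte = 8 gigabits, so divide total gigabits by the link rate.
    return size_gb * 8 / bandwidth_gbps


# Assumption for illustration: moving 256 GB, the total GPU memory
# of the largest G7 size.
PAYLOAD_GB = 256

g7_seconds = ideal_transfer_seconds(PAYLOAD_GB, 700)  # peak G7 bandwidth
g6_seconds = ideal_transfer_seconds(PAYLOAD_GB, 100)  # 700 / 7, derived

print(f"G7: {g7_seconds:.2f} s, G6: {g6_seconds:.2f} s")
```

At line rate the ratio between the two results is the same 7x as the bandwidth ratio; the absolute numbers are only a floor.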
Powered by custom Intel Xeon 6 processors
G7 instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz. With simultaneous multi-threading disabled, these custom processors deliver maximum per-core compute performance, accelerating workloads that depend on high-bandwidth data movement between CPU and GPU, such as recommender systems, retrieval-augmented generation (RAG) inference, and data analytics pipelines.
Advanced video encoding and decoding engines
Ninth-generation NVENC and sixth-generation NVDEC engines support 4:2:2 encoding and decoding for high-resolution video workflows, delivering 1.6x more concurrent video streams than previous-generation G6 instances.
Built on the AWS Nitro System
G7 instances are powered by the AWS Nitro System, which handles networking, storage, and other I/O functions, and can deploy firmware updates, bug fixes, and optimizations while it remains operational. This increases stability and reduces downtime, which is critical to meeting training timelines and running AI applications in production.
Benefits
G7 instances deliver up to 4.6x higher performance for AI inference compared to G6 instances. With custom Intel Xeon 6 processors, 7x the EFA-enabled network bandwidth, and 1.5x higher FP16 FLOPS than G6, you can deploy models with lower latency for applications like conversational assistants, content generation tools, and recommendation engines.
G7 instances deliver up to 2.1x performance for hybrid graphics-AI workloads such as ADAS and robotics simulations, AI-enabled AR/VR, gaming, video services, 3D rendering, and CAD workflows. Advanced video transcoding capabilities let you run 1.6x more concurrent video streams than G6.
With faster GPU memory and up to 700 Gbps of EFA-enabled networking bandwidth, G7 instances also provide the data transfer speed needed for GPU-accelerated analytics applications like vector databases and data frames.
Product Details
Instance types
| Instance size | GPUs | GPU memory (GB) | vCPUs | System memory (GiB) | Instance storage (GB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps) |
|---|---|---|---|---|---|---|---|
| g7.2xlarge | 1 | 32 | 8 | 32 | 1 x 600 | Up to 60 | Up to 8 |
| g7.4xlarge | 1 | 32 | 16 | 64 | 1 x 600 | Up to 100 | 8 |
| g7.8xlarge | 1 | 32 | 32 | 128 | 1 x 950 | Up to 100 | 16 |
| g7.12xlarge | 2 | 64 | 48 | 192 | 1 x 1900 | 175 | 20 |
| g7.24xlarge | 4 | 128 | 96 | 384 | 1 x 3800 | 350 | 40 |
| g7.48xlarge | 8 | 256 | 192 | 768 | 2 x 3800 | 700 | 80 |
| g7.metal* | 8 | 256 | 192 | 768 | 2 x 3800 | 700 | 80 |
*Coming soon
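As a rough aid for reading the instance table, the sketch below encodes a few of its columns as data and picks the smallest size whose total GPU memory meets a requirement. The class and function names are hypothetical, g7.metal is omitted because it is marked as coming soon, and total GPU memory is only a first-pass filter: a model that must be split across several GPUs needs headroom that this check does not account for.

```python
from dataclasses import dataclass
from typing import Optional

GPU_MEMORY_GB = 32  # per-GPU memory, as listed in the Features section


@dataclass(frozen=True)
class G7Size:
    name: str
    gpus: int
    vcpus: int
    system_memory_gib: int
    network_gbps: int  # "Up to" values are stored as their upper limit

    @property
    def gpu_memory_gb(self) -> int:
        # Total GPU memory is the GPU count times 32 GB per GPU.
        return self.gpus * GPU_MEMORY_GB


# Values copied from the instance table, ordered smallest to largest.
G7_SIZES = [
    G7Size("g7.2xlarge", 1, 8, 32, 60),
    G7Size("g7.4xlarge", 1, 16, 64, 100),
    G7Size("g7.8xlarge", 1, 32, 128, 100),
    G7Size("g7.12xlarge", 2, 48, 192, 175),
    G7Size("g7.24xlarge", 4, 96, 384, 350),
    G7Size("g7.48xlarge", 8, 192, 768, 700),
]


def smallest_size_for(required_gpu_memory_gb: float) -> Optional[G7Size]:
    """Return the first (smallest) size with enough total GPU memory,
    or None if no listed size is large enough."""
    for size in G7_SIZES:
        if size.gpu_memory_gb >= required_gpu_memory_gb:
            return size
    return None


choice = smallest_size_for(100)
print(choice.name if choice else "no single G7 instance fits")
```

For a 100 GB requirement the helper skips the 1-GPU and 2-GPU sizes and lands on the 4-GPU g7.24xlarge, which has 128 GB of GPU memory in total.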
Customer testimonials
Volt AI
At Volt AI, our mission is protecting students, faculty, and communities through real-time AI video intelligence across tens of thousands of live camera streams, every second of every day. That demands GPU infrastructure that can keep up. Amazon EC2 G7 instances powered by NVIDIA RTX PRO 4500 GPUs delivered over 2x throughput improvement over previous-generation instances for our FP16 TensorRT inference workloads, enabling us to monitor more concurrent video streams with lower latency and leaner infrastructure. G7 instances have directly translated into faster threat detection and more scalable deployments for the schools, campuses, and cities we protect.
Getting started with AI use cases
Amazon SageMaker AI is a fully managed service for building and deploying ML models. G7 instances will integrate with Amazon SageMaker AI (coming soon) to deliver a managed, enterprise-grade experience for AI inference.
AWS Deep Learning AMIs (DLAMI) provide ML practitioners and researchers with the infrastructure and tools to accelerate deep learning (DL) in the cloud, at any scale. AWS Deep Learning Containers are Docker images preinstalled with DL frameworks. They streamline the deployment of custom ML environments by letting you skip the complicated process of building and optimizing your environments from scratch.
If you prefer to manage your own containerized workloads through container orchestration services, you can deploy G7 instances with Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).
If you are looking to run and scale your high performance computing (HPC) workloads and build scientific and engineering models on AWS using Slurm, you can deploy G7 instances with AWS Parallel Computing Service (AWS PCS).
You can use various Amazon Machine Images (AMIs) offered by AWS and NVIDIA that come with the NVIDIA drivers installed.
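For readers who launch instances programmatically, here is a minimal sketch using the AWS SDK for Python (boto3). It assumes boto3 is installed, that credentials and a default Region are already configured, and that G7 is available in that Region. The helper names are hypothetical and the AMI ID is a placeholder that you would replace with a real AMI that includes NVIDIA drivers.

```python
def build_run_instances_params(
    ami_id: str, instance_type: str = "g7.2xlarge", count: int = 1
) -> dict:
    """Build keyword arguments for the EC2 RunInstances call."""
    if not instance_type.startswith("g7."):
        raise ValueError("expected a G7 instance type, e.g. g7.2xlarge")
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": count,
        "MaxCount": count,
    }


def launch(params: dict) -> str:
    """Launch the instance(s) and return the first instance ID.
    Calling this creates billable resources."""
    import boto3  # third-party; imported lazily so the builder works without it

    ec2 = boto3.client("ec2")
    response = ec2.run_instances(**params)
    return response["Instances"][0]["InstanceId"]


# Placeholder AMI ID for illustration only; not a real image.
params = build_run_instances_params("ami-0123456789abcdef0")
print(params)
```

Separating the parameter builder from the `launch` call makes it possible to inspect or validate the request before anything is created in your account.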
Next steps
Start using Amazon EC2 G7 instances