Amazon Elastic Compute Cloud (Amazon EC2) UltraServers are ideal for customers seeking the highest AI training and inference performance for models at the trillion-parameter scale. UltraServers connect multiple EC2 instances using a dedicated, high-bandwidth, low-latency accelerator interconnect, so you can use a tightly coupled mesh of accelerators across EC2 instances and access significantly more compute and memory than standalone EC2 instances provide.
EC2 UltraServers are ideal for the largest models that require more memory and more memory bandwidth than standalone EC2 instances can provide. The UltraServer design uses the intra-instance accelerator connectivity to connect multiple instances into one node, unlocking new capabilities. For inference, UltraServers help deliver industry-leading response time to create the best real-time experiences. For training, UltraServers boost model training speed and efficiency with faster collective communication for model parallelism as compared to standalone instances. EC2 UltraServers support Elastic Fabric Adapter (EFA) networking and, when deployed in EC2 UltraClusters, enable scale-out distributed training across tens of thousands of accelerators on a single petabit-scale, non-blocking network. By delivering higher performance for both training and inference, UltraServers accelerate your time to market and help you deliver real-time applications powered by the most performant, next-generation foundation models.
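To get a rough sense of why models at this scale can outgrow a single instance, the sketch below estimates the memory needed for model weights alone. The function name `weight_memory_gb` and the default of 2 bytes per parameter (a 16-bit format) are illustrative assumptions, not AWS figures. Activations, optimizer state, and inference caches add to this total.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes).

    Assumption: every parameter is stored at the same precision.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # Hypothetical model sizes spanning "hundreds of billions to trillions".
    for label, params in [("100B", 100e9), ("500B", 500e9), ("1T", 1e12)]:
        gb = weight_memory_gb(params)
        print(f"{label} parameters: ~{gb:,.0f} GB for weights at 2 bytes/param")
```

Under these assumptions, a one-trillion-parameter model needs about 2,000 GB for its weights before any other state is counted, which is the kind of footprint that motivates pooling accelerator memory across instances.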
Benefits
Train and deploy models at the trillion+ parameter scale
UltraServers enable efficient training and inference of models with hundreds of billions to trillions of parameters by linking a larger set of accelerators with a high-bandwidth, low-latency interconnect to deliver more compute and memory than standalone EC2 instances.
Reduce inference latency for real-time applications
UltraServers enable real-time inference for ultra-large models that demand substantial memory and memory bandwidth resources beyond what a single EC2 instance can offer.
Reduce time to train by extending model parallelism to more accelerators
UltraServers enable faster collective communication for model parallelism as compared to standalone instances, helping you reduce your time to train.
Features
Dedicated, high-bandwidth, and low-latency accelerator interconnect
You can launch instances into an UltraServer and leverage a dedicated, high-bandwidth, and low-latency accelerator interconnect across these instances. UltraServers enable access to a larger number of accelerators connected with this dedicated interconnect, delivering significantly more compute and memory in a single node than standalone EC2 instances.
High-performance networking
EC2 UltraServers deployed in EC2 UltraClusters are interconnected with petabit-scale EFA networking to improve performance for distributed training workloads.
High-performance storage
You can use EC2 UltraServers together with high-performance storage solutions such as Amazon FSx for Lustre, fully managed shared storage built on the most popular high-performance parallel file system. You can also use virtually unlimited cost-effective storage with Amazon Simple Storage Service (Amazon S3).
Built on the AWS Nitro System
EC2 UltraServers are built on the AWS Nitro System, a rich collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware and software. Nitro delivers high performance, high availability, and high security, reducing virtualization overhead.
Instances supported
Trn2 instances
Powered by AWS Trainium2 chips, Trn2 instances in a Trn2 UltraServer configuration (available in preview) enable you to scale up to 64 Trainium2 chips connected with NeuronLink, the dedicated, high-bandwidth, low-latency interconnect for AWS AI chips. Trn2 UltraServers provide breakthrough performance in Amazon EC2 for generative AI training and inference.