Solve large computational problems and gain new insights using the power of HPC on AWS


High Performance Computing (HPC) allows scientists and engineers to solve complex, compute-intensive problems. HPC applications often require high network performance, fast storage, large amounts of memory, very high compute capabilities, or all of these. AWS enables you to increase the speed of research and reduce time-to-results by running HPC in the cloud and scaling to larger numbers of parallel tasks than would be practical in most on-premises environments. AWS helps to reduce costs by providing CPU, GPU, and FPGA servers on-demand, optimized for specific applications, and without the need for large capital investments.



Instantly launch or scale up High Performance Computing clusters on AWS. By eliminating job queue times and scaling your cluster as high as needed, when needed, you can reduce the time to market or publication.


Focus on applications and research output over infrastructure maintenance and upgrades. When AWS upgrades hardware, you gain access immediately: simply update your cluster configuration file and relaunch to move to the latest hardware.


Let your research dictate infrastructure, not the other way around. With the flexible configuration options AWS provides, you can start with your hypothesis and create HPC clusters that are optimized for your unique application requirements – GPU today, CPU tomorrow.


In addition to core service options for compute, storage, and databases, take advantage of the breadth of services and partners in the AWS ecosystem to enhance your workload. Options range from familiar solutions like NICE and Thinkbox to experimental builds with AWS Lambda.


Collaborate without compromising on security. Every AWS service provides encryption and options to grant granular permissions for each user while maintaining the ability to share data across approved users. Build solutions compliant with HIPAA, FISMA, FedRAMP, PCI, and more.


Let every dollar contribute meaningfully to your mission. Choose from a range of AWS services and only pay for what you use. No more paying for idle compute capacity, no long-term contracts, and no complex licensing involved. Optimize costs further with Amazon EC2 Spot Instances. 

  • Life Sciences

    Genomics

    The Algorithms, Machine, and People (AMP) Lab at UC Berkeley leveraged AWS to quickly scale the compute resources needed to analyze the algorithms that are used in genomics work. Learn more >>

    Computational Chemistry

    Novartis built a platform leveraging AWS to run approximately 87,000 compute cores to conduct 39 years of computational chemistry in 9 hours for a cost of $4,232. Learn more >>

    Biological Systems Simulation

    Penn State moved its research portal to AWS and made it easy for 6,000 researchers worldwide to design more than 50,000 synthetic DNA sequences. Learn more >>

    Protein Modeling

    The Computer Science department at San Francisco State University used Amazon EC2 to reduce costs and turnaround time to run machine learning workloads.  Learn more >>

  • Financial Services

    Capital Management and Reporting

    MAPFRE saved 88 percent on infrastructure costs and gained the ability to spin up a supercomputer on demand and shut it down when finished. Learn more >>

    Risk Management Portfolio Optimization

    Yuanta Securities Korea benefits from increased speeds and lower costs by running financial models on AWS to assess market risk. Learn more >>

    Contract Pricing and Valuation  

    Aon Benfield moved its infrastructure to AWS and built a processing system that reduced policy recalculation from hours or days to minutes. Learn more >>

  • Manufacturing

    Computational Fluid Dynamics (CFD)

    TLG Aerospace used EC2 Spot Instances to access more memory and cores at a lower cost, allowing them to scale the number and size of increasingly demanding simulations. Learn more >>

    Engineering Simulation

    Ansys ran simulations on Enhanced Networking-compatible EC2 instances, demonstrating near-ideal scalability well past 1,000 cores and reduced overall solution time even beyond 2,000 cores. Learn more >>

  • Energy & Earth Sciences

    Weather Simulation

    The Weather Company redesigned its big data platform, forecasting systems, and applications to run natively in a cloud environment, reducing its on-premises footprint from 13 data centers to six and freeing engineers to improve network and application efficiency. Learn more >>

    Reservoir Simulation

    Rock Flow Dynamics used on-demand computing resources to run workloads that optimize the placement of oil wells and water injection wells. What would have taken several years to complete was done in a 12-day period using AWS resources. Learn more >>

    Geographic Information Systems (GIS)

    DigitalGlobe used AWS to deliver petabytes of high-resolution Earth imagery, data, and analysis to its customers in weeks instead of months while saving on costs. Learn more >>

    Operations, Management, and Analytics

    Fugro Roames used AWS and Amazon EC2 Spot Instances to enable Ergon Energy to reduce the annual cost of vegetation management from AU$100 million to AU$60 million. Learn more >>

  • Semiconductors

    Electronics Design Automation

    Cadence Design Systems used AWS to isolate workloads from one another and ensure users and applications didn’t compete for resources, which reduced regression times, enabled quicker iterations, and shifted the focus to optimization and agility. Learn more >>

    Electronics Simulation

    Cypress Semiconductor implemented parallel computing on AWS, using COMSOL Multiphysics to simulate the electromagnetic field distribution in a capacitive sensor assembly, and reduced simulation time from weeks to hours. Learn more >>

Compute

High Performance Computing workloads on AWS are run on virtual servers, known as instances, enabled by Amazon Elastic Compute Cloud (Amazon EC2). Amazon EC2 provides secure, resizable compute capacity in the cloud and is offered in a wide range of instance types so you can choose one optimized for your workload.

 Instance Type
Recommended HPC Use
Technical Highlights

C4 and C5

Compute Optimized

Compute-bound workloads, such as engineering and financial simulations, materials science and genomics processing, seismic processing, digital and analog simulations, fluid dynamics, computational lithography and metrology, weather simulations, and many more
  • Based on Intel Haswell and Skylake processors
  • Provides up to 36 cores (72 vCPUs) and up to 144 GiB of memory
  • Highest clock speeds available among EC2 instance types

M4

General Purpose

Applications and workloads requiring a balance of memory to cores, and general-purpose computing such as HPC management nodes, license servers, remote login nodes, and others
  • Based on Intel Haswell and Broadwell processors
  • Provides up to 32 cores (64 vCPUs) and up to 256 GiB of memory

R4

Memory Optimized

Applications that require a higher ratio of memory-to-cores than C4/C5 or M4 instances, including memory-intensive engineering and scientific simulations, semiconductor mask verification, and many others
  • Based on Intel Broadwell CPUs
  • Provides up to 32 cores (64 vCPUs) and up to 488 GiB of memory

P2

GPU Optimized

Machine learning, engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other GPU compute workloads
  • Provides up to 16 NVIDIA K80 GPUs in a single EC2 instance, with each GPU providing 2,496 parallel processing cores and 12 GiB of GPU memory

F1

FPGA Optimized

Parallel, hardware accelerated applications including video analytics, image processing, financial computing, genomics, and accelerated data analytics and search
  • Provides up to 8 Xilinx UltraScale+ VU9P FPGA devices in a single EC2 instance

G3

Graphics Optimized

High performance graphical applications, including graphical remote desktops, 3D modeling and simulation, medical and geospatial imaging, and video content delivery
  • Provides up to 4 NVIDIA Kepler or Maxwell GPUs in a single EC2 instance
  • Optimized for graphics processing and remote visualization
  • Available with Amazon AppStream 2.0, a fully managed application streaming service that allows pre- and post-processing of HPC workloads. Deliver HPC visualization applications to large groups of users on any desktop with an HTML5 browser.
  • G2 is utilized by Amazon WorkSpaces Graphics bundles, which enable GPU-accelerated virtual Windows desktops in the cloud. WorkSpaces Graphics bundles are designed for engineers and 3D application developers to use as an alternative to expensive graphics-capable workstations.

X1

High Memory

Applications that require the highest amounts of memory per core, including in-memory analytics graph and sparse matrix processing, semiconductor timing analysis, and others
  • Based on Intel Haswell CPUs
  • Provides up to 64 cores (128 vCPUs) and up to 1,952 GiB of memory
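As a rough illustration of how the specs above translate into instance selection, the sketch below (not an AWS tool; the memory-per-core ratios come straight from the table) picks a family from a workload's memory requirement per core:

```python
# Illustrative helper, not an AWS API: choose an EC2 instance family from a
# workload's memory-per-core need, using the largest size of each family above:
#   C4/C5: 144 GiB / 36 cores = 4.0   M4: 256 GiB / 32 cores = 8.0
#   R4:    488 GiB / 32 cores = 15.25 X1: 1,952 GiB / 64 cores = 30.5
FAMILIES = [
    ("C4/C5 (Compute Optimized)", 144 / 36),
    ("M4 (General Purpose)", 256 / 32),
    ("R4 (Memory Optimized)", 488 / 32),
    ("X1 (High Memory)", 1952 / 64),
]

def pick_family(mem_gib_per_core: float) -> str:
    """Return the first (leanest) family whose memory-per-core ratio suffices."""
    for name, ratio in FAMILIES:
        if mem_gib_per_core <= ratio:
            return name
    return FAMILIES[-1][0]  # fall back to the highest-memory family

print(pick_family(3))   # a compute-bound solver maps to C4/C5
print(pick_family(12))  # a memory-heavy simulation maps to R4
```

Real sizing also weighs network, storage, and accelerator needs; this only captures the memory-to-core axis of the table.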

Workload Orchestration

High Performance Computing workload management gains new flexibility in the cloud, making resource and job orchestration an important consideration for your workload. AWS provides a range of orchestration options: fully-managed services let you focus on job requests and outputs rather than provisioning, configuring, and optimizing a cluster and job scheduler, while self-managed solutions let you configure and maintain cloud-native clusters yourself, using traditional job schedulers on AWS or in hybrid scenarios.

 AWS Offering
Description
Highlights
AWS Batch
A fully-managed service that enables you to easily run large-scale compute workloads in the cloud without having to worry about resource provisioning or managing schedulers. Interact with AWS Batch via the web console, AWS CLI, or SDKs.
  • Fully-managed service
  • Focus on your jobs and their resources instead of infrastructure
  • Reduce costs by easily using EC2 Spot and Reserved Instances
  • Easily prioritize work across tens of thousands of cores
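A minimal sketch of what a Batch job request looks like. The field names match the Batch submit_job API as exposed by the AWS SDKs; the job, queue, and job-definition names are hypothetical placeholders:

```python
# Sketch of an AWS Batch job submission payload. Field names follow the
# submit_job API; all concrete names/values here are hypothetical.
def make_array_job(name, queue, job_def, array_size, vcpus=4):
    return {
        "jobName": name,
        "jobQueue": queue,
        "jobDefinition": job_def,
        # Run the same container across N array indices (e.g. one per input file).
        "arrayProperties": {"size": array_size},
        "containerOverrides": {"vcpus": vcpus},
    }

job = make_array_job("mc-sweep", "hpc-queue", "mc-sim:1", array_size=1000)
# In a real environment this dict would be passed to the Batch client, e.g.
#   boto3.client("batch").submit_job(**job)
print(job["jobName"], job["arrayProperties"])
```

Batch then provisions capacity, schedules the 1,000 array children, and retires the instances when the queue drains.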
AWS Lambda
Run code without provisioning or managing servers, paying only for the compute time you consume. Define short-duration functions in a number of languages and let Lambda manage execution at scale.
  • Fully-managed service
  • Optimized for short-duration operations
  • Lambda is “Serverless” – pay only for what you use while your functions are running
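Because a Lambda function is just a plain handler, a short-duration HPC task can be prototyped and tested locally before deployment. A minimal sketch (the event shape here is an assumption for illustration, not a Lambda convention):

```python
# Minimal sketch of a short-duration Lambda-style handler: a pure Python
# function invoked once per event. It runs locally too, which makes testing easy.
def handler(event, context=None):
    # Hypothetical task: score one parameter set out of a larger sweep.
    params = event["params"]
    score = sum(p * p for p in params)  # stand-in for the real computation
    return {"id": event["id"], "score": score}

# Local invocation with a sample event (on Lambda, the service supplies these).
result = handler({"id": 7, "params": [1.0, 2.0, 3.0]})
print(result)
```

At scale, Lambda would invoke this handler concurrently for thousands of events, one per work item.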
AWS Step Functions
A fully-managed service that makes it easy to coordinate the components of distributed applications and microservices using visual workflows.
  • Fully-managed service
  • Easily integrated with AWS Batch, AWS Lambda, and other services
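For example, a workflow that runs a Batch job and then post-processes its output with Lambda can be sketched in the Amazon States Language roughly as follows (all ARNs, names, and the two-step shape are illustrative placeholders):

```json
{
  "Comment": "Hypothetical workflow: run a Batch job, then post-process with Lambda.",
  "StartAt": "RunSimulation",
  "States": {
    "RunSimulation": {
      "Type": "Task",
      "Resource": "arn:aws:states:::batch:submitJob.sync",
      "Parameters": {
        "JobName": "mc-sweep",
        "JobQueue": "hpc-queue",
        "JobDefinition": "mc-sim:1"
      },
      "Next": "PostProcess"
    },
    "PostProcess": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:post-process",
      "End": true
    }
  }
}
```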

CfnCluster
An open-source framework that deploys a high-performance cluster on AWS with pre-installed open source batch schedulers and MPI libraries.
  • Open source software
  • Quickly deploy a cluster using third-party schedulers
  • Uses AWS CloudFormation for a base template 
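A CfnCluster deployment is driven by a small configuration file. A hedged sketch of its shape, with placeholder key, region, and sizing values:

```ini
; Illustrative CfnCluster configuration; all values are placeholders.
[aws]
aws_region_name = us-east-1

[cluster default]
key_name = my-keypair
master_instance_type = m4.large
compute_instance_type = c4.8xlarge
initial_queue_size = 2
max_queue_size = 10
scheduler = sge
```

From this file, CfnCluster generates a CloudFormation stack with a master node, an auto-scaling compute fleet, and the chosen scheduler pre-installed.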
EnginFrame
An HPC portal integrated with a wide range of open source and commercial batch scheduling systems; a one-stop shop for job submission, control, and data management.
  • Runs on-premises, in the cloud or hybrid
  • “Single pane of glass” for multiple schedulers
  • Application templates

Storage

AWS provides several options for storage, ranging from file systems attached to an EC2 instance to high performance object storage. Most HPC applications require shared access to data from multiple EC2 instances via a file system interface. AWS provides a native, scale-out shared file storage service (Amazon EFS) with a file system interface and file system semantics. HPC applications can also use AWS block storage offerings, either Amazon EBS or EC2 instance store, for general purpose working storage. Amazon S3 and Amazon Glacier provide low-cost options for long-term storage of large data sets.

 AWS Product
Description and recommended HPC usage
Highlights

Amazon EFS

 

A highly available and durable, multi-AZ, fully-managed file system

Recommended HPC Usage: Use as a shared file system for working storage

  • Scales to tens of thousands of cores
  • NFS mountable
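Because EFS is NFS-mountable, every node in a cluster can share the same file system with one fstab entry per node. A sketch with a placeholder file system ID and the NFSv4.1 mount options AWS generally recommends:

```
# /etc/fstab entry (fs-12345678 is a placeholder file system ID)
fs-12345678.efs.us-east-1.amazonaws.com:/  /mnt/efs  nfs4  nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2  0 0
```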

Amazon EBS

 

Persistent block storage volumes for use with Amazon EC2 instances

Recommended HPC Usage: Use for high-IOPS and general purpose working storage

 

  • Lustre compatible
  • NFS mountable
  • Supports high-speed parallel systems via tools like Lustre and GPFS
  • Offers a range of choices for speed and cost optimization

Amazon EC2 Instance Store

 

Block storage included at no additional charge with select Amazon EC2 instance types

Recommended HPC Usage: Use for read-often temporary working storage

  • Included with select EC2 instance types
  • Fast I/O
  • Ephemeral Storage

Amazon S3

 

Object storage built to store and retrieve any amount of data from anywhere

Recommended HPC Usage: Primary durable and scalable storage for HPC data

  • Highly available
  • Highly durable
  • API accessible with PUT and GET requests

Amazon Glacier

 

A secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup

Recommended HPC Usage: Use for long-term, lower-cost archival of HPC data

  • Life cycle tools archive data automatically
  • Extremely economical
  • Retrieval times on the order of hours
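Archival to Glacier can be automated with an S3 lifecycle rule; a sketch, where the bucket prefix and retention period are illustrative choices:

```json
{
  "Rules": [
    {
      "ID": "ArchiveCompletedRuns",
      "Prefix": "results/",
      "Status": "Enabled",
      "Transitions": [
        { "Days": 90, "StorageClass": "GLACIER" }
      ]
    }
  ]
}
```

Applied to a bucket, this moves objects under `results/` to Glacier 90 days after creation with no further action from the user.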

Networking

The AWS network is designed for scale. Whether your application requires thousands of cores for one tightly-coupled workload, hundreds-of-thousands of cores for embarrassingly-parallel, high-throughput (HTC) applications, or a mixture of both, the AWS network offers performance (high bandwidth, low latency) and scalability.

AWS optimizes and custom builds its own hardware specifically for AWS infrastructure. Cut-through routing combined with AWS’s large scale means even the biggest customers see consistent latency and high bandwidth when using the most challenging application communication patterns. Enhanced Networking provides higher I/O performance and lower CPU utilization compared to traditional virtualized network interfaces, with higher packet per second (PPS) performance, lower inter-instance latencies, and very low network jitter. Enhanced Networking is available in one of two forms, depending on the instance type: the Intel 82599 Virtual Function interface or the Amazon Elastic Network Adapter (ENA).

Networking Feature
Description and EC2 Instance Type Compatibility
Benefits
Cluster Placement Groups

Cluster Placement Groups are logical groupings, or clusters, of instances within a single Availability Zone.

EC2 Instance Type Compatibility: All instance types that support enhanced networking can be launched within a Cluster Placement Group. Learn more >>

  • Allow for reliably low latency with up to 20 Gbps bandwidth between instances
  • Elastically scalable as desired
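In infrastructure-as-code terms, a placement group is declared once and referenced by each instance in the cluster. A CloudFormation sketch, with placeholder AMI and instance values:

```yaml
# Illustrative CloudFormation fragment; ImageId and InstanceType are placeholders.
Resources:
  HpcPlacementGroup:
    Type: AWS::EC2::PlacementGroup
    Properties:
      Strategy: cluster          # co-locate instances for low latency
  ComputeNode:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: c4.8xlarge
      ImageId: ami-12345678      # placeholder AMI
      PlacementGroupName: !Ref HpcPlacementGroup
```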

Generation One Enhanced Networking:

Intel 82599

  

 

The Intel 82599 Virtual Function interface supports network speeds of up to 10 Gbps for supported instance types and provides higher I/O performance and lower CPU utilization when compared to traditional virtualized network interfaces.

EC2 Instance Type Compatibility: The C3, C4, D2, I2, R3, and M4 (excluding m4.16xlarge) instance types are compatible with the Intel 82599. Learn more >>

  • Higher I/O performance and lower CPU utilization compared to traditional implementations
  • Higher packet per second (PPS) performance
  • Lower inter-instance latencies
  • Very low network jitter

Generation Two Enhanced Networking:

Elastic Network Adapter (ENA)

Elastic Network Adapter (ENA) is a custom network interface optimized to deliver high throughput and packet per second (PPS) performance.

EC2 Instance Type Compatibility: ENA is currently supported on P2, R4, X1, and m4.16xlarge instance types. Learn more >>

  • All the advantages of generation one
  • Future-proofed driver: designed to support up to 400 Gbps networking without requiring a driver change
  • Utilize up to 20 Gbps of network bandwidth on certain EC2 instance types

Visualization

From preparing simulation input data to interpreting computing job outputs, high performance graphics tasks are part of many HPC workloads. AWS offers several products to improve the performance, cost, and flexibility of running OpenGL, DirectX, and other graphics applications. You can accelerate graphics performance by using the GPU-powered G2 and G3 instances or Elastic GPU, and stream Windows graphics with AppStream 2.0, WorkSpaces, or NICE DCV. If you prefer a Linux-based graphics platform, combining the streaming performance of NICE DCV with the EnginFrame HPC portal can deliver end-to-end workflows to end users across on-premises, hybrid cloud, or all-in AWS configurations.

 Offering
Description
Highlights
NICE DCV
A secure streaming protocol optimized for high-end graphics, with dynamic bandwidth management
  • Move pixels and keep HPC data centralized
  • Enable remote access to Linux and Windows 3D applications
  • Fluid and responsive experience over a wide network area
  • Consistent experience on premises and on AWS
NICE EnginFrame
An HPC portal with built-in interactive session management and batch-interactive workflow support
  • One-stop-shop for all HPC user needs
  • Simplify collaboration
  • Consistent experience on premises and on AWS
Amazon EC2 Elastic GPU and G2 Instances
Allow you to easily attach low-cost graphics acceleration to current generation EC2 instances
  • Ideal if you need a small amount of GPU for graphics acceleration, or have applications that could benefit from some GPU but also require high amounts of compute, memory, or storage
  • Capable of running a variety of graphics workloads, such as 3D modeling and rendering, with similar workstation performance compared to direct-attached GPUs.
Amazon AppStream 2.0
A fully managed, secure application streaming service that allows you to stream desktop applications from AWS to any device running a web browser
  • Visualization applications run next to your HPC data ensuring a high quality, low latency visualization experience
  • Users have secure, anywhere, anytime access to their applications so they can be productive wherever there is a web connection
  • Application delivery using NICE DCV protocol which is optimized for graphics
Amazon WorkSpaces
A fully managed, secure Desktop-as-a-Service (DaaS) solution that runs on AWS. WorkSpaces includes GPU-accelerated bundles, which support engineering, design, and architectural applications while providing the security, economics, flexibility, and agility of the cloud.
  • Faster visualization of simulation results because your apps can reside next to your data in the cloud
  • Support for 3D application development, 3D modeling, CAD, CAM, and CAE tools
  • Desktop streaming to a multitude of supported devices including Windows and Mac PCs, PCoIP zero clients, Chromebooks, iPads, Fire tablets, Android tablets, and even select smartphones
Pricing

AWS offers you a pay-as-you-go approach for pricing for over 70 cloud services. With AWS you pay only for the individual services you need, for as long as you use them, and without requiring long-term contracts or complex licensing. AWS pricing is similar to how you pay for utilities like water or electricity. You only pay for the services you consume, and once you stop using them, there are no additional costs or termination fees. Learn more about how pricing works on AWS >>

There are three main ways to pay for your compute capacity on Amazon EC2: On-Demand, Reserved Instances, and Spot Instances.

Compute Pricing Model
Description
Recommended HPC Use:
On-Demand Instances
With On-Demand Instances, you pay for compute capacity by the hour with no long-term commitments or upfront payments. You can increase or decrease your compute capacity depending on the demands of your application and only pay the specified hourly rate for the instances you use.
  • Users that prefer the low cost and flexibility of Amazon EC2 without any up-front payment or long-term commitment
  • Applications being developed or tested on Amazon EC2 for the first time (POCs)
  • Applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
  • Urgent and high-priority workloads
Spot Instances
Spot Instances enable you to bid on unused Amazon EC2 capacity at whatever price you choose. When your bid exceeds the Spot price, you gain access to the available Spot Instances, which run for as long as your bid exceeds the Spot price. Historically, the Spot price has been 50% to 93% lower than the On-Demand price. Learn more about optimizing scientific computing costs with Spot Instances >>
  • Workloads that can tolerate interruptions
  • Applications that have flexible start and end times
  • Applications that are only feasible at very low compute prices
Reserved Instances
Reserved Instances provide a significant discount (up to 75%) compared to On-Demand pricing. In addition, when Reserved Instances are assigned to a specific Availability Zone, they provide a capacity reservation, giving you additional confidence in your ability to launch instances when you need them.
  • Customers that can commit to using EC2 over a 1 or 3 year term to reduce their total computing costs
  • Applications with steady state usage
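To see why the Spot discount matters for HPC, here is a back-of-the-envelope comparison using placeholder prices and a discount inside the 50% to 93% historical range quoted above:

```python
# Hypothetical cost comparison for a fixed amount of work; all prices are
# placeholders, and the 70% Spot discount is an assumed illustrative value.
ON_DEMAND_PER_HOUR = 1.00   # $/instance-hour (placeholder)
SPOT_DISCOUNT = 0.70        # Spot price assumed 70% below On-Demand
INSTANCES, HOURS = 100, 9   # e.g. a 9-hour run on 100 instances

on_demand_cost = ON_DEMAND_PER_HOUR * INSTANCES * HOURS
spot_cost = on_demand_cost * (1 - SPOT_DISCOUNT)

print(f"On-Demand: ${on_demand_cost:,.2f}")
print(f"Spot:      ${spot_cost:,.2f}")
```

The run costs the same number of instance-hours either way; only the hourly rate changes, which is why interruption-tolerant workloads see such large savings on Spot.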

AWS Partners provide professional services or software solutions to enable workloads on AWS. Browse our selection of featured partners and learn more.

 

Sign up for an account and launch a sample HPC workload today.


Your account will be within the AWS Free Tier, which enables you to gain free, hands-on experience with the AWS platform, products, and services.


Build your HPC production solution quickly and easily once you're ready.
