High Performance Computing (HPC) allows scientists and engineers to solve complex, compute-intensive and data-intensive problems. HPC applications often require high network performance, fast storage, large amounts of memory, very high compute capabilities, or all of these. AWS allows you to increase the speed of research and reduce time-to-results by running high performance computing in the cloud and scaling to larger numbers of parallel HPC tasks than would be practical in most on-premises HPC environments. AWS helps reduce costs by providing CPU, GPU, and FPGA servers on demand, optimized for specific applications, without the need for large capital investments. You have access to a full-bisection, high-bandwidth network for tightly coupled, I/O-intensive and storage-intensive workloads, which enables you to scale out across thousands of cores for faster results.
This short video explains the benefits of running your High Performance Computing cluster jobs on Amazon Web Services. The video covers the general benefits of running on cloud infrastructure, as well as some of the benefits unique to the AWS cloud. You'll learn about getting instant access to thousands of Intel Xeon processors with Enhanced Networking, and about tools that help you easily create your HPC cluster, all with a pay-as-you-go pricing model and no upfront costs.
HPC workloads are enabled on AWS by a range of available instance types, including Compute-optimized (C family), General-purpose (M family), and Memory-optimized (R and X families) instances, as well as GPU (P family) and FPGA (F family) instances. You can use these instance types just like other EC2 instances, but they have also been specifically engineered to provide high-performance networking coupled with modern, high-performance CPU architectures. Using these instance types, you can scale to hundreds, thousands, or tens of thousands of CPU, GPU, or FPGA cores on demand.
C4 instances are the latest generation of Amazon EC2 Compute-optimized instances. C4 instances are designed for compute-bound workloads, such as high-traffic front-end fleets, MMO gaming, media processing, transcoding, and High Performance Computing (HPC) applications.
C4 instances are available in five sizes, offering up to 36 vCPUs (18 physical, dedicated Intel Xeon V3 Haswell cores). C4 instances run at a base frequency of 2.9 GHz, and can deliver clock speeds as high as 3.5 GHz with Intel® Turbo Boost. C4 instances provide access to advanced processor features including Intel AVX2 instructions, and control over P-states and C-states. Each C4 instance type is EBS-optimized by default and at no additional cost. This feature provides 500 Mbps to 4,000 Mbps of dedicated throughput to EBS, above and beyond the general-purpose 10 Gbps network throughput provided to the instance.
C5 instances will be available in six sizes, and are based on Intel Xeon V5 (Skylake) cores. C5 instances provide access to advanced processor features including Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions. Each C5 instance type supports Enhanced Networking and the Elastic Network Adapter (ENA). C5 instances are EBS-optimized by default, with up to 12 Gbps of dedicated EBS throughput.
M4 instances offer a balance of compute, memory, and networking resources and are a good choice for many different types of applications, including HPC. M4 instances give you a choice of six sizes, from large up to 16xlarge.
The newest of the M4 instances, the m4.16xlarge, provides 64 vCPUs (32 physical, dedicated Intel Xeon V4 Broadwell CPU cores), 256 GiB of RAM, and up to 20 Gbps of network performance through the use of the AWS Elastic Network Adapter and Enhanced Networking. The m4.16xlarge is ideal for applications that require a higher ratio of memory to cores than C4/C5, and that can benefit from having larger numbers of CPU cores in the same server.
The newest of the R family instances, the R4, provides 64 vCPUs (32 physical, dedicated Intel Xeon V4 Broadwell CPU cores), up to 488 GiB of RAM, and up to 20 Gbps of network performance through the use of the AWS Elastic Network Adapter and Enhanced Networking. R4 instances are ideal for applications that require a higher ratio of memory to cores than C4/C5 or M4.
P2 instances are ideally suited for machine learning, engineering simulations, computational finance, seismic analysis, molecular modeling, genomics, rendering, high performance databases, and other GPU compute workloads.
The P2 instance offers 16 NVIDIA K80 GPUs with a combined 192 gigabytes (GB) of video memory, 40,000 parallel processing cores, 70 teraflops of single-precision floating point performance, over 23 teraflops of double-precision floating point performance, and GPUDirect technology for higher-bandwidth, lower-latency peer-to-peer communication between GPUs. P2 instances also feature up to 732 GB of host memory, up to 64 vCPUs using custom Intel Xeon E5-2686 v4 (Broadwell) processors, dedicated network capacity for I/O operations, and enhanced networking through the Amazon EC2 Elastic Network Adapter. P2 instances allow customers to build and deploy compute-intensive applications using the CUDA parallel computing platform or the OpenCL framework without up-front capital investments.
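The aggregate p2.16xlarge figures above follow from the per-GPU specs of the K80: each K80 board carries two GK210 GPUs, each with 12 GB of memory and 2,496 CUDA cores. As a quick sanity check, the per-board teraflops numbers below are assumptions taken from NVIDIA's published K80 boost-clock specs, not from this article:

```python
# Sanity-check the aggregate p2.16xlarge GPU figures from per-GPU specs.
# Per-board TFLOPS figures are assumed from NVIDIA K80 materials (boost clocks),
# not stated in this article.

GPUS = 16                    # 8 K80 boards x 2 GK210 GPUs each
MEM_PER_GPU_GB = 12
CORES_PER_GPU = 2496
SP_TFLOPS_PER_BOARD = 8.73   # single precision, per K80 board (assumed)
DP_TFLOPS_PER_BOARD = 2.91   # double precision, per K80 board (assumed)
BOARDS = GPUS // 2

total_mem_gb = GPUS * MEM_PER_GPU_GB        # 192 GB combined video memory
total_cores = GPUS * CORES_PER_GPU          # 39,936 cores, i.e. ~40,000
total_sp_tflops = BOARDS * SP_TFLOPS_PER_BOARD  # ~70 teraflops single precision
total_dp_tflops = BOARDS * DP_TFLOPS_PER_BOARD  # ~23 teraflops double precision

print(total_mem_gb, total_cores, round(total_sp_tflops, 1), round(total_dp_tflops, 1))
```

The computed totals line up with the 192 GB, ~40,000-core, 70-teraflop, and 23-teraflop figures quoted above.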
Cluster instances can be launched within a Placement Group. All instances launched within a Placement Group get low-latency, full-bisection, 10 Gbps bandwidth between instances. Like many other Amazon EC2 resources, Placement Groups are dynamic and can be scaled elastically as needed. You can also connect multiple Placement Groups to create very large high performance computing clusters for massively parallel processing.
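As a sketch of how a Placement Group fits into an instance launch, the snippet below builds the request parameters that would be passed to boto3's `create_placement_group` and `run_instances` calls. The group name, AMI ID, instance type, and count are hypothetical placeholders, and the actual API calls (which require boto3 and AWS credentials) are shown only in comments:

```python
# Request parameters for creating a cluster placement group and launching
# instances into it. With boto3 these would be sent as, for example:
#   ec2 = boto3.client("ec2")
#   ec2.create_placement_group(**pg_params)
#   ec2.run_instances(**run_params)
# All names and sizes below are illustrative, not from this article.

pg_params = {
    "GroupName": "hpc-cluster",   # hypothetical group name
    "Strategy": "cluster",        # pack instances together for low latency
}

run_params = {
    "ImageId": "ami-12345678",    # placeholder AMI ID
    "InstanceType": "c4.8xlarge", # a compute-optimized size discussed above
    "MinCount": 16,
    "MaxCount": 16,
    "Placement": {"GroupName": "hpc-cluster"},  # launch into the group
}

print(run_params["Placement"]["GroupName"])
```

Because the Placement Group is just a parameter on the launch request, growing the cluster is a matter of issuing further `run_instances` calls that name the same group.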
AWS currently supports enhanced networking capabilities using SR-IOV (Single Root I/O Virtualization) for the C3 and I2 instance types. SR-IOV is a method of device virtualization that provides higher I/O performance and lower CPU utilization compared to traditional implementations. For supported Amazon EC2 instances, this feature provides higher packets-per-second (PPS) performance, lower inter-instance latencies, and very low network jitter.
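One way to confirm that SR-IOV enhanced networking is active on an instance is to query its `sriovNetSupport` attribute (for example with `aws ec2 describe-instance-attribute --attribute sriovNetSupport`); a value of `simple` means it is enabled. The helper below interprets a response of that shape. The sample responses are hand-written to mirror the boto3 response structure, not captured from a live API call:

```python
def sriov_enabled(attribute_response):
    """Return True if a describe-instance-attribute response for the
    sriovNetSupport attribute indicates SR-IOV enhanced networking."""
    # When enabled, the attribute's value is the string "simple";
    # when not set, the SriovNetSupport value is empty or absent.
    value = attribute_response.get("SriovNetSupport", {}).get("Value")
    return value == "simple"

# Hand-written sample responses (illustrative, not from a real API call):
enabled = {"InstanceId": "i-0abc", "SriovNetSupport": {"Value": "simple"}}
disabled = {"InstanceId": "i-0def", "SriovNetSupport": {}}

print(sriov_enabled(enabled), sriov_enabled(disabled))
```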
Data has gravity; as datasets grow larger, it becomes easier to move the compute closer to the data than to move the data, reducing latency and increasing throughput. With AWS big data storage and database services, such as Amazon S3, Amazon Redshift, Amazon DynamoDB, and Amazon RDS, you have the perfect place to host the data for your high performance computing cluster. Furthermore, with Amazon Elastic Block Store (EBS) you can create large-scale parallel file systems to meet the high volume, performance, and throughput requirements of your HPC workload.
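A common pattern for the parallel-file-system case is striping data across multiple EBS volumes (for example with software RAID 0), so aggregate throughput scales with the volume count up to the instance's dedicated EBS-optimized limit. A rough back-of-the-envelope helper, where the per-volume throughput figure is an assumption for illustration rather than a number from this article:

```python
def striped_ebs_throughput_mb_s(volumes, per_volume_mb_s, instance_limit_mb_s):
    """Estimated striped read/write throughput in MB/s: scales with the
    number of volumes but is capped by the instance's dedicated
    EBS-optimized bandwidth."""
    return min(volumes * per_volume_mb_s, instance_limit_mb_s)

# Example: 8 volumes at an assumed 160 MB/s each, behind a 500 MB/s
# (i.e. 4,000 Mbps, the top C4 figure above) EBS-optimized instance limit.
print(striped_ebs_throughput_mb_s(8, 160, 500))  # capped at 500 MB/s
```

The takeaway is that adding volumes only helps until the instance-level EBS bandwidth becomes the bottleneck, at which point a larger instance size (or more instances) is needed.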
You can save time and money by leveraging Spot Instances for your HPC workloads. Spot Instances are a pricing model that enables you to bid on unused Amazon EC2 capacity at whatever price you choose. When your bid exceeds the Spot price, you gain access to the available Spot Instances, and your instances run for as long as your bid exceeds the Spot price. Historically, the Spot price has been 50% to 93% lower than the On-Demand price.
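The quoted savings are simply the relative gap between the Spot and On-Demand prices. A one-line check, using made-up illustrative prices rather than real AWS quotes:

```python
def spot_savings_pct(on_demand_price, spot_price):
    """Percentage saved by running at the Spot price instead of On-Demand."""
    return round((1 - spot_price / on_demand_price) * 100, 1)

# Illustrative hourly prices only, not real AWS quotes:
print(spot_savings_pct(1.00, 0.50))  # 50.0 -> low end of the historical range
print(spot_savings_pct(1.00, 0.07))  # 93.0 -> high end of the historical range
```

For long-running HPC batch jobs that tolerate interruption, this discount compounds across thousands of core-hours, which is why Spot is a natural fit for loosely coupled HPC work.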
AWS Marketplace is an online store that provides an easy way for developers and IT professionals to discover and use software to run in the AWS Cloud. You can find a selection of high performance computing software ready to run in your cluster, such as the Univa Grid Engine resource management system or the Intel Lustre HPC file system, with just a few clicks directly from the AWS Marketplace.