AWS Government, Education, & Nonprofits Blog

The Evolution of High Performance Computing: Architectures and the Cloud

A guest blog by Jeff Layton, Principal Tech, AWS Public Sector

In High Performance Computing (HPC), users are performing computations that, until recently, no one thought would even be considered. For example, researchers are statistically analyzing the voting records of the Supreme Court; sequencing the genomes of humans, plants, and animals; creating deep learning networks for object and facial recognition so that cars and Unmanned Aerial Vehicles (UAVs) can guide themselves; searching for new planets in the galaxy; looking for trends in human behavioral patterns; analyzing social patterns in user habits; targeting advertisement development and placement; and pursuing thousands of other applications.

From lotions to aircraft, the products and services connected with HPC touch us every day, and we often don’t even realize it.

Many of these applications grow out of the massive amount of data that has been collected and stored. This is true of both classic HPC applications and new ones, such as deep learning, which needs massive data sets for learning and large data sets for testing the model. These applications are very data-driven, and their scale is getting larger every day.

A key feature of this “new” HPC is that it needs to be flexible and scalable to accommodate these new applications and the associated sea of data. New applications and algorithms are developed each year and their characteristics can vary widely, resulting in the need for increasingly diverse hardware support and new software architectures.

The cloud allows users to dynamically create architectures as they are needed, using the right amount of compute power (CPU or GPU), network, databases, data storage, and analysis tools. Rather than the classic model of fitting the application software to the hardware, the cloud allows the application software to define the infrastructure.

The cloud has a number of capabilities that map to the evolving nature of HPC, including:

  1. Scale and Elasticity
  2. Code as Infrastructure
  3. Ability to Experiment

Scale and Elasticity

Thousands upon thousands of compute resources, massive storage capacity, and high-performance network resources are available worldwide via the cloud.

Combining scale and elasticity creates a capability for HPC cloud users that doesn’t exist for centralized, shared HPC resources. If resources can be provisioned and scaled as needed, and the pool of resources is large, then waiting in job queues is a thing of the past. Each HPC user in the cloud can have their own set of compute, networking, and storage resources for their specific applications, with no need to share them with other users. They have zero queue time and can create the architectures that their applications need.
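To make the queue argument concrete, here is a minimal sketch of the sizing arithmetic. Everything in it is an illustrative assumption: the job names, the core counts, the 32-core node size, and the helper functions are hypothetical and do not correspond to any AWS API. It only shows the idea that each job gets its own right-sized set of nodes rather than competing for a fixed pool.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    cores: int  # number of cores the job requests


def nodes_for_job(job: Job, cores_per_node: int) -> int:
    """Whole nodes needed to give this job its own dedicated resources."""
    return math.ceil(job.cores / cores_per_node)


def elastic_plan(jobs, cores_per_node):
    """Size one dedicated set of nodes per job, so no job waits on another."""
    return {job.name: nodes_for_job(job, cores_per_node) for job in jobs}


def fits_fixed_cluster(jobs, total_nodes, cores_per_node):
    """True if every job could start at once on a fixed, shared cluster."""
    needed = sum(elastic_plan(jobs, cores_per_node).values())
    return needed <= total_nodes


# Hypothetical workload: names and sizes are made up for illustration.
jobs = [Job("genome-assembly", 96), Job("cfd-sweep", 200), Job("dl-training", 64)]
plan = elastic_plan(jobs, cores_per_node=32)
print(plan)
print(fits_fixed_cluster(jobs, total_nodes=8, cores_per_node=32))
```

On a fixed eight-node cluster, at least one of these jobs would have to wait in a queue; in the elastic case, each job simply requests the number of nodes in its own entry of the plan.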

Code as Infrastructure

Cloud computing also features the ability to build or assemble architectures or systems using only software (code), in which software serves as the template for provisioning hardware. Instead of having to assemble physical hardware in a specific location and manage such things as cabling, cabling labels, switch configuration, router software, and patching, HPC in the cloud allows the various components to be specified by writing a small amount of code, making it easy to expand or contract or even re-architect on-the-fly.

Code as infrastructure addresses the classic HPC problem of inflexible hardware and architecture. If a classic cluster architecture is needed, it can easily be created in the cloud. If a different application needs a Hadoop architecture or perhaps a Spark architecture, then those too can be created. Only the software changes.
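As a rough illustration of "only the software changes," the sketch below describes two different architectures as plain Python data. The class names, role names, and instance-type labels are hypothetical placeholders chosen for this example; they are not real instance types or the API of any provisioning tool. A real deployment would hand a description like this to whatever provisioning service is in use.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class NodeGroup:
    role: str
    count: int
    instance_type: str  # placeholder label, not a real instance type


@dataclass(frozen=True)
class Architecture:
    name: str
    groups: tuple


def classic_cluster(compute_nodes: int) -> Architecture:
    """A head node plus a pool of compute nodes."""
    return Architecture("classic-cluster", (
        NodeGroup("head", 1, "general"),
        NodeGroup("compute", compute_nodes, "compute-optimized"),
    ))


def spark_cluster(workers: int) -> Architecture:
    """A driver node plus a pool of worker nodes."""
    return Architecture("spark", (
        NodeGroup("driver", 1, "general"),
        NodeGroup("worker", workers, "memory-optimized"),
    ))


def render(arch: Architecture) -> dict:
    """Turn the description into a plain dict that a provisioning tool could consume."""
    return asdict(arch)


def total_nodes(arch: Architecture) -> int:
    """Total number of nodes across all groups."""
    return sum(group.count for group in arch.groups)


# Switching architectures, or resizing one, is a change to the code only.
for arch in (classic_cluster(16), spark_cluster(4), spark_cluster(40)):
    print(arch.name, total_nodes(arch), render(arch)["groups"])
```

Expanding, contracting, or re-architecting then amounts to calling a different function or passing a different argument, with no cabling or switch configuration to redo.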

Ability to Experiment

As HPC continues to evolve, new applications are being developed that take advantage of experimentation, testing, and iteration. These applications may involve new architectures or even re-thinking how the applications are written (re-interpretation). Having access to modular, fungible resources as a set of building blocks that can be configured and reconfigured as needed is crucial for this new approach.

This will become even more important as HPC moves forward, because the new wave of applications is heavily oriented toward massive data. Pattern recognition, machine learning, and deep learning are examples of these new applications. Being able to create new architectures will allow them to flourish and develop, based on the scale, flexibility, and corresponding economics of the cloud.

See how HPC is used for open data and scientific computing here: www.aws.amazon.com/scico and www.aws.amazon.com/opendata. Also check out Jeff’s previous blog post, The Evolution of High Performance Computing.