Organizations of all sizes, from large automotive and pharmaceutical firms to small financial and life sciences firms, face problems that require processing large amounts of information with highly parallel applications running on scalable computing infrastructure. Progress on these problems is often constrained by the infrastructure on hand or in budget, which is frequently insufficient for the capacity or timing needs of the project. Further complexities include allocating budget for additional capital expenditures, prioritizing existing compute resources among different projects, provisioning machines, allocating storage, and managing facilities and operations staff. These challenges commonly result in either a scarcity of compute resources or wasteful under-utilization of expensive resource investments.
Utilizing Amazon EC2 for these large computational problems can alleviate these challenges by providing access to elastic computing resources with the benefits of flexibility and cost efficiency. This allows organizations to:
Today, businesses and researchers with high performance computing requirements use Amazon Web Services (AWS) to run applications such as mapping genomes for scientific research, simulating aerospace and automotive designs for engineering, and mining data for business intelligence, among many other use cases.
Amazon EC2 provides resizable compute capacity in the cloud with the flexibility to choose from a number of different instance types to meet your computing needs. Each instance provides a predictable amount of dedicated compute capacity and is charged per instance-hour consumed.
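As a rough illustration of per instance-hour charging, the following minimal sketch estimates the charge for a job. The function name `estimate_cost` and the hourly rate are hypothetical and are not AWS list prices, and the sketch assumes that a partial hour is billed as a full instance-hour, which the text above does not state.

```python
import math


def estimate_cost(instances: int, hours_per_instance: float, hourly_rate: float) -> float:
    """Estimate the charge for a job billed per instance-hour.

    Assumption (not stated in the text): a partial hour is rounded up
    to a whole instance-hour before it is charged.
    """
    if instances < 0 or hours_per_instance < 0 or hourly_rate < 0:
        raise ValueError("all arguments must be non-negative")
    billable_hours = math.ceil(hours_per_instance)
    return instances * billable_hours * hourly_rate


if __name__ == "__main__":
    # Hypothetical rate of 2.0 per instance-hour; 2.5 hours rounds up to 3.
    print(estimate_cost(instances=8, hours_per_instance=2.5, hourly_rate=2.0))
```

Under these assumptions the cost scales linearly with both the number of instances and the billable hours, so running twice as many instances for half as many hours costs about the same.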
The Amazon EC2 Cluster Compute instance type is specifically designed to combine high compute performance with high performance networking to meet the needs of high performance computing (HPC) applications. Unique to Cluster Compute instances is the ability to group them into clusters for use with HPC applications. This is particularly valuable for applications that rely on protocols such as the Message Passing Interface (MPI) for tightly coupled inter-node communication.
Cluster Compute instances function just like other Amazon EC2 instances but also offer the following features for optimal performance with HPC applications:
The Cluster Compute instance family currently contains a single instance type, the Cluster Compute Quadruple Extra Large, with the following specifications:
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
Cluster Compute instances are available today for Linux operating system use in the US – N. Virginia Region. There is a default usage limit for this instance type of 8 instances (providing 64 cores). If you wish to run more than 8 instances, please complete the Amazon EC2 instance request form.
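The "8 instances (providing 64 cores)" figure follows from the specifications listed above: two quad-core processors per instance gives 8 cores, and 8 instances gives 64. The sketch below reproduces that arithmetic and the other aggregate figures at the default limit. The names `InstanceSpec` and `cluster_totals` are hypothetical helpers for illustration, not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceSpec:
    """Per-instance figures taken from the specification list above."""
    api_name: str
    memory_gb: float
    compute_units: float
    sockets: int
    cores_per_socket: int
    storage_gb: int

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket


# 2 x quad-core Intel Xeon X5570 -> 8 cores per instance.
CC1_4XLARGE = InstanceSpec("cc1.4xlarge", 23, 33.5, 2, 4, 1690)


def cluster_totals(spec: InstanceSpec, count: int) -> dict:
    """Aggregate capacity of `count` identical instances."""
    return {
        "cores": spec.cores * count,
        "memory_gb": spec.memory_gb * count,
        "compute_units": spec.compute_units * count,
        "storage_gb": spec.storage_gb * count,
    }


if __name__ == "__main__":
    # Default usage limit of 8 instances.
    print(cluster_totals(CC1_4XLARGE, 8))
```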
Cluster Compute instances require booting from an EBS-backed Amazon Machine Image (AMI) using Hardware Virtual Machine (HVM) virtualization. Information on how to create an HVM AMI and how to launch instances as a cluster can be found in the concepts sections of the Amazon EC2 Developer Guide and User Guide.
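As a minimal sketch of what launching instances as a cluster can look like, the code below builds the request parameters for a cluster launch. It assumes the boto3 SDK for Python, which the text above does not mention, and in the EC2 API the grouping is expressed as a placement group with the `cluster` strategy. The helper `build_cluster_launch`, the AMI ID, and the group name are hypothetical placeholders. The AMI you supply must be an EBS-backed HVM image, as described above.

```python
def build_cluster_launch(ami_id: str, group_name: str, count: int):
    """Build keyword arguments for a cluster launch (illustrative helper).

    Returns two dicts intended for boto3's EC2 client methods
    `create_placement_group` and `run_instances`, in that order.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    placement_group = {"GroupName": group_name, "Strategy": "cluster"}
    run_instances = {
        "ImageId": ami_id,               # must be an EBS-backed HVM AMI
        "InstanceType": "cc1.4xlarge",   # API name from the list above
        "MinCount": count,               # all-or-nothing: launch exactly `count`
        "MaxCount": count,
        "Placement": {"GroupName": group_name},
    }
    return placement_group, run_instances


if __name__ == "__main__":
    # "ami-EXAMPLE" and "hpc-demo" are placeholders, not real resources.
    pg, ri = build_cluster_launch("ami-EXAMPLE", "hpc-demo", 8)
    print(pg)
    print(ri)
    # With boto3 installed and credentials configured, these could be sent as:
    #   import boto3
    #   ec2 = boto3.client("ec2", region_name="us-east-1")
    #   ec2.create_placement_group(**pg)
    #   ec2.run_instances(**ri)
```

Setting `MinCount` equal to `MaxCount` is a design choice for tightly coupled jobs: the request either obtains the full cluster or launches nothing, rather than a partial set of nodes.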
Amazon EC2 makes it easy for you to build and customize Amazon Machine Images (AMIs) with the software you need. For HPC applications, you may want to consider using cluster configuration and management tools and application optimization tools from the following supporting solution providers:
Amazon Elastic MapReduce. Amazon Elastic MapReduce enables you to easily and cost-effectively process vast amounts of data using a hosted Hadoop framework. Amazon Elastic MapReduce is built upon the scalable infrastructure of Amazon EC2, making it easy to quickly provision as much or as little capacity as you like. Amazon Elastic MapReduce lets you focus on analyzing your data without having to worry about the time-consuming set-up, management and tuning of Hadoop clusters.
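For readers unfamiliar with the programming model that Hadoop implements, the following single-process sketch shows the map, shuffle, and reduce phases on a word count. It is not Hadoop code and does not use the Amazon Elastic MapReduce API. All function names are hypothetical, and a real cluster would distribute each phase across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle phase: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce phase: combine the values for one key into a total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```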
Public Data Sets. Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. By offering this important and useful data with cost-efficient services such as Amazon EC2, AWS hopes to provide researchers across a variety of disciplines and industries with tools to enable more innovation, more quickly.