AWS Big Data Blog

Running a High Performance SAS Grid Manager Cluster on AWS with Intel Cloud Edition for Lustre

Chris Keyser is a Solutions Architect for Amazon Web Services

This post was co-authored by Margaret Crevar, Sr. Manager, Performance Validation at SAS. SAS is an AWS Technology Partner.

SAS (www.sas.com) is an integrated environment designed for business and advanced data analytics by enterprise and government organizations. SAS and AWS recently performed testing using the Intel Cloud Edition for Lustre* Software – Global Support (HVM), available on AWS Marketplace, to determine how well a standard workload performs on AWS using SAS Grid Manager. You can find the detailed results in the SAS® Grid® Manager 9.4 Testing on AWS using Intel® Lustre whitepaper. In this post, we’ll take a look at an approach to scaling the underlying AWS infrastructure to run SAS Grid Manager that can also be applied to similar applications with demanding I/O requirements.

System Design Overview

High-performance workloads that are throughput-heavy and sensitive to network latency call for a different approach than typical applications. AWS generally recommends that applications span multiple Availability Zones for high availability. For latency-sensitive, high-throughput applications, however, traffic should stay local for optimal performance. To maximize throughput:

  • Run in a virtual private cloud (VPC), using instance types that support enhanced networking
  • Run instances in the same Availability Zone (they can be in multiple subnets)
  • Run instances within a placement group (see Placement Groups)
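
The three rules above can be expressed as launch parameters. The following is a minimal sketch, assuming the keyword names of the EC2 RunInstances call as exposed by boto3; the AMI ID, subnet ID, and placement group name are placeholders, not values from the testing. It uses a single subnet for simplicity, although instances can be in multiple subnets of the same Availability Zone.

```python
def grid_node_launch_params(ami_id, subnet_id, placement_group, count,
                            instance_type="i2.8xlarge"):
    """Build keyword arguments for an EC2 RunInstances request.

    All nodes share one subnet (and therefore one VPC and one Availability
    Zone) and one placement group. The IDs passed in are placeholders.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,  # a type that supports enhanced networking
        "MinCount": count,
        "MaxCount": count,
        "SubnetId": subnet_id,          # a subnet belongs to one VPC and one AZ
        "Placement": {"GroupName": placement_group},
    }


params = grid_node_launch_params(
    ami_id="ami-EXAMPLE",            # placeholder
    subnet_id="subnet-EXAMPLE",      # placeholder
    placement_group="sas-grid-pg",   # hypothetical name
    count=4,
)
# With boto3 installed and credentials configured, these could be passed as
# boto3.client("ec2").run_instances(**params). Nothing is launched here.
```

The placement group itself has to exist before instances are launched into it.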

The SAS Grid nodes in the cluster are i2.8xlarge instances. The 8xlarge instance size proportionally provides the best network performance to shared storage of any instance size, assuming minimal Amazon EBS traffic. On an 8xlarge, both EBS and other network traffic go onto the 10 gigabit network, whereas the capacity is split for smaller sizes. This lets you achieve disproportionately higher throughput on an 8xlarge when traffic is skewed, since the full 10 gigabits of capacity can be used to access the shared Lustre file system. The i2 instance also provides high-performance local storage, which is covered in more detail in the following section.

The choice of an 8xlarge size for the Lustre cluster has less of an impact than choosing 8xlarge for the SAS nodes because there is significant traffic to both EBS and the file system clients, although an 8xlarge is still the better choice. The Lustre file system caches data, so you will see higher throughput to clients when cache hits are frequent, because cache hits reduce the network traffic to EBS.

Intel provides several AWS CloudFormation templates for launching a Lustre cluster. It’s pretty amazing that you can have a file system as complex as Lustre running and available in 10 or 15 minutes using the template. The template creates a Lustre metadata server instance, a Lustre management server instance, a NAT instance, and one or more object storage server (OSS) instances. The data is served off the OSS nodes, making them the component that has the most influence over throughput.

The Intel Cloud Edition for Lustre solution adds specific features for AWS, such as automatically replacing failed instances. The Intel wiki has a lot of detail on the design of Lustre. We made a few adjustments for testing to the template available at ICEL – Global Support HVM: adding a placement group, and changing the type of instance for the OSS (storage server) to c3.8xlarge. The SAS® Grid® Manager 9.4 Testing on AWS using Intel® Lustre whitepaper outlines how to make the specific changes required.

Steps to Maximize Storage I/O Performance

SAS applications need high-speed, temporary storage. Typically, temporary storage has the most demanding load. The high I/O instance family I2, and the recently released dense storage instance family D2, provide high aggregate throughput to ephemeral (local) storage. The i2.8xlarge has 6.4 TB of local SSD storage, while the d2.8xlarge has 48 TB of local HDD storage.

The SAS workload tested used the I2, although you should be able to achieve similar performance with the D2. Ephemeral storage is perfect for temporary space, or for use in systems that replicate data (like Hadoop’s file system, HDFS). SAS applications use two temporary storage spaces, SASWORK and UTILLOC. We created RAID 0 volumes for these two locations. One of the considerations when dealing with ephemeral storage is that you lose its contents whenever you stop and then start an instance. We wanted to be able to shut down for cost savings, so we created a startup script that creates and mounts the ephemeral RAID volumes automatically on restart. For more information about the example scripts initdisk.sh and initeph.sh, see the SAS® Grid® Manager 9.4 Testing on AWS using Intel® Lustre whitepaper.
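
To make the startup step concrete, here is a rough sketch of the commands such a script might issue. It is not the initdisk.sh or initeph.sh script from the whitepaper. The device names, mount points, XFS file system, and the four-and-four split of the i2.8xlarge's eight local SSDs are all assumptions for illustration. The code only builds and prints the commands; it does not run them.

```python
def raid0_commands(md_device, member_devices, mount_point):
    """Return the commands (as argument lists) that stripe the member devices
    into one RAID 0 volume, create a file system on it, and mount it.
    Nothing is executed."""
    return [
        ["mdadm", "--create", md_device, "--level=0",
         f"--raid-devices={len(member_devices)}", *member_devices],
        ["mkfs.xfs", "-f", md_device],   # file system choice is an assumption
        ["mkdir", "-p", mount_point],
        ["mount", md_device, mount_point],
    ]


# Hypothetical device names for the eight local SSDs of an i2.8xlarge.
ephemeral = [f"/dev/xvd{letter}" for letter in "bcdefghi"]

# Hypothetical layout: half of the devices for each temporary space.
plan = {
    "/saswork": raid0_commands("/dev/md0", ephemeral[:4], "/saswork"),
    "/utilloc": raid0_commands("/dev/md1", ephemeral[4:], "/utilloc"),
}

for commands in plan.values():
    for command in commands:
        print(" ".join(command))
```

Because the contents of ephemeral storage do not survive a stop and start, a script like this has to recreate the arrays and file systems each time rather than only remounting them.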

For permanent storage on the Lustre cluster, you need to decide what type of EBS volume to use. For this type of workload, the choice is between the General Purpose (SSD) and Provisioned IOPS (SSD) volume types. SAS workloads tend to be large block sequential operations; General Purpose (SSD) volumes work well at a lower cost than Provisioned IOPS (SSD) volumes for this type of workload. The number of IOPS for General Purpose (SSD) is determined by the volume size (3 IOPS per GB), and the volume can burst above its allocated IOPS limit. You can overprovision storage to achieve better performance and, for sequential large block access, it will likely be cheaper to provision more storage than to use smaller volumes and pay for IOPS using Provisioned IOPS (SSD).
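
The sizing rule can be written out as a short calculation. This is a minimal sketch that uses only the 3 IOPS per GB figure quoted above; it ignores bursting and any per-volume minimums or maximums, so treat the results as baseline estimates.

```python
GP2_IOPS_PER_GB = 3  # baseline rate quoted in this post


def gp2_baseline_iops(size_gb):
    """Baseline IOPS of a General Purpose (SSD) volume of the given size."""
    return size_gb * GP2_IOPS_PER_GB


def gp2_size_for_iops(target_iops):
    """Smallest whole-GB volume size whose baseline meets the target IOPS."""
    return -(-target_iops // GP2_IOPS_PER_GB)  # ceiling division


iops_750 = gp2_baseline_iops(750)       # 2250 baseline IOPS
size_for_3000 = gp2_size_for_iops(3000) # 1000 GB
```

The second function is the overprovisioning idea in reverse: pick the IOPS you want, then size the volume to get it.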

A second consideration with EBS is the throughput of each volume. There is a limit of 160 megabytes per second for one volume. You need to create an adequate mix of storage size and volumes to achieve the total throughput needed. Bursting and performance are well documented in Amazon EBS Volume Performance on Linux Instances and Amazon EBS Volume Types. We chose eight 750 GB General Purpose (SSD) volumes per OSS for the testing.
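
The volume mix can be sanity-checked with simple arithmetic. The sketch below uses the 160 MB/sec per-volume limit and the eight 750 GB volumes mentioned above. It computes only the EBS-side ceiling; the instance's network link is a separate limit that is not modeled here.

```python
import math

VOLUME_THROUGHPUT_LIMIT_MB_S = 160  # per-volume limit quoted in this post


def aggregate_volume_throughput_mb_s(volume_count):
    """Upper bound on combined EBS throughput across striped volumes."""
    return volume_count * VOLUME_THROUGHPUT_LIMIT_MB_S


def volumes_needed(target_mb_s):
    """Minimum number of volumes whose combined limit meets the target."""
    return math.ceil(target_mb_s / VOLUME_THROUGHPUT_LIMIT_MB_S)


oss_ceiling = aggregate_volume_throughput_mb_s(8)  # 1280 MB/sec per OSS
oss_capacity_gb = 8 * 750                          # 6000 GB per OSS
volumes_for_1000 = volumes_needed(1000)            # 7 volumes
```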

Throughput Testing and Results

We wanted to achieve a throughput of at least 100 MB/sec/core to temporary storage, and 50-75 MB/sec/core to shared storage. The i2.8xlarge has 16 cores (32 virtual CPUs; each virtual CPU is a hyperthread, and each core has two hyperthreads). Doing the math, each instance needed at least 1.6 gigabytes per second to temporary storage, and 800 megabytes per second to shared storage, in order to use the compute power on the node fully and not be I/O bound.
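
The same math as a short calculation, using only the figures in the paragraph above (the function name is illustrative):

```python
VCPUS_PER_INSTANCE = 32   # i2.8xlarge
HYPERTHREADS_PER_CORE = 2

cores = VCPUS_PER_INSTANCE // HYPERTHREADS_PER_CORE  # 16 cores


def required_mb_s(core_count, mb_s_per_core):
    """Per-instance throughput needed to keep every core fed."""
    return core_count * mb_s_per_core


temp_required = required_mb_s(cores, 100)       # 1600 MB/sec = 1.6 GB/sec
shared_required_low = required_mb_s(cores, 50)  # 800 MB/sec
shared_required_high = required_mb_s(cores, 75) # 1200 MB/sec
```

The 800 MB/sec figure corresponds to the low end of the 50-75 MB/sec/core range; the high end would call for 1.2 GB/sec per instance.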

Testing results showed good processing efficiency and that the workload was not I/O bound. This means that the SAS workload was able to use the compute power fully for the 64 cores (4 i2.8xlarge instances x 16 cores per instance) under test.

Testing done with a lower-level testing tool (a SAS tool, iotest.sh) showed a throughput of about 3 GB/sec to temporary storage, or almost 200 MB/sec/core, and about 1.5 GB/sec to shared storage, or about 25 MB/sec/core (because the shared file system is used by all 64 cores) in this configuration. The shared storage figure does not take into account file system caching, which Lustre does well. We started with the 1.5 GB/sec configuration for our functional setup, and then found that we did not need to increase the throughput of the Lustre cluster beyond that to achieve good overall performance.
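
The per-core figures follow from dividing each measured rate by the number of cores that share it. A small sketch, assuming decimal units (1 GB = 1000 MB):

```python
MB_PER_GB = 1000  # assumption: decimal units


def per_core_mb_s(total_gb_s, core_count):
    """Throughput available to each core when core_count cores share it."""
    return total_gb_s * MB_PER_GB / core_count


# Temporary storage is local, so it is shared by the 16 cores of one instance.
temp_per_core = per_core_mb_s(3.0, 16)    # 187.5 MB/sec/core

# The Lustre file system is shared by all 64 cores in the grid.
shared_per_core = per_core_mb_s(1.5, 64)  # 23.4375 MB/sec/core
```

Temporary storage comfortably exceeds the 100 MB/sec/core target, while the uncached shared-storage rate sits below the 50 MB/sec/core target, which is why Lustre's caching matters for the overall result.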

Lustre could have been scaled up to achieve an uncached throughput of over 50 MB/sec/core by adding OSS nodes. In independent testing, Lustre scaled linearly using iotest.sh, doubling file system throughput to 3 GB/sec when increasing the number of OSS nodes to 6. For more information, see the Intel whitepaper Developing High-Performance, Scalable, cost effective storage solutions with Intel® Cloud Edition Lustre* and Amazon Web Services.
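
As a rough illustration of that scaling, the sketch below estimates how many OSS nodes a given per-core target would take. It rests on two assumptions: that scaling stays linear, and that the 1.5 GB/sec baseline came from 3 OSS nodes (inferred from the statement that going to 6 nodes doubled throughput to 3 GB/sec), which works out to about 500 MB/sec per OSS.

```python
import math

# Inferred, not stated in the post: 1.5 GB/sec across 3 OSS nodes.
PER_OSS_MB_S = 1500 / 3  # 500 MB/sec per OSS node


def oss_nodes_needed(target_mb_s_per_core, core_count, mb_s_per_oss=PER_OSS_MB_S):
    """OSS nodes needed for an uncached per-core target, assuming linear scaling."""
    return math.ceil(target_mb_s_per_core * core_count / mb_s_per_oss)


nodes_for_50 = oss_nodes_needed(50, 64)  # 3200 / 500 -> 7 nodes
nodes_for_75 = oss_nodes_needed(75, 64)  # 4800 / 500 -> 10 nodes
```

These counts are estimates under the stated assumptions, not measured results.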

This testing demonstrates that, with the right design choices, you can run demanding compute and I/O applications on AWS. For full details of the testing configuration and results, see the SAS® Grid® Manager 9.4 Testing on AWS using Intel® Lustre whitepaper.

If you have questions or suggestions, please leave a comment below.
