Create Custom AMIs and Push Updates to a Running Amazon EMR Cluster Using Amazon EC2 Systems Manager

Amazon EMR gives you complete control over your cluster, so you can customize it and install additional applications easily. EMR customers often use bootstrap actions to install and configure custom software in a cluster. However, bootstrap actions run only during cluster or node startup, which makes it difficult to change the configuration of a cluster that is already running.

EMR clusters can also use a custom Amazon Machine Image (AMI). With the new support for launching clusters with custom Amazon Linux AMIs, customizing an EMR cluster is now even easier. However, creating and managing custom AMIs becomes harder as the number of AMIs in your environment grows.

Amazon EC2 Systems Manager helps you automate various management tasks such as automating AMI creation or running a command or script across hundreds of instances. In this post, I show how Systems Manager Automation can be used to automate the creation and patching of custom Amazon Linux AMIs for EMR.

Systems Manager Run Command lets you remotely manage the configuration of Amazon EC2 instances or on-premises machines. Run Command can be used to help you perform the following types of tasks on your EMR cluster nodes: install applications, restart daemons (HDFS, YARN, Presto, etc.), and make configuration changes. I also show how you can use Run Command to send commands to all nodes of a running EMR cluster.

Benefits of using a custom AMI

Although you can easily customize an EMR cluster using bootstrap actions, there can be benefits to using a custom AMI.

- Reduction of cluster start time
  There are certain scenarios where a bootstrap action may affect your cluster start time. For example, your bootstrap action could be doing something like downloading a large program over the internet and delaying the time for your cluster to be ready. By adding and installing a program directly in the AMI, the time to complete a cluster launch may be reduced.
- Prevent unexpected bootstrap action failures
  There are also scenarios where installing and configuring custom software directly in the AMI reduces the risk of unexpected failures. For example, a mirror or repo used by your bootstrap action to download a program might be offline or inaccessible. This could cause your bootstrap action to fail, which could cause a cluster launch failure.
- Support for Amazon EBS root volume encryption
  A number of security and encryption features are available with EMR security configurations. This includes the ability to encrypt data at rest for HDFS (local volumes/Amazon EBS) and Amazon S3. However, certain regulatory/compliance policies may require that the root (boot) volume is also encrypted. By bringing your own Amazon Linux AMI, you can create AMIs that use encrypted EBS root volumes and use those AMIs for your EMR clusters.

Bring your own AMI requirements

Custom AMIs for EMR must meet the following requirements:

- Must be an Amazon Linux AMI
- Must be an HVM AMI
- Must be an EBS-backed AMI
- Must not have multiple EBS volumes
- Must be a 64-bit AMI
- Must not have users with the same name as applications (example: hadoop, hdfs, yarn, or spark)
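
Some of these requirements can be checked from the command line before you launch a cluster. The following is a minimal sketch, not an official validation tool: the function name check_emr_ami_fields is hypothetical, it assumes that x86_64 is the 64-bit architecture you care about, and it does not check the Amazon Linux or user-name requirements.

```shell
# Hypothetical helper (not part of the AWS CLI): checks values reported by
# "aws ec2 describe-images" against some of the EMR custom AMI requirements.
# Arguments: <virtualization-type> <root-device-type> <architecture> <ebs-volume-count>
check_emr_ami_fields() {
    status=0
    [ "$1" = "hvm" ]    || { echo "not an HVM AMI: $1" >&2; status=1; }
    [ "$2" = "ebs" ]    || { echo "not an EBS-backed AMI: $2" >&2; status=1; }
    # Assumption: x86_64 is treated as the only acceptable 64-bit architecture.
    [ "$3" = "x86_64" ] || { echo "not a 64-bit (x86_64) AMI: $3" >&2; status=1; }
    [ "$4" -le 1 ]      || { echo "multiple EBS volumes: $4" >&2; status=1; }
    return "$status"
}
```

The four values could come from a call such as aws ec2 describe-images --image-ids <AMI_ID> --query 'Images[0].[VirtualizationType,RootDeviceType,Architecture,length(BlockDeviceMappings[?Ebs])]' --output text --region us-east-1, which prints them on one line so that they can be passed to the function as separate arguments.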

It is not necessary for you to own the custom AMI, but your service role must have launch permissions. Therefore, the AMI should be one of the following:

- Owned by you
- A public AMI
- Shared with you by its owner

For best practices and considerations for EMR custom AMIs, see Using a Custom AMI.

Walkthrough

For the examples in this post, I show how you can set up the following solutions:

- Automate a workflow of creating custom AMIs with pre-installed software
- Run commands or make application configuration changes on all nodes of a running EMR cluster

Before you begin

The examples and steps in this post use the AWS CLI. However, the AWS CLI is not required; you can perform the same tasks in the AWS Management Console.

The region used for the examples is us-east-1 (N. Virginia).

Building a custom AMI with Systems Manager Automation

In this section, I show how you can use Automation to create a custom AMI. The following diagram shows an overview of the actions that the Automation will perform:

1) Configure roles for Automation

Before getting started, you have to configure an IAM instance profile role and a service role that Automation can use. The instance profile role gives Automation permission to perform actions on your instances, such as executing commands or starting and stopping services. The service role (or assume role) gives Automation permissions to perform actions on your behalf.

Configuring the required IAM roles for Automation is usually one of the hardest parts of setting up Automation. Luckily, you only do this step one time. We also have an AWS CloudFormation template that can be used to create and configure the required roles for Automation. For more information, see Method 1: Using AWS CloudFormation to Configure Roles for Automation.

To manually configure the roles for Automation, see Using IAM to Configure Roles for Automation.

2) Create a custom Automation document

An Automation document defines the actions that Systems Manager performs. In this step, you create a custom Automation document (customEmrAmiDocument) that performs the following steps:

- Launch an EC2 instance from a base Amazon Linux AMI
- Update installed software on the instance
- Run additional Linux commands (optional)
- Shut down the instance
- Create an AMI of the instance
- Terminate the instance

To create a custom Automation document, first download the customEmrAmiDocument.json document to your local machine. You can then use the console, AWS CLI, or AWS SDKs to create (upload) that Automation document in your account. The following example shows how to create an Automation document called “customEmrAmiDocument” using the AWS CLI:

$ aws ssm create-document --name "customEmrAmiDocument" --content file:///<PATH_TO>/customEmrAmiDocument.json --document-type Automation --region us-east-1

Note: Creating an Automation document does not cause that document to be executed. You execute this document in the next step. Also note that file:// must be referenced followed by the path of the content file.

For more information, see Creating an Automation Document.

3) Executing the custom Automation document

The “customEmrAmiDocument” Automation document created in the previous step has a list of parameters (SourceAmiId, InstanceIamRole, etc.), along with the description of each parameter. To describe the document parameters, run the following command:

$ aws ssm describe-document --name customEmrAmiDocument --query "Document.Parameters" --region us-east-1

The preceding command returns an output similar to the following:

[
    {
        "Type": "String",
        "Description": "(Required) The source Amazon Machine Image ID.",
        "Name": "SourceAmiId"
    },
    {
        "Type": "String",
        "Description": "(Required) The name of the role that enables Systems Manager (SSM) to manage the instance.",
        "DefaultValue": "ManagedInstanceProfile",
        "Name": "InstanceIamRole"
    },
…

When you start an Automation execution, you must pass the required parameters (SourceAmiId) along with any additional parameters for which you would like to overwrite the default value. For example, if you used CloudFormation to create the required IAM roles, you do not need to specify the InstanceIamRole and AutomationAssumeRole parameters.

To execute the document without including the InstanceIamRole and AutomationAssumeRole parameters, run the following command:

$ aws ssm start-automation-execution --document-name "customEmrAmiDocument" --parameters "SourceAmiId=<AMI_ID>, CustomCommands=[<List_of_linux_commands_to_run>]" --region us-east-1

If your role names or ARNs have different values than the defaults, make sure that you specify those parameters accordingly. For example, if your instance profile/role is called “MyManagedInstanceProfile” and the Automation service role ARN is “arn:aws:iam::012345678910:role/MyAutomationServiceRole”, then your parameters to execute the Automation should be similar to the following:

--parameters "SourceAmiId=<AMI_ID>, InstanceIamRole=MyManagedInstanceProfile, AutomationAssumeRole=arn:aws:iam::<ACCOUNT_ID>:role/MyAutomationServiceRole, InstanceType=<Instance_Type>, CustomCommands=[<List_of_linux_commands_to_run>]"

To start an Automation execution that creates a custom Amazon Linux AMI with Python 3.5 and additional Python libraries (boto3) installed, use the following command:

$ aws ssm start-automation-execution --document-name "customEmrAmiDocument" --parameters "SourceAmiId=ami-4fffc834, InstanceIamRole=<INSTANCE_PROFILE_NAME>, AutomationAssumeRole=arn:aws:iam::<ACCOUNT_ID>:role/<AUTOMATION_SERVICE_ROLE_NAME>, InstanceType=m3.large, CustomCommands=[yum -y install python35-devel python35-pip, /usr/bin/python35 -m pip install boto3]" --region us-east-1

I chose “ami-4fffc834” for the SourceAmiId parameter because it’s the latest Amazon Linux AMI in the us-east-1 (N. Virginia) region at the time of publication. It also meets all the requirements for EMR custom AMIs. If you’re running your Automation document in a different region, set the SourceAmiId parameter to an AMI that’s available in that region (for example, “ami-aa5ebdd2” for us-west-2).
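
Because the --parameters string is long and easy to mistype, it can help to assemble it from separate values. The following is a minimal sketch under that assumption; the function name build_automation_params is hypothetical, and the sketch does not handle Linux commands that themselves contain commas, because commas separate the items of the CustomCommands list.

```shell
# Hypothetical helper: assembles the comma-separated string passed to
# --parameters for the customEmrAmiDocument Automation document.
# $1 = source AMI ID, $2 = instance profile name,
# $3 = Automation service role ARN, $4 = instance type,
# remaining arguments = Linux commands to run on the instance.
build_automation_params() {
    ami="$1"; profile="$2"; role_arn="$3"; instance_type="$4"
    shift 4
    commands=""
    for cmd in "$@"; do
        # Join the commands with ", " to form the CustomCommands list.
        commands="${commands:+$commands, }$cmd"
    done
    printf 'SourceAmiId=%s, InstanceIamRole=%s, AutomationAssumeRole=%s, InstanceType=%s, CustomCommands=[%s]\n' \
        "$ami" "$profile" "$role_arn" "$instance_type" "$commands"
}
```

The result can then be passed to the CLI, for example with --parameters "$(build_automation_params ami-4fffc834 <INSTANCE_PROFILE_NAME> <AUTOMATION_SERVICE_ROLE_ARN> m3.large "yum -y install python35-devel python35-pip")".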

4) Finding details about the Automation execution

After the Automation execution is complete, you can view the steps that were executed in addition to the status of each step and their output. To view all Automation executions that used the “customEmrAmiDocument” document, you can run the following command:

$ aws ssm describe-automation-executions --query 'AutomationExecutionMetadataList[?DocumentName==`customEmrAmiDocument`]' --region us-east-1

To gather detailed information on a particular Automation execution, use the AutomationExecutionId parameter value returned in the preceding command:

$ aws ssm get-automation-execution --region us-east-1 --automation-execution-id <AutomationExecutionId>
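
An Automation execution that builds an AMI can take several minutes, so a script usually has to wait for it to finish. The following is a rough sketch of a polling loop. The function name wait_for_automation is hypothetical, and the sketch assumes that any status other than Pending or InProgress is final.

```shell
# Hypothetical helper: polls an Automation execution until it leaves the
# Pending/InProgress states, prints the final status, and returns success
# only if that status is "Success".
# $1 = AutomationExecutionId, $2 = seconds between polls (default 30)
wait_for_automation() {
    while :; do
        status=$(aws ssm get-automation-execution \
            --automation-execution-id "$1" \
            --query 'AutomationExecution.AutomationExecutionStatus' \
            --output text --region us-east-1) || return 1
        case "$status" in
            Pending|InProgress) sleep "${2:-30}" ;;
            *) printf '%s\n' "$status"; break ;;
        esac
    done
    [ "$status" = "Success" ]
}
```

For example, wait_for_automation <AutomationExecutionId> 60 checks the execution once a minute and can be used in an if statement to decide whether to continue with the cluster launch.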

The output of the preceding command contains details about each step executed by the Automation execution. To easily find the AMI ID/imageID of the AMI created during the Automation createImage step, run the following command:

$ aws ssm get-automation-execution --region us-east-1 --automation-execution-id <AutomationExecutionId> --query 'AutomationExecution.StepExecutions[?StepName==`createImage`]'
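
If you only need the AMI ID itself, for example to pass it to another command, the query can be narrowed further. The following sketch assumes that the createImage step reports the new AMI under Outputs.ImageId; check the output of the preceding command to confirm this for your document. The function name get_created_image_id is hypothetical.

```shell
# Hypothetical helper: prints the ID of the AMI created by the createImage
# step of an Automation execution.
# Assumption: the step stores the ID under Outputs.ImageId.
# $1 = AutomationExecutionId
get_created_image_id() {
    aws ssm get-automation-execution \
        --automation-execution-id "$1" \
        --query "AutomationExecution.StepExecutions[?StepName=='createImage'].Outputs.ImageId[]" \
        --output text --region us-east-1
}
```

The printed value can be captured in a variable, for example custom_ami_id=$(get_created_image_id <AutomationExecutionId>).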

If the Automation execution fails or stops before reaching the final instance-termination step, you might need to stop instances manually or disable services that were started during the Automation execution. See the Automation CLI walkthrough and the troubleshooting Systems Manager Automation guide for more information.

5) Launch an EMR cluster with a custom AMI

After completing the preceding steps, you should now have a custom Amazon Linux AMI that can be used for EMR. For more information, see Using a Custom AMI.

The following command can be used to launch an EMR cluster via the AWS CLI:

$ aws emr create-cluster --name "Cluster with My Custom AMI" --custom-ami-id <custom_AMI_ID> --ebs-root-volume-size 20 --release-label emr-5.8.0 --use-default-roles --instance-count 2 --instance-type m3.xlarge --ec2-attributes KeyName=<Your_ssh_key> --region us-east-1

For information about how to find the AMI ID of the custom AMI created by Automation, see step 4.

Using Run Command with EMR

In this section, I show how you can use Run Command to send commands to the nodes of a running EMR cluster. The following diagram shows an overview of a Run Command execution:

1) Configure the instance IAM role for Systems Manager

EC2 instances (EMR cluster nodes) need an IAM role to be able to communicate with the Systems Manager API. Because EMR already assigns an IAM role (usually called EMR_EC2_DefaultRole) to each cluster node, you can attach an additional managed policy (Systems Manager policy) to that role.

The following command attaches the “AmazonEC2RoleforSSM” managed policy to the EMR_EC2_DefaultRole role:

$ aws iam attach-role-policy --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM --role-name EMR_EC2_DefaultRole

If you’re not using the default EC2 role, replace the --role-name parameter value with the name of the role that you’re using.

For more information about configuring IAM roles and policies for Systems Manager, see Configuring Security Roles for Systems Manager.
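
To confirm that the policy is attached before you send any commands, you can list the managed policies on the role. The following is a minimal sketch; the function name role_has_ssm_policy is hypothetical, and it checks only for the AmazonEC2RoleforSSM policy name used in this post.

```shell
# Hypothetical helper: succeeds if the AmazonEC2RoleforSSM managed policy is
# attached to the given role.
# $1 = role name (defaults to EMR_EC2_DefaultRole)
role_has_ssm_policy() {
    attached=$(aws iam list-attached-role-policies \
        --role-name "${1:-EMR_EC2_DefaultRole}" \
        --query 'AttachedPolicies[].PolicyName' \
        --output text) || return 1
    # Text output separates names with tabs; normalize to spaces for matching.
    case " $(printf '%s' "$attached" | tr '\t\n' '  ') " in
        *" AmazonEC2RoleforSSM "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, role_has_ssm_policy MyCustomEmrEc2Role || echo "policy not attached" reports a role that still needs the policy.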

2) Install the SSM Agent

Skip this step if your custom AMI was created by Automation. The customEmrAmiDocument Automation document that you used to create the custom AMI installs the SSM agent by default.

The Systems Manager (SSM) agent processes Systems Manager requests and configures your instances as specified in each request. For more information, see Installing SSM Agent on Linux.

3) Running a command with Run Command

You should now be able to run commands or Linux scripts on the instances that have the SSM agent running and the IAM role for SSM configured (Step 1 in this section). To view a list of instances that are ready to receive commands, run the following command:

$ aws ssm describe-instance-information --output json --query "InstanceInformationList[*]" --region us-east-1
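
The preceding command lists every instance registered with Systems Manager, including instances whose agent is no longer responding. The following sketch narrows the list to instances whose PingStatus is Online and prints one instance ID per line. The function name list_online_instances is hypothetical.

```shell
# Hypothetical helper: prints the IDs of managed instances whose SSM agent
# currently reports a PingStatus of "Online", one ID per line.
list_online_instances() {
    aws ssm describe-instance-information \
        --query "InstanceInformationList[?PingStatus=='Online'].InstanceId" \
        --output text --region us-east-1 | tr '\t' '\n'
}
```

For example, list_online_instances | wc -l counts the instances that are ready to receive commands.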

The easiest way to send a command to all cluster nodes is by using a resource tag as the target for Run Command. If you didn’t add any tags to your EMR cluster during launch, you can add tags using the following command:

$ aws emr add-tags --resource-id <EMR_CLUSTER_ID> --tags environment="emr-ssm" --region us-east-1

The preceding command adds a tag to an EMR cluster. The key of the tag is “environment” and the value is “emr-ssm”. You can now send a command using the tags as the target:

$ aws ssm send-command --document-name "AWS-RunShellScript" --targets '{"Key":"tag:environment","Values":["emr-ssm"]}' --parameters '{"commands":["hostname -f","python3 -V"]}' --timeout-seconds 60 --region us-east-1

The preceding command is sent to, and executed on, all EC2 instances that have the tag environment=“emr-ssm”.
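
Writing the JSON for --parameters by hand becomes error-prone once the commands contain quotes. The following is a minimal sketch of a helper that builds that JSON from a list of commands. The function name build_commands_json is hypothetical, and the sketch escapes only backslashes and double quotes; commands that contain newlines or other control characters are not handled.

```shell
# Hypothetical helper: turns its arguments into the JSON document passed to
# --parameters for the AWS-RunShellScript document.
# Only backslashes and double quotes are escaped.
build_commands_json() {
    json=""
    for cmd in "$@"; do
        escaped=$(printf '%s' "$cmd" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
        # Append the quoted command, separated from earlier ones by a comma.
        json="${json:+$json,}\"$escaped\""
    done
    printf '{"commands":[%s]}\n' "$json"
}
```

The result can be passed directly to the CLI, for example with --parameters "$(build_commands_json "hostname -f" "python3 -V")".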

4) Finding details on a Run Command execution

The send-command call in the previous step runs two commands on each instance: one that shows the instance hostname (hostname -f) and one that shows its Python 3 version (python3 -V).

The send-command call returns a “CommandId” field in its output. You can use that command ID to gather information on the instances that the command was sent to and to view the status of the command execution:

$ aws ssm list-command-invocations --command-id <command_id> --region us-east-1
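
On a large cluster, it is often enough to know whether any node has not finished successfully. The following sketch filters the invocations down to those whose status is not Success, which includes invocations that are still pending or in progress. The function name all_invocations_succeeded is hypothetical.

```shell
# Hypothetical helper: succeeds only if every invocation of the command has
# the status "Success". Otherwise it prints the instance ID and status of
# each remaining invocation to standard error.
# $1 = command ID
all_invocations_succeeded() {
    remaining=$(aws ssm list-command-invocations --command-id "$1" \
        --query "CommandInvocations[?Status!='Success'].[InstanceId,Status]" \
        --output text --region us-east-1) || return 1
    if [ -n "$remaining" ] && [ "$remaining" != "None" ]; then
        printf '%s\n' "$remaining" >&2
        return 1
    fi
    return 0
}
```

For example, all_invocations_succeeded <command_id> && echo "all nodes updated" prints the message only when no node failed or is still running the command.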

You can also view the output of the commands (‘hostname -f’ and ‘python3 -V’ in this example) that Run Command executed on a specific EC2 instance:

$ aws ssm get-command-invocation --command-id <command_id> --instance-id <instance_id> --query "StandardOutputContent" --region us-east-1

The preceding command returns something similar to the following:

"ip-xxxxxxxxxx\nPython 3.5.1\n"

For more information about running commands and shell scripts with Run Command, see Systems Manager Run Command Walkthrough.

Conclusion

This post showed you some of the benefits of using custom AMIs for Amazon EMR and how you can use Automation to automate the management and creation of custom AMIs. I also showed how Run Command can be used to send commands and make configuration changes on all nodes of a running EMR cluster.

If you have questions or suggestions, please comment below.

Additional Reading

Learn how to run Jupyter Notebook and JupyterHub on Amazon EMR.

About the Author

Bruno Faria is an EMR Solution Architect with AWS. He works with our customers to provide them architectural guidance for running complex applications on Amazon EMR. In his spare time, he enjoys spending time with his family and learning about new big data solutions.

AWS Big Data Blog