AWS Cloud Operations & Migrations Blog

New features of Run Command: Copy to new, rerun, and CloudWatch Metrics

In this blog post, I cover new features of AWS Systems Manager Run Command that make deploying and testing automation at scale easier. AWS Systems Manager simplifies the task of managing infrastructure at scale. One of its key features is Run Command, which lets you automate common administrative tasks and perform configuration changes at scale.

As a quick review of how Run Command works: you execute AWS Systems Manager Documents that perform tasks on target instances, and you have full control of the automation that is executed. AWS Systems Manager also provides logging and auditability features that make the process robust and simple. See the Run Command documentation to fully understand how this process works.

In this article, I will cover three new features in AWS Systems Manager Run Command that make configuration management automation much simpler. These are:

  • Copy To New
  • Re-Run
  • CloudWatch Metrics for Run Command

Let’s go over each feature and what it can do.

Copy To New – This feature allows you to select a Run Command execution and copy all of the details of the execution, including all parameters and options, to a new execution. Once copied, you can modify the options, before submitting the command for execution on the target instances.

Re-Run – This feature allows you to select a Run Command execution from the commands list or command history, and run the same command without modifications. You are prompted to confirm the execution.

CloudWatch Metrics for Run Command – This feature publishes Run Command execution status information to CloudWatch as metric data points. This allows you to track the success and failure of your command executions. It also makes it possible to set up alarms based on failures.

Let’s walk through a scenario that demonstrates how all these features work. For the purpose of this example, I’m developing a new automation script to set up a development environment. I must install and configure Apache on a large set of actively running EC2 instances. After doing some research, I figured out that all I need to do is run the following shell commands on all the target instances.

sudo yum update -y

sudo yum install -y httpd

sudo systemctl start httpd

sudo systemctl enable httpd

To accomplish this using AWS Systems Manager, I can use the AWS-RunShellScript Document to execute these commands. This post does not cover all the steps required to do that; instructions on how to complete them can be found in the Remotely Run Commands on an EC2 Instance article.
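For readers who prefer scripting to the console, the same request can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch under stated assumptions, not the procedure from the referenced article: the function names, the comment text, and the instance ID in the usage note are hypothetical, and the call assumes credentials and a default region are already configured.

```python
from typing import Any, Dict

# The four shell commands from the scenario above.
APACHE_COMMANDS = [
    "sudo yum update -y",
    "sudo yum install -y httpd",
    "sudo systemctl start httpd",
    "sudo systemctl enable httpd",
]


def build_test_request(instance_id: str) -> Dict[str, Any]:
    """Build send_command arguments that target a single test instance."""
    return {
        "DocumentName": "AWS-RunShellScript",
        "InstanceIds": [instance_id],
        "Parameters": {"commands": list(APACHE_COMMANDS)},
        "Comment": "Install Apache (single-instance test)",  # hypothetical label
    }


def send(request: Dict[str, Any]) -> str:
    """Submit the request to Run Command and return the new command ID."""
    import boto3  # imported lazily so the builder above works without AWS access

    ssm = boto3.client("ssm")
    response = ssm.send_command(**request)
    return response["Command"]["CommandId"]
```

Calling `send(build_test_request("i-0123456789abcdef0"))` with a real instance ID in place of the placeholder would submit the command to that one instance.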

Typically, when I’m creating new automation for Run Command, I first target the execution on a single test instance to make sure that everything is working as expected. After running the command on one instance, I can see the result of that execution in the Command History list. To access that list, I open the AWS Management Console, click on Services, then go to Systems Manager. In the left pane I click on Run Command, then in the right pane I click on Command History. This screen shows a list of all the Run Command executions in the account.
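The same history can also be read programmatically. The sketch below assumes boto3 and configured credentials; `summarize_commands` and `fetch_recent_commands` are hypothetical helper names, and the output format is just one way to lay out the fields that `list_commands` returns.

```python
from typing import Any, Dict, List


def summarize_commands(commands: List[Dict[str, Any]]) -> List[str]:
    """Format one line per command: command ID, document name, and status."""
    return [
        f'{c["CommandId"]}  {c["DocumentName"]}  {c["Status"]}'
        for c in commands
    ]


def fetch_recent_commands(max_results: int = 10) -> List[Dict[str, Any]]:
    """Return recent Run Command executions in the account and region."""
    import boto3  # imported lazily so the formatter works without AWS access

    ssm = boto3.client("ssm")
    return ssm.list_commands(MaxResults=max_results)["Commands"]
```

Printing each line of `summarize_commands(fetch_recent_commands())` gives a quick check that the single-instance test finished with a `Success` status before moving on.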

Great, it looks like the test on one target instance executed successfully. That means we can move on to executing the task on the rest of the instances. Before the release of the Copy To New feature, I would have had to start a new Run Command execution and specify all the parameters again, including the new set of target instances. This may not be a significant issue for a simple script with few parameters, but it gets more difficult when you have a complex set of parameters. Now, with the Copy To New feature, you can save time because all the parameters are pre-populated for you.

To use Copy To New, click on the radio button to the left of the Command ID that executed successfully, then click on the Copy to new button. This displays the Run Command screen with all the same parameters pre-populated. For the next step, you just need to change the target instances that will execute the command.

In my example, all instances are tagged with the key-value pair Environment=Development. Since I now want to target the rest of the instances, after clicking Copy To New I define the targets for the Run Command execution by selecting the option “Specify instance tags” in the Targets section and entering that tag. Then, I click “Run” to complete the process. By using the Copy To New feature, I’ve simplified the process of executing automation at scale.
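Copy To New is a console feature, but the idea can be approximated in a script: read the earlier command, keep its document and parameters, and swap in new targets. The sketch below assumes boto3; `copy_to_new` and `submit_copy` are hypothetical names, and which fields get copied is my own choice rather than a description of what the console copies.

```python
import copy
from typing import Any, Dict, List

# Tag-based target for the scenario: every instance tagged Environment=Development.
DEVELOPMENT_TARGETS = [{"Key": "tag:Environment", "Values": ["Development"]}]


def copy_to_new(previous: Dict[str, Any],
                targets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Reuse a previous command's document and parameters with new targets."""
    request: Dict[str, Any] = {
        "DocumentName": previous["DocumentName"],
        # Deep copy so later edits to the new request leave the original intact.
        "Parameters": copy.deepcopy(previous.get("Parameters", {})),
        "Targets": targets,
    }
    if previous.get("Comment"):
        request["Comment"] = previous["Comment"]
    return request


def submit_copy(command_id: str) -> str:
    """Look up an earlier command by ID and resubmit it against the tag."""
    import boto3  # imported lazily so copy_to_new works without AWS access

    ssm = boto3.client("ssm")
    previous = ssm.list_commands(CommandId=command_id)["Commands"][0]
    request = copy_to_new(previous, DEVELOPMENT_TARGETS)
    return ssm.send_command(**request)["Command"]["CommandId"]
```

The explicit instance ID from the test run is dropped on purpose: the new request addresses instances only through the tag.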

Now, to demonstrate how the Re-Run feature works, let’s continue with our sample scenario. Let’s say the environment with all of its EC2 instances was rebuilt, and now you have 30 fresh instances that are still tagged the same way. You must execute the same set of commands to install Apache on them again. To do this, you could go through the process of creating a new Run Command execution and defining all the parameters. You could even use the Copy To New feature to simplify the process. However, since everything is the same, including the targets specified by tags, you can use the new Re-Run feature.

To use Re-Run, in the left pane of the AWS Systems Manager console, I click on Run Command, then look for the previous execution by clicking on Command History and locating it there. Then I select it by clicking on the radio button to the left of the command item.

Now, you can click on the “Rerun” button to execute the same automation. After clicking on the “Rerun” button, you are prompted to confirm that you want to run the command again. This triggers the execution using all the same parameters and the same targets defined by the tags. As you can see, this new approach greatly simplifies targeting the newly rebuilt environment.
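A scripted approximation of Rerun resubmits a previous command exactly as it was. The sketch below assumes boto3; `rerun_request` and `rerun` are hypothetical names, and this illustrates the idea rather than how the console implements the button. Because the earlier command was addressed by tag, the resubmitted request resolves to whatever instances carry that tag now, which is why it reaches the rebuilt instances.

```python
from typing import Any, Dict


def rerun_request(previous: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild send_command arguments from a previous command, unchanged."""
    request: Dict[str, Any] = {
        "DocumentName": previous["DocumentName"],
        "Parameters": previous.get("Parameters", {}),
    }
    # A command is addressed either by tag-based Targets or by explicit IDs;
    # pass along whichever one the previous command used.
    if previous.get("Targets"):
        request["Targets"] = previous["Targets"]
    else:
        request["InstanceIds"] = previous["InstanceIds"]
    return request


def rerun(command_id: str) -> str:
    """Look up a command by ID, resubmit it, and return the new command ID."""
    import boto3  # imported lazily so rerun_request works without AWS access

    ssm = boto3.client("ssm")
    previous = ssm.list_commands(CommandId=command_id)["Commands"][0]
    return ssm.send_command(**rerun_request(previous))["Command"]["CommandId"]
```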

The last feature I want to cover is CloudWatch Metrics for Run Command, which is important for detecting execution failures, especially when you are executing Run Command on a large set of instances. For example, a large number of failures could indicate problems with the automation or with the target instances. This new feature tracks the success or failure of your Run Command executions and sends the results as metric data to CloudWatch, which is a managed metrics and logging service. Once the data is there, you can use the Run Command metrics to create dashboards, generate alarms, detect anomalies, and even trigger automation based on an alarm.

To look at the metrics exposed by Run Command, open the Amazon CloudWatch console and click on “Metrics” in the left pane. In the right pane, locate the “SSM Run Command” link and click on it. Then, click on “Across All Commands”. You can now select any of the following available metrics:

  • CommandsDeliveryTimedOut
  • CommandsFailed
  • CommandsSucceeded

For the purposes of our example, we could create an alarm that fires if the CommandsFailed metric goes above 1. We could then use that alarm to either send an email or trigger automation to address the root cause of the failure. See the Amazon CloudWatch documentation for details on how to create alarms.
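As a rough illustration, such an alarm could be defined with boto3 as follows. Several values here are assumptions rather than details from this post: the namespace `AWS/SSM-RunCommand` is what I understand Run Command publishes under (confirm it in the CloudWatch console), the alarm name, the five-minute period, and the `Sum` statistic are illustrative choices, and the SNS topic ARN is a placeholder for a topic you would create yourself.

```python
from typing import Any, Dict


def build_failure_alarm(sns_topic_arn: str) -> Dict[str, Any]:
    """Build put_metric_alarm arguments for 'CommandsFailed goes above 1'."""
    return {
        "AlarmName": "RunCommandFailures",   # hypothetical name
        "Namespace": "AWS/SSM-RunCommand",   # assumed namespace, verify first
        "MetricName": "CommandsFailed",
        "Statistic": "Sum",                  # total failures per period
        "Period": 300,                       # seconds; illustrative choice
        "EvaluationPeriods": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanThreshold",
        # Periods with no command activity should not trip the alarm.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],     # for example, an email topic
    }


def create_failure_alarm(sns_topic_arn: str) -> None:
    """Create or update the alarm in the current account and region."""
    import boto3  # imported lazily so the builder works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_failure_alarm(sns_topic_arn))
```

With `GreaterThanThreshold` and a threshold of 1, the alarm fires once two or more commands fail within a single period; lower the threshold to 0 if a single failure should notify you.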

To summarize, we’ve considered three new features of AWS Systems Manager that simplify deployment at scale. These new features make it easy to execute automation quickly and also track the success of automation execution. To get started with the new Run Command features, go to the Run Command documentation.

About the Author

Andres Silva is a Principal Technical Account Manager for AWS Enterprise Support. He has been working with AWS technology for more than 9 years. Andres works with Enterprise customers to design, implement and support complex cloud infrastructures. When he is not building cloud automation he enjoys skateboarding with his 2 kids.