AWS Cloud Operations & Migrations Blog

Using AWS Systems Manager OpsCenter and AWS Config for compliance monitoring

In this post, I show how AWS Systems Manager OpsCenter can be used to centrally record and mitigate alerts from AWS Config. When AWS Config detects a resource that is out of compliance, an OpsItem is created. This OpsItem is used to track details of the noncompliant resource, record investigative actions, and provide access to consistent remediation actions. It also provides an immutable record of all actions that can be used for audit purposes.

With OpsCenter, you have a central location to view current issues, a historical record of issues, and a list of actions taken to remedy the issue.  This is useful when you have policies and guardrails that you want multiple teams to follow and you need automation to scale the process.

What is AWS Config?

AWS Config is an AWS service that is used to assess, audit, and evaluate the configurations of your AWS resources. One of its popular features is the ability to define rules that continuously scan your AWS resources for compliance with your internal guidelines. When a noncompliant resource is detected, an alert is sent, typically to an Amazon Simple Notification Service (Amazon SNS) topic.

AWS Config allows you to remediate noncompliant resources using AWS Systems Manager Automation documents. These documents define the actions to be performed on noncompliant resources evaluated by AWS Config rules. AWS Config provides a set of managed automation documents or you can create your own document to meet operational requirements. To apply remediation on noncompliant resources, you choose the remediation action from a prepopulated list or select your own document. You can do this using the AWS Config console or AWS Config API operations.

Using an Automation document allows for consistent remediation of events. Take, for instance, the case of public S3 buckets. You can configure an AWS Config rule to scan for buckets that allow public reads. When it finds a bucket, it can invoke an Automation document to disable public read and raise an alert through Amazon SNS.

However, there are some scenarios where you might want to manually investigate a noncompliant resource. Imagine a scenario where you build and maintain a collection of Amazon Machine Images (AMIs). Because these AMIs have been validated, you want only these AMIs to be used for Amazon Elastic Compute Cloud (Amazon EC2) instances. If someone creates an instance using an unauthorized AMI, you want to stop or shut down the instance and have the owner correct the issue.

In this scenario, you might want to investigate the situation before remediation. For example, the instance might be part of an Auto Scaling group for which someone has incorrectly configured the launch configuration. Automatically terminating the instance would cause the Auto Scaling group to add a replacement instance that uses the same unauthorized AMI. Auto-remediation would shut down that replacement, the Auto Scaling group would provision another one, and the cycle would repeat in a loop.

In this scenario, manual intervention would identify the situation and prevent the loop from occurring. You might also have a production server outside an Auto Scaling group, where auto-terminating the instance would cause problems for end users.

This post describes how to create an AWS Config rule to identify EC2 instances that are using unauthorized AMIs. I also show how OpsCenter can be used to track, investigate, and remediate the issue.

 

Creating an AWS Config rule

To get started, open the AWS Config console. If this is your first time using AWS Config in your Region, choose the Get Started button. If you have already used AWS Config, from the left pane, choose Rules, and then choose Add rule.

The Rules page shows that no rules were found. The page includes a button for adding a rule.

Figure 1: Rules page of the AWS Config console

 

AWS Config creates and maintains a set of rules that meet common scenarios and risks that customers experience on a regular basis.  If you have a specific scenario that is not covered by an AWS managed rule, you can create your own custom rule.

Select rule type

In the AWS Config console, on Specify rule type, select Add AWS managed rule. In the search box, enter approved, select approved-amis-by-id, and then choose Next.

The selections are filtered by approved. There are two options: approved-amis-by-id and approved-amis-by-tag.

Figure 2: Add rule type page of the AWS Config console.

 

Customize your rule

On the next page, you can enter a name and description for the rule.

The rule name is approved-amis-by-id. The description is “Checks whether running instances are using specified AMIs. Specify a list of approved AMI IDs. Running instances with AMIs that are not on this list are noncompliant.”

Figure 3: The Customize rule section with Name and Description boxes

 

In the Trigger section, specify details for the rule, including trigger type, scope of changes, and resources. Because this rule applies to EC2 instances only, I’ve specified that the rule only checks EC2 instances. In addition, the rule runs only when the EC2 configuration changes.

Under Trigger type, Configuration changes is selected. Under Scope of changes, Resources is selected. Under Resources, EC2 Instance has been added. Resource identifier has been left blank.

Figure 4: Trigger section of the AWS Config console.

 

Next, enter the list of authorized AMI IDs. You can enter multiple AMIs by using a comma-separated list.

Under Key, amiIds is entered. Under Value, ami-0962330670840f9b8 is entered.

Figure 5: Parameters section of the AWS Config console

 

Finally, leave Auto remediation set to No, because remediation is done through the OpsItem.

The Remediation action is set to Remediation action, Auto remediation is set to No, Rate Limits are not specified, Resource ID parameter is set to n/a.

Figure 6: Remediation action for AWS Config rule
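
If you prefer to script this step, the console settings above can be approximated in Python. This is a minimal sketch, not part of the original walkthrough: it assumes you pass in a boto3 AWS Config client, and the function names build_approved_amis_rule and create_rule are hypothetical. APPROVED_AMIS_BY_ID is the managed rule identifier that corresponds to approved-amis-by-id.

```python
import json


def build_approved_amis_rule(ami_ids, rule_name="approved-amis-by-id"):
    """Build the ConfigRule structure for the managed rule (hypothetical helper)."""
    return {
        "ConfigRuleName": rule_name,
        "Description": "Checks whether running instances are using specified AMIs.",
        # Restrict evaluation to EC2 instances, as in the Trigger section.
        "Scope": {"ComplianceResourceTypes": ["AWS::EC2::Instance"]},
        # Owner "AWS" selects an AWS managed rule.
        "Source": {"Owner": "AWS", "SourceIdentifier": "APPROVED_AMIS_BY_ID"},
        # Rule parameters are passed as a JSON string; amiIds is comma-separated.
        "InputParameters": json.dumps({"amiIds": ",".join(ami_ids)}),
    }


def create_rule(config_client, ami_ids):
    """Create or update the rule. config_client is assumed to be boto3.client("config")."""
    config_client.put_config_rule(ConfigRule=build_approved_amis_rule(ami_ids))
```

For example, create_rule(boto3.client("config"), ["ami-0962330670840f9b8"]) would register the rule with the single approved AMI used in this post. No remediation configuration is attached, which matches leaving Auto remediation set to No.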

 

Viewing a list of rules

The Rules page now lists the new rule. After a few minutes, the console displays its compliance status.

Under Rules, the Compliance column shows that the approved-amis-by-id rule has three noncompliant resources.

Figure 7: Rules page of the AWS Config console.

 

Create an Amazon EventBridge rule

The next step is to use Amazon EventBridge to monitor AWS Config for noncompliant resources. Amazon EventBridge is the successor to Amazon CloudWatch Events and provides a near real-time stream of system events from many AWS services and SaaS applications. You create an Amazon EventBridge rule that connects to a specified source system and receives an event. The event can be transformed before delivery to the target system.

Amazon EventBridge provisions all required resources for communication, filters and transforms the event, and provides at-least-once delivery to the target system. It typically takes half a second for Amazon EventBridge to receive an event and transmit it to the target system.

I show you how to create an Amazon EventBridge rule that receives events from AWS Config, transforms them to OpsItems, and transmits the events to OpsCenter as the destination target.  This sounds complex, but is simple to configure.  The first step is to open the Amazon EventBridge console. Choose Create Rule, and then enter a rule name and description.

The rule name is Config-Authorized-EC2-Instances. The description is “Reports on EC2 instances that are using unauthorized AMIs.”

Figure 8: Create rule page in the Amazon EventBridge console.

 

Specify event pattern

In Define pattern, select Event pattern, and then select Pre-defined pattern by service. Under Service provider, choose AWS.

On Define pattern, Event pattern is selected. Under Event matching pattern, Pre-defined pattern by service is selected. Under Service provider, AWS is selected.

Figure 9: Define pattern page in the Amazon EventBridge console.

 

Filter incoming events

AWS Config generates a large number of events. Without filters, you can quickly be overwhelmed by OpsItems. The console allows you to filter by message type, rule name, resource type, and specific resource. As you choose these options, the console builds an event pattern that is used to filter the incoming events.

Figure 10 shows the options for processing a rule. Under Event type, Config Rules Compliance Change is selected for a rule named approved-amis-by-id. Any resource type and Any resource ID are selected. (The resource, in this case, is an EC2 instance.)

Filtering selections for the AWS Config rule named approved-amis-by-id. Under Event matching pattern, Pre-defined pattern by service is selected. Under Event type, Config Rules Compliance Change is selected. The Any resource type and Any resource ID options are also selected.

Figure 10: Detailed filtering for an Amazon EventBridge rule

 

Build a custom filter

If you manually build the event pattern, you can apply filtering that isn’t possible with the default options available in the console. The following example only accepts events from the AWS Config rule named approved-amis-by-id where the EC2 instance is NON_COMPLIANT. This creates OpsItems for noncompliant resources and filters out events for compliant resources.

{
    "source": [
        "aws.config"
    ],
    "detail-type": [
        "Config Rules Compliance Change"
    ],
    "detail": {
        "configRuleName": [
            "approved-amis-by-id"
        ],
        "newEvaluationResult": {
            "complianceType": [
                "NON_COMPLIANT"
            ]
        }
    }
}
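
To see what this pattern accepts and rejects, the following Python sketch applies a deliberately simplified version of the matching rules: every key in the pattern must exist in the event, nested objects are compared recursively, and a leaf matches when the event value is one of the listed values. The matches function and the sample events are hypothetical, and EventBridge itself supports more operators than this sketch covers.

```python
def matches(pattern, event):
    """Return True if the event satisfies an exact-value event pattern (simplified)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event must contain a matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of accepted values.
            return False
    return True


PATTERN = {
    "source": ["aws.config"],
    "detail-type": ["Config Rules Compliance Change"],
    "detail": {
        "configRuleName": ["approved-amis-by-id"],
        "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
    },
}


def sample_event(compliance_type):
    """Build a trimmed-down, illustrative event with only the fields the pattern reads."""
    return {
        "source": "aws.config",
        "detail-type": "Config Rules Compliance Change",
        "detail": {
            "configRuleName": "approved-amis-by-id",
            "newEvaluationResult": {"complianceType": compliance_type},
        },
    }
```

With these definitions, matches(PATTERN, sample_event("NON_COMPLIANT")) is True, while the same call with "COMPLIANT" is False, so only noncompliant evaluations would go on to create OpsItems.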

Create an input transformer for the OpsItem target

Amazon EventBridge transforms the incoming event into an outgoing event using default mappings. However, in some situations you may want to create your own mapping using an input transformer.

By using the input transformer, you can take advantage of the deduplication logic of the CreateOpsItem API. If you specify a deduplication string, the built-in logic creates and stores a hash based on the deduplication string and the resource that triggered the OpsItem. If a matching hash is found, a new OpsItem is not created. With this feature enabled, you have only a single OpsItem for each noncompliant instance.
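
The following Python sketch is a conceptual model of that behavior only. The real hash function and storage used by OpsCenter are not described in this post, so the SHA-256 key, the OpsItemStore class, and its create method are all assumptions made for illustration.

```python
import hashlib


class OpsItemStore:
    """Toy in-memory stand-in for deduplicated OpsItem creation (hypothetical)."""

    def __init__(self):
        self.open_items = {}

    def create(self, dedup_string, resource_id, title):
        """Return a key for a new item, or None if an equivalent item already exists."""
        # Assumed scheme: hash the dedup string together with the resource ID.
        key = hashlib.sha256(f"{dedup_string}|{resource_id}".encode()).hexdigest()
        if key in self.open_items:
            return None  # Matching hash found, so no new item is created.
        self.open_items[key] = title
        return key
```

In this model, repeated events for the same instance collapse into one item, while an event for a different instance still produces its own item because the resource ID is part of the hash input.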

Under Select event bus, choose AWS default event bus. Under Select target, choose SSM OpsItem. Select the Create a new role for this specific resource option.

Under Select event bus, AWS default event bus is selected. Under Target, SSM OpsItem is selected. The Create a new role for this specific resource option is selected. The role name is Amazon_EventBridge_Invoke_Create_Ops_Item_910185980.

Figure 11: Selecting targets for the Amazon EventBridge rule

 

If you expand the area under Target, you can use the boxes under Input transformer to specify the input path and the input template.

On the Select targets page, Input transformer is selected. There are boxes for entering the input path and input template.

Figure 12: Select targets page where you enter the input path and input template

 

How to transform the event

The input path is where you reference the elements from the original event. The input template is where you specify the elements of the new event. In this case, the input path references elements of the AWS Config event and the input template includes elements for the OpsItem. If you are interested in extending these examples, you can read more about how to transform target input.

The following text is used for the input path:

{
    "detail":"$.detail",
    "resources":"$.resources",
    "resourceType": "$.detail.resourceType",
    "resourceId": "$.detail.resourceId",
    "configRuleName": "$.detail.configRuleName",
    "complianceType": "$.detail.newEvaluationResult.complianceType"
}

The following text is used for the input template. It creates an OpsItem with a priority of 2 and severity of 1. Here are some other elements to note:

  • The operationalData element adds searchable context to the OpsItem.
  • The /aws/automations element allows you to associate pre-existing runbooks with this OpsItem.
  • The dedupe process creates a hash using the /aws/dedup string and the resourceId, which is the EC2 instance ID.
  • The other elements of OperationalData are used to provide context information for this rule.
{
    "title":"EC2 Instance is running an unauthorized AMI",
    "description":"An AWS Config rule detected that an EC2 instance is running an unauthorized AMI.",
    "source":"Config Compliance",
    "category":"Availability",
    "priority":"2",
    "severity":"1",
    "resources": <resources>,
    "detail": <detail>,
    "operationalData":{
        "/aws/automations":{
            "value":"[ { \"automationType\": \"AWS:SSM:Automation\", \"automationId\": \"AWS-TerminateEC2Instance\" }, { \"automationType\": \"AWS:SSM:Automation\", \"automationId\": \"AWS-StopEC2Instance\" } ]"
        },
        "/aws/dedup":{
            "type":"SearchableString",
            "value":"{\"dedupString\":\"Config-Compliance-EC2-Authorized-AMI\"}"
        },
        "complianceType": {"type": "SearchableString", "value": <complianceType>},
        "configRuleName": {"type": "SearchableString","value": <configRuleName>},
        "resourceType": {"type": "SearchableString","value": <resourceType>},
        "resourceId": {"type": "SearchableString","value": <resourceId>}
    }
}
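
To make the relationship between the input path and the input template concrete, here is a small Python sketch of the substitution. It is a simplification under stated assumptions: it handles only dot-notation paths such as $.detail.resourceId, and it replaces each unquoted <name> placeholder with the JSON encoding of the extracted value. The resolve and transform functions are hypothetical names and do not reproduce every rule of the real service.

```python
import json
import re


def resolve(path, event):
    """Resolve a simple '$.a.b' path (dot notation only) against an event dict."""
    value = event
    for part in path.split(".")[1:]:  # Skip the leading "$".
        value = value[part]
    return value


def transform(input_paths, template, event):
    """Fill <name> placeholders in the template with values extracted from the event."""
    values = {name: resolve(path, event) for name, path in input_paths.items()}
    # Strings gain quotes and objects become JSON, so the result stays valid JSON.
    return re.sub(r"<(\w+)>", lambda m: json.dumps(values[m.group(1)]), template)
```

Passing the input path and input template shown above, along with a sample AWS Config event, would yield the JSON document that is sent to the OpsItem target, with <resourceId> replaced by the quoted instance ID and <detail> replaced by the whole detail object.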

 

Create an Amazon EventBridge rule

Under Input transformer, paste the input path and the input template into their boxes, and then create the rule.

The box under Input transformer contains values for the EC2 instance is running an unauthorized AMI rule. These are values for title, description, source, category, priority, severity, resources, detail, and operationalData.

Figure 13: Detailed transformation for the Amazon EventBridge rule

 

Congratulations!

You have set up AWS Config to scan for EC2 instances that are using unauthorized AMIs. When AWS Config detects a noncompliant instance, it creates an event that is received by Amazon EventBridge. An Amazon EventBridge rule creates and sends an OpsItem to OpsCenter. OpsCenter deduplicates the events, so you have a single open OpsItem for each noncompliant instance.

Using OpsCenter in AWS Systems Manager

OpsCenter in AWS Systems Manager helps you to view, investigate, and resolve operational issues related to your AWS and hybrid cloud deployments. OpsCenter uses OpsItems to present operational issues in a standardized view. The OpsItems provide operational details and contextual information to help you quickly diagnose and remediate the source issue.

Under OpsItems by source and age, Config Compliance displays two OpsItems created in the last 0-30 days.

Figure 14: OpsItems by source and age section of the AWS Systems Manager console
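
The same list can also be retrieved programmatically. The sketch below assumes a boto3 Systems Manager client is passed in and filters on the source value set in the input template (Config Compliance). The function name open_config_ops_items is hypothetical, and pagination is omitted for brevity.

```python
def open_config_ops_items(ssm_client):
    """List (OpsItemId, Title) pairs for open OpsItems created from AWS Config events.

    ssm_client is assumed to be boto3.client("ssm"). Only the first page is read.
    """
    response = ssm_client.describe_ops_items(
        OpsItemFilters=[
            {"Key": "Source", "Values": ["Config Compliance"], "Operator": "Equal"},
            {"Key": "Status", "Values": ["Open"], "Operator": "Equal"},
        ]
    )
    return [(item["OpsItemId"], item["Title"]) for item in response["OpsItemSummaries"]]
```

A script like this could feed a report or dashboard that tracks how many noncompliant instances still have an open OpsItem.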

 

Viewing OpsItem details

Figure 15 shows the OpsItem that is created for a noncompliant EC2 instance. The Overview tab displays the OpsItem details, including elements from the input template. Note the values displayed for Deduplication string, Priority, and Severity. It is not shown on the console, but OpsCenter uses the Deduplication string plus the EC2 instance ID to create the dedupe hash.

The OpsItem details include the description, OpsItem ID, title, status, source, created and last updated dates, deduplication string, severity, priority, and category.

Figure 15: View details of OpsItem for noncompliant EC2 instance

 

The Related resources section displays details about the AWS resource that triggered the AWS Config rule. As expected, there is an EC2 instance. The second entry under Resource ARN is the AWS Config rule. You can choose the resource ARN to view details such as Amazon CloudWatch metrics. You can also navigate to the Amazon EC2 console to view instance details.

Under Resource ARN, there are entries for a noncompliant EC2 instance and the triggering AWS Config rule.

Figure 16: Related resources section of the console

 

Scrolling down, you can find a list of Automation runbooks that allow you to take consistent actions on AWS resources. You can find the runbooks that were specified in the input template.  You can use runbooks created by AWS or create your own to meet your operational requirements.

The runbooks specified in the input transformer, AWS-TerminateEC2Instance and AWS-StopEC2Instance, appear in the Runbooks section of the console.

Figure 17: Runbooks section of the console

 

The Operational data section provides context information on the item and includes elements that were specified in the input transformer. Under complianceType, you can find that the EC2 instance is NON_COMPLIANT.

The Operational data section shows complianceType (NON_COMPLIANT), configRuleName (approved-amis-by-id), resourceId, and resourceType (AWS::EC2::Instance).

Figure 18: Operational detail added to OpsItem by Amazon EventBridge transformer

 

Investigate an issue

To investigate, choose the Related resource link for the instance and open Resource description.  You can find that an unauthorized AMI is being used and that the instance is running.

Related resource details tab for the EC2 instance running an unauthorized AMI includes the instance ID, instance type, architecture, Availability Zone, image ID, and state.

Figure 19: Details for the noncompliant EC2 instance

 

Remediating an issue

Now that you’ve confirmed the noncompliance of the instance and viewed other relevant details, you are ready to act. The action is to use a predefined runbook to turn off the instance. Using automation helps avoid manual errors and inconsistencies in remediation efforts.

Return to the Overview section where you can find a predefined runbook named AWS-StopEC2Instance. Select the runbook, and then choose Execute.

The AWS-StopEC2Instance runbook is selected on the Runbooks page of the console.

Figure 20: Runbooks page of the console
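
For comparison, the same runbook can be started through the Systems Manager API. This is a minimal sketch that assumes a boto3 Systems Manager client is passed in; the function name stop_instance is hypothetical. Note the assumption that a direct API call like this one is not tied to the OpsItem in the way that running the runbook from the OpsItem page is.

```python
def stop_instance(ssm_client, instance_id):
    """Start the AWS-StopEC2Instance runbook and return the execution ID.

    ssm_client is assumed to be boto3.client("ssm").
    """
    response = ssm_client.start_automation_execution(
        DocumentName="AWS-StopEC2Instance",
        # The runbook takes a list of instance IDs in its InstanceId parameter.
        Parameters={"InstanceId": [instance_id]},
    )
    return response["AutomationExecutionId"]
```

The returned execution ID can be used to look up the status of the run later.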

 

To check the progress of the runbook, choose the latest result. In addition to execution details, there is also a Save to operational data button. This provides an audit trail of remediation activity.

Under Latest automation results for AWS-StopEC2Instance, there is a summary of runbook status. The summary shows execution start and end times, execution status (Success), and runtimes.

Figure 21: Latest automation results for AWS-StopEC2Instance

 

Confirming the issue has been remediated

After the execution of the runbook is complete, open Related Resource. Under State, you can find that stopped is displayed.

Resource description section shows the instance ID, Availability Zone, image ID, instance type, state (stopped), and architecture.

Figure 22: Resource description details
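
If you want to confirm the state outside the console, a short check against the Amazon EC2 API works as well. This sketch assumes a boto3 EC2 client is passed in, and instance_state is a hypothetical helper name.

```python
def instance_state(ec2_client, instance_id):
    """Return the current state name (for example, "stopped") of one instance.

    ec2_client is assumed to be boto3.client("ec2").
    """
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    # A single instance ID yields one reservation containing one instance.
    return response["Reservations"][0]["Instances"][0]["State"]["Name"]
```

After the runbook finishes, this function should return stopped for the remediated instance.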

 

Conclusion

You have learned how to detect noncompliant resources in your AWS environment and now have a process for consistently investigating and remediating risks. I showed you how to create an AWS Config rule to identify EC2 instances that are using unauthorized AMIs. I also showed you how to create an Amazon EventBridge rule that receives events from AWS Config, transforms them to OpsItems, and transmits them to OpsCenter as the destination target.

For more information about AWS Config, see AWS Config best practices. For more information about using Automation documents for operational tasks, see Creating Automation documents that run scripts in the AWS Systems Manager documentation.

 

About the author

Michael Heyd is a Solutions Architect with Amazon Web Services and is based in Vancouver, Canada. Michael works with enterprise AWS customers to transform their business through innovative use of cloud technologies. Outside work he enjoys board games and biking.