AWS Cloud Operations & Migrations Blog

Increase visibility and governance on cloud with AWS Cloud Operations services – Part 2

Introduction

This blog post is a continuation of Part 1. To recap, as your organization adopts AWS, you will likely leverage multi-account architectures to meet your requirements. We introduced some foundational patterns to prepare the environments for centralized operations and governance using AWS Cloud Operations services. In this blog (Part 2), we will show you how to centrally manage, visualize and report on operational tasks such as patching, mandatory software compliance and backups. Below is an architectural representation of what we will create.

Architecture Overview

Managing the security of AWS environments is critical to ensuring that all workloads are protected against vulnerabilities and threats. AWS Systems Manager Patch Manager is a fully managed service that helps you automate patching of Amazon EC2 instances and on-premises servers at scale. With Patch Manager, you can deploy operating system patches, software updates and security patches across your entire fleet. Patch policies, a feature of Patch Manager, provide a simplified, centralized setup to control patching operations across all accounts and Regions in your AWS Organizations.

We will use Quick Setup Host Management to set up and automate many host management activities across all Amazon EC2 instances in your AWS Organizations. These include monitoring Amazon EC2 instance health, scanning for missing patches, and collecting a detailed inventory of instance status every 30 minutes (including AWS drivers and agents, applications, OS information, network configuration, services and status, and Windows roles and updates).

Figure 1: Architecture diagram for backup and patching centralization and visualization

In this section, we will cover how to set up a patch policy and enable host management across your AWS Organizations. We will show you how to aggregate patching, compliance, and inventory data for visualization and reporting. We will create a centralized Amazon Simple Storage Service (Amazon S3) bucket and use AWS Systems Manager resource data sync, AWS Glue and Amazon Athena to feed the data into Amazon QuickSight, where we visualize and report on operational metrics such as overall patch compliance, non-compliant instances, and instances with missing critical patches. We will also cover how to report on mandatory software compliance (including application version and service status) across all EC2 instances in your organization.

Prerequisites

Walk-through

We recommend performing all of the following steps in the same AWS Region.

Set up AWS host management and configure resource data sync

1. Create a patch policy and enable host management across your AWS Organizations

  • Set up a patch policy by following Automate organization-wide patching using a Quick Setup Patch Policy
  • Set up AWS Systems Manager host management to aggregate EC2 instance metadata across your entire AWS Organizations by following Host Management Quick Setup Configuration

Note: Uncheck “Scan instances for missing patches daily” while setting up the Host Management Quick Setup configuration.

2. Configure an AWS Systems Manager resource data sync for inventory that writes to a centralized S3 bucket, following the linked instructions.
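
If you prefer to script step 2, the sketch below shows one way the request could be assembled for the boto3 `create_resource_data_sync` call. The sync name and bucket name are hypothetical placeholders, and the sketch assumes the bucket policy already permits Systems Manager to write to the bucket. Treat it as an illustration rather than the exact configuration used in this walkthrough.

```python
def build_inventory_sync_request(sync_name, bucket_name, bucket_region, prefix=None):
    """Build keyword arguments for ssm.create_resource_data_sync."""
    destination = {
        "BucketName": bucket_name,
        "SyncFormat": "JsonSerDe",  # format value documented for this API
        "Region": bucket_region,    # Region where the central bucket lives
    }
    if prefix:
        destination["Prefix"] = prefix
    return {"SyncName": sync_name, "S3Destination": destination}


def create_inventory_sync(request, region_name):
    """Submit the request. Needs boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the builder works without boto3 installed

    ssm = boto3.client("ssm", region_name=region_name)
    ssm.create_resource_data_sync(**request)


request = build_inventory_sync_request(
    sync_name="org-inventory-sync",           # hypothetical name
    bucket_name="example-central-inventory",  # hypothetical bucket
    bucket_region="eu-west-2",
)
print(request)
```

Calling `create_inventory_sync(request, "eu-west-2")` would then create the sync in the account and Region whose credentials are in use.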

Create an AWS Glue Crawler

  1. Set up an AWS Glue crawler to generate the inventory data catalog. In the AWS Glue console, choose Crawlers in the left-side menu. On the Crawlers page, choose Create crawler. This starts a series of pages that prompt you for the crawler details.
  2. In the Crawler name field, enter Instances-inventory and choose Next.
  3. On the next window, Choose data sources and classifiers, select Add a Data Source.
  4. For Data Source, select S3.
  5. For Location of S3 data, select In this Account.
  6. For the S3 bucket path, enter s3://<centralized-s3-bucket-name>, using the bucket from step 2 of the previous section.
  7. Leave all other settings as default and select Add an S3 data source. Select Next.
  8. On the next window, Configure security settings, select Create IAM Role and name it AWSGlueServiceRole-instances-inventory. Ensure the newly created role is selected and select Next.
  9. On the next window, Set output and scheduling, select Add Database. This opens a new window. Name the database instances-inventory-database and select Create Database.
  10. Return to the previous window and select the newly created database as the Target database. Under Crawler schedule, set Frequency to Daily, set the Start hour to 12, and choose Next.
  11. On the last window, Review and create, review the settings and select Create Crawler.
  12. Once the crawler is created, run it for the first time by selecting Run Crawler to push the tables to the Data Catalog database.
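
The console steps above can also be approximated with boto3. The following is a minimal sketch, assuming the IAM role from step 8 already exists; the bucket name is a hypothetical placeholder. The daily schedule at hour 12 is expressed in the six-field cron syntax that AWS Glue uses.

```python
def build_crawler_request(bucket_name, role_name, database_name, start_hour_utc=12):
    """Build keyword arguments for glue.create_crawler."""
    if not 0 <= start_hour_utc <= 23:
        raise ValueError("start_hour_utc must be between 0 and 23")
    return {
        "Name": "Instances-inventory",
        "Role": role_name,
        "DatabaseName": database_name,
        "Targets": {"S3Targets": [{"Path": f"s3://{bucket_name}"}]},
        # Glue cron fields: minutes hours day-of-month month day-of-week year
        "Schedule": f"cron(0 {start_hour_utc} * * ? *)",
    }


def create_and_run_crawler(request):
    """Create the crawler and start its first run. Needs boto3 and credentials."""
    import boto3  # lazy import keeps the builder usable without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


crawler_request = build_crawler_request(
    bucket_name="example-central-inventory",  # hypothetical bucket
    role_name="AWSGlueServiceRole-instances-inventory",
    database_name="instances-inventory-database",
)
print(crawler_request["Schedule"])
```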

Custom setting for Glue Database:

The AWS:InstanceInformation table includes a column named resourcetype that is also a partition key, which causes Amazon Athena queries to fail. To work around this, you will deploy a CloudFormation template that creates an IAM role, an AWS Lambda function, an Amazon CloudWatch Events rule, and a Lambda permission. The CloudWatch Events rule is triggered by the Glue crawler execution and invokes the Lambda function, which deletes the column.

To automatically delete the column from the table, deploy the CloudFormation template by following the steps below:

  1. Go to this link to download the CloudFormation template.
  2. Navigate to the AWS CloudFormation console, select Stacks in the left panel, select Create stack, select Upload a template file, and select the file downloaded in step 1.
  3. Enter glue-table-column-deletion as the stack name, leave the default parameters, and choose Next.
  4. On the Configure stack options page, add any required tags, and then choose Next.
  5. Review all the information, select I acknowledge that AWS CloudFormation might create IAM resources with custom names, and then choose Create Stack to submit your stack configuration.

After the page is refreshed, the status of your stack should be CREATE_IN_PROGRESS. When the status changes to CREATE_COMPLETE, proceed to the next section.
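
To give a sense of what a column-removal function of this kind might look like, here is a sketch of a Lambda handler. It is not the code shipped in the template; the table name `aws_instanceinformation` and the key whitelist are assumptions made for illustration. The helper copies only keys that `update_table` accepts in `TableInput`, because `get_table` also returns read-only fields.

```python
DATABASE = "instances-inventory-database"
TABLE = "aws_instanceinformation"  # assumed table name produced by the crawler
COLUMN_TO_DROP = "resourcetype"

# Keys accepted in TableInput; get_table returns extra read-only keys
# (for example CreateTime) that must be left out of the update call.
TABLE_INPUT_KEYS = (
    "Name", "Description", "Owner", "Retention", "StorageDescriptor",
    "PartitionKeys", "TableType", "Parameters",
)


def without_column(table, column_name=COLUMN_TO_DROP):
    """Return a TableInput dict with the named column removed from Columns."""
    table_input = {key: table[key] for key in TABLE_INPUT_KEYS if key in table}
    descriptor = dict(table_input["StorageDescriptor"])  # shallow copy
    descriptor["Columns"] = [
        column for column in descriptor["Columns"] if column["Name"] != column_name
    ]
    table_input["StorageDescriptor"] = descriptor
    return table_input


def lambda_handler(event, context):
    """Invoked after a crawler run; rewrites the table without the column."""
    import boto3  # available in the Lambda runtime

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=DATABASE, Name=TABLE)["Table"]
    glue.update_table(DatabaseName=DATABASE, TableInput=without_column(table))
```

Only the regular column is removed; the partition key with the same name stays in place, so the table remains partitioned by resourcetype.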

Set up Amazon QuickSight data sources

  1. Go to the Amazon QuickSight console, select Datasets in the left menu, then select New Dataset followed by Athena. Name the data source instances-inventory and select Create.
  2. Select the database instances-inventory-database, with Athena as the data source.
  3. DataCatalog is pre-selected. For Database, select instances-inventory-database.
  4. Select aws_compliancesummary, select Directly query your data, and then select Visualize.
  5. Repeat steps 1 – 4 to create datasets for aws_instanceinformation, aws_complianceitem and aws_service.
  6. Now, we will create a dataset by merging (joining) tables. Repeat steps 1 – 3, select aws_compliancesummary and select Edit/Preview data.
  7. In the upper-right corner, select Add data and select aws_instanceinformation as the dataset.
  8. Select the center of the join and, at the bottom, choose Left for Join type.
  9. In Join Clauses, select the resourceid column of aws_compliancesummary and the instanceid column of aws_instanceinformation.
  10. In the top left corner, change the name of the dataset to Joined-ComplianceSummaryDetails and select Save and Publish.
  11. Now, repeat steps 7-10 to create another join, this time using the instanceid column of aws_instanceinformation and the resourceid column of aws_complianceitem. Rename the joined dataset to Joined-ComplianceItemDetails. Select Save and Publish.
  12. Now, repeat steps 7-10 to create another join, this time using the instanceid column of aws_instanceinformation and the resourceid column of aws_service. Rename the joined dataset to Joined-ServiceDetails. Select Save and Publish.

Figure 2: Amazon QuickSight console showing the Edit/Preview dataset console
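
To illustrate what the left join on resourceid and instanceid produces, the following sketch performs the same kind of join on a few in-memory rows. The sample rows and the platformname column value are made up for the example; QuickSight performs the actual join for you.

```python
def left_join(left_rows, right_rows, left_key, right_key):
    """Keep every left row; attach matching right rows or None placeholders."""
    right_columns = sorted({column for row in right_rows for column in row})
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for left in left_rows:
        matches = index.get(left[left_key])
        if not matches:
            # No match: right-hand columns are present but empty, as in SQL.
            matches = [dict.fromkeys(right_columns)]
        for right in matches:
            joined.append({**right, **left})  # left values win on name clashes
    return joined


compliance_summary = [  # made-up sample rows
    {"resourceid": "i-0aaa", "status": "COMPLIANT"},
    {"resourceid": "i-0bbb", "status": "NON_COMPLIANT"},
]
instance_information = [
    {"instanceid": "i-0aaa", "platformname": "Amazon Linux"},
]

joined = left_join(compliance_summary, instance_information, "resourceid", "instanceid")
for row in joined:
    print(row)
```

The second compliance row has no matching instance record, so it is kept with empty instance columns. A left join is what lets non-reporting instances still appear in the compliance visuals.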

Create Amazon QuickSight Analysis and Dashboards

In this step, we will set up Amazon QuickSight dashboards that report on the following metrics, chosen from the many available in the aggregated inventory:

  • Overall multi-account patch compliance
  • Compliant and non-compliant instances
  • Instances with missing critical patches
  • Detailed list of missing patches
  • Detailed list of OS services status and versions

Overall Multi-account Patch Compliance

  1. Navigate to the Amazon QuickSight console and, in the left pane, select Analysis.
  2. In the top right corner, select New Analysis and select Joined-ComplianceSummaryDetails. Next, in the right corner, select Use Analysis, select Interactive Sheet, and select Create.
  3. In the same QuickSight console, towards the top left corner, select Add Visual and select Pie-Chart. In Field wells at the top, drag and drop status from the list into Group/Color.

Figure 3: Pie-chart showing Overall Compliance Summary

Compliant and Non-compliant Instances

  1. In the top left corner, next to Dataset, select the pencil icon, select Add Dataset, and select aws_compliancesummary.
  2. In the top left corner, select Add Visual and select Donut-Chart. In Field wells at the top, drag and drop accountid into Group/Color. In the Value field, add resourceid, then open the drop-down on resourceid, expand Aggregate, and select Count distinct.
  3. Select Filter in the left pane, select Add Filter, and under status, select COMPLIANT.
  4. Add another filter, selecting Patch for compliancetype.
  5. In the top right corner of the graph, duplicate the visual, and this time set the status filter to NON_COMPLIANT.

Figure 4: Donut Chart showing Compliant and Non-compliant instances grouped by Account ID
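
The donut charts above rely on a distinct count of resourceid per accountid after filtering on status and compliancetype. The sketch below reproduces that aggregation on made-up sample rows, which may help when validating the numbers shown in the dashboard.

```python
from collections import defaultdict


def instances_per_account(rows, status, compliance_type="Patch"):
    """Count distinct resourceid values per accountid for one status."""
    seen = defaultdict(set)
    for row in rows:
        if row["status"] == status and row["compliancetype"] == compliance_type:
            seen[row["accountid"]].add(row["resourceid"])
    return {account: len(resource_ids) for account, resource_ids in seen.items()}


rows = [  # made-up sample rows
    {"accountid": "111111111111", "resourceid": "i-0aaa", "status": "COMPLIANT", "compliancetype": "Patch"},
    {"accountid": "111111111111", "resourceid": "i-0aaa", "status": "COMPLIANT", "compliancetype": "Patch"},
    {"accountid": "111111111111", "resourceid": "i-0bbb", "status": "NON_COMPLIANT", "compliancetype": "Patch"},
    {"accountid": "222222222222", "resourceid": "i-0ccc", "status": "COMPLIANT", "compliancetype": "Association"},
]

compliant = instances_per_account(rows, "COMPLIANT")
non_compliant = instances_per_account(rows, "NON_COMPLIANT")
print(compliant, non_compliant)
```

The duplicate row for i-0aaa is counted once, which is why Count distinct is used instead of a plain count.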

Detailed list of missing patches

  1. In the top left corner, next to Dataset, select the pencil icon, select Add Dataset, and then select aws_complianceitem.
  2. In the top left corner, select Add Visual and select Pivot-table Chart. In Field wells at the top, under Rows, drag and drop accountid, region, resourceid, patchstate, id, and title.
  3. In the left navigation pane, select Filter, select Add Filter, select patchstate, choose Missing, and select Apply.

Figure 5: Pivot table showing the list of instances with detailed missing patches
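
The same list can be retrieved directly from Amazon Athena if you want the data outside QuickSight. This is a sketch under the assumption that the crawler produced a table named aws_complianceitem with the columns used in the pivot table; the results location is a hypothetical S3 path.

```python
import re


def build_patch_state_query(patch_state="Missing", table="aws_complianceitem"):
    """Return SQL listing patches in the given state, one row per patch."""
    for value in (patch_state, table):
        # Allow only simple identifiers so the values are safe to inline.
        if not re.fullmatch(r"[A-Za-z0-9_]+", value):
            raise ValueError(f"unexpected characters in {value!r}")
    return (
        "SELECT accountid, region, resourceid, patchstate, id, title "
        f"FROM {table} "
        f"WHERE patchstate = '{patch_state}' "
        "ORDER BY accountid, region, resourceid"
    )


def run_query(sql, database, output_location):
    """Start the query in Athena and return its execution ID."""
    import boto3  # lazy import; needs AWS credentials at call time

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


query = build_patch_state_query()
print(query)
```

For example, `run_query(query, "instances-inventory-database", "s3://example-athena-results/")` would submit the query, with the bucket name replaced by one you own.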

Detailed list of custom agent status and version

  1. In the top left corner, next to Dataset, select the pencil icon, select Add Dataset, and select Joined-ServiceDetails.
  2. In the left corner, select Add Visual and select Pivot-table Chart. In Field wells at the top, under Rows, drag and drop accountid, region, resourceid, name, agentversion, and status.
  3. In the left pane, select Add Filter and under name select AWSLiteAgent.
  4. Note that we selected the AWSLiteAgent service for this example. You can report on any agent or service version that is mandatory for your organization (e.g. anti-virus software, observability agents).

Figure 6: Pivot table showing Instances with AWSLiteAgent version and status.
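
A mandatory software check can be expressed as a set difference: every instance that does not report the required service in the expected state is flagged. The sketch below assumes rows shaped like the Joined-ServiceDetails dataset and assumes the healthy status value is "Running"; both the sample rows and that value are illustrative.

```python
def find_noncompliant_instances(service_rows, all_instance_ids,
                                service_name, required_status="Running"):
    """Return instance IDs where the service is missing or not in the required state."""
    healthy = {
        row["resourceid"]
        for row in service_rows
        if row["name"] == service_name and row["status"] == required_status
    }
    return sorted(set(all_instance_ids) - healthy)


service_rows = [  # made-up sample rows
    {"resourceid": "i-0aaa", "name": "AWSLiteAgent", "status": "Running"},
    {"resourceid": "i-0bbb", "name": "AWSLiteAgent", "status": "Stopped"},
]
fleet = ["i-0aaa", "i-0bbb", "i-0ccc"]

flagged = find_noncompliant_instances(service_rows, fleet, "AWSLiteAgent")
print(flagged)
```

Here i-0bbb is flagged because the service is stopped, and i-0ccc because it does not report the service at all.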

Centralized Backups: Management, Reporting and Alerting

In addition to centrally managing inventory and patching, customers would like to simplify their approach to backing up their resources on AWS, whether Amazon EC2 instances, Amazon EBS volumes or other services using data at rest. They would like a way of visualizing, reporting on and receiving alerts for their backups. Customers can easily identify resources or accounts that have fallen out of compliance in minutes with a centralized view straight from the AWS Backup console. They can log into the management account and have a single dashboard view of backup operations as they occur across accounts. This blog post on Managing backups at scale in your AWS Organizations using AWS Backup is a great starting point. Here, we will summarize the steps taken in that blog post, create backup reports, and show you how to set up alerts for failed backups.

1 – Creating a Backup policy, applying it to your resources, and monitoring backups
Ensure AWS Organizations has Backup policies enabled, by going to AWS Organizations > Policies > select Backup policies > select Enable backup policies.

Figure 7: AWS Organizations console with Backup policies enabled

Then, sign in to your AWS Organizations management account, and navigate to the AWS Backup console. On the Settings page, under Cross-account management, select Turn On next to Backup policies and Cross-account monitoring.

Figure 8: Cross-account management settings in AWS Backup

Now, create a Backup policy:

  1. In AWS Backup, in the left-hand menu, go to My organization > Backup policies > Create Backup policy.
  2. You can then create a Backup policy either using the visual editor or inserting a JSON template. Find out how to create a Backup policy here.
  3. Here is an example policy that will back up resources in the eu-west-2 region on a daily basis at 05:00 UTC, retain those backups for 1 day, and store them in the Default backup vault. Resources will need to be tagged with a Key/Value pair of your choice to be captured.
  4. In the policy below, replace <your_Backup_Role> with the IAM role you currently use for backups. Replace <your_Tag_Key> with a Key of your choice, and <your_tag_value> with a Value of your choice:
  {
      "plans": {
          "example-backup-plan-ec2": {
              "regions": {
                  "@@assign": [ "eu-west-2" ]
              },
              "rules": {
                  "example-backup-rule-daily": {
                      "schedule_expression": { "@@assign": "cron(0 5 ? * * *)" },
                      "lifecycle": {
                          "delete_after_days": { "@@assign": "1" }
                      },
                      "target_backup_vault_name": { "@@assign": "Default" }
                  }
              },
              "selections": {
                  "tags": {
                      "example-backup-resource-assignment": {
                          "iam_role_arn": { "@@assign": "arn:aws:iam::$account:role/<your_Backup_Role>" },
                          "tag_key": { "@@assign": "<your_Tag_Key>" },
                          "tag_value": {
                              "@@assign": [ "<your_tag_value>" ]
                          }
                      }
                  }
              }
          }
      }
  }

5. Apply the tags you created in your Backup policy to the resources (e.g. Amazon EC2 instances, Amazon EBS volumes) that you want to be backed up across your different AWS accounts.

6. Then, in your AWS Organizations management account, in the AWS Backup console, under My organization > Backup policies, select your newly created Backup policy.

7. In the Targets section, select Attach. This will open a tree view of all individual accounts and organizational units (OUs) in your AWS Organizations.

8. Selecting Root attaches your policy to all accounts in your organization, and selecting an OU attaches your policy to all sub-OUs and accounts within it.

Figure 9: Target OUs or Accounts for Backup policy
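
For teams that manage policies as code, the policy creation and attachment steps above can be sketched with boto3 as follows. The builder mirrors the example policy; the policy name, description, role name, and tag values are hypothetical placeholders, and the calls must run with management account credentials.

```python
import json


def build_backup_policy(role_name, tag_key, tag_value,
                        region="eu-west-2", retention_days=1):
    """Return the example Backup policy as a Python dict."""
    return {
        "plans": {
            "example-backup-plan-ec2": {
                "regions": {"@@assign": [region]},
                "rules": {
                    "example-backup-rule-daily": {
                        # Daily at 05:00 UTC
                        "schedule_expression": {"@@assign": "cron(0 5 ? * * *)"},
                        "lifecycle": {
                            "delete_after_days": {"@@assign": str(retention_days)}
                        },
                        "target_backup_vault_name": {"@@assign": "Default"},
                    }
                },
                "selections": {
                    "tags": {
                        "example-backup-resource-assignment": {
                            # $account is resolved per member account
                            "iam_role_arn": {
                                "@@assign": "arn:aws:iam::$account:role/" + role_name
                            },
                            "tag_key": {"@@assign": tag_key},
                            "tag_value": {"@@assign": [tag_value]},
                        }
                    }
                },
            }
        }
    }


def create_and_attach(policy, target_id, name="example-backup-policy"):
    """Create the policy in AWS Organizations and attach it to a root, OU, or account."""
    import boto3  # lazy import; needs management account credentials

    org = boto3.client("organizations")
    created = org.create_policy(
        Name=name,
        Description="Daily tag-based backups",  # hypothetical description
        Type="BACKUP_POLICY",
        Content=json.dumps(policy),
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    org.attach_policy(PolicyId=policy_id, TargetId=target_id)
    return policy_id


policy = build_backup_policy("MyBackupRole", "backup", "daily")  # hypothetical values
print(json.dumps(policy, indent=2))
```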

The backup plan you created in your Backup policy is now attached to all accounts in your selection. Any changes you make to the Backup policy are automatically applied to the backup plan in the attached accounts. In the event an account joins a selected OU, it receives the backup policy automatically, and likewise, if an account leaves the selected OU, the previously effective backup policy no longer applies. Under AWS Backup > My organization > Cross-account monitoring, you can view the backup status of jobs across your organization, and apply filters to your search, for example Account, Job ID and Status.

Figure 10: Cross-account monitoring of Backup jobs

2 – Creating a Backup Report
Once you have configured automatic, cross-account backups, you may also want to create reports to analyze, visualize or export information related to your backups. You can do this using Report plans within AWS Backup. There are two types of reports. The first is a jobs report, which shows jobs finished in the last 24 hours and all active jobs. The second is a compliance report, which can monitor resource levels or the different controls that are in effect. When you create a report, you choose which type to create. You can follow the steps in this documentation to first create your report plan, and then allow your S3 bucket to receive reports from AWS Backup. Once you have created a report plan, the reports are stored in the S3 bucket you select, ready for you to view on a daily basis. They can then be shared with relevant team members, or visualized in a dashboard using Amazon Athena and Amazon QuickSight.

Figure 11: List of reports in AWS Backup
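
A report plan can also be created programmatically. The sketch below builds a request for the boto3 `create_report_plan` call, covering the jobs report and the two compliance report templates; the plan name and bucket name are hypothetical, and the bucket is assumed to already allow AWS Backup to deliver reports.

```python
import re

# Report templates corresponding to the jobs and compliance report types.
REPORT_TEMPLATES = {
    "BACKUP_JOB_REPORT",
    "RESOURCE_COMPLIANCE_REPORT",
    "CONTROL_COMPLIANCE_REPORT",
}


def build_report_plan_request(plan_name, bucket_name,
                              template="BACKUP_JOB_REPORT", formats=("CSV",)):
    """Build keyword arguments for backup.create_report_plan."""
    # Plan names must start with a letter and use letters, digits, underscores.
    if not re.fullmatch(r"[a-zA-Z][_a-zA-Z0-9]*", plan_name):
        raise ValueError("plan_name must start with a letter; use letters, digits, _")
    if template not in REPORT_TEMPLATES:
        raise ValueError(f"unsupported template: {template}")
    return {
        "ReportPlanName": plan_name,
        "ReportDeliveryChannel": {
            "S3BucketName": bucket_name,
            "Formats": list(formats),
        },
        "ReportSetting": {"ReportTemplate": template},
    }


def create_report_plan(request):
    """Submit the request. Needs boto3 and AWS credentials."""
    import boto3  # lazy import

    boto3.client("backup").create_report_plan(**request)


report_request = build_report_plan_request(
    "daily_backup_jobs",          # hypothetical plan name
    "example-backup-reports",     # hypothetical bucket
)
print(report_request)
```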

3 – Setting up automatic email alerts for unsuccessful backup jobs
While it’s useful to be able to monitor and report on your backup operations centrally, it will also be important for your teams to be notified in the event of an incomplete backup. We will outline the steps discussed in this AWS Knowledge Center post to show you how this can be achieved.

First, create an Amazon SNS topic to send AWS Backup notifications:

  1. Open the Amazon SNS console, choose Topics and select Create topic. Enter a name for the topic, then select Create topic.
  2. Under Details, copy the ARN value. Above the Details pane, choose Edit, expand Access policy, and append the following permissions into the policy in the JSON editor (replace the resource value with the ARN you just copied):
  {
      "Sid": "My-statement-id",
      "Effect": "Allow",
      "Principal": {
          "Service": "backup.amazonaws.com"
      },
      "Action": "SNS:Publish",
      "Resource": "arn:aws:sns:eu-west-1:111111111111:exampletopic"
  }

3. Select Save changes.
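
If you would rather script the access policy change, the following sketch appends the statement to an existing topic policy without duplicating it on repeated runs. The helper name is our own, and the sample starting policy is a made-up minimal document; the update function uses the boto3 `get_topic_attributes` and `set_topic_attributes` calls.

```python
import json


def add_backup_publish_statement(policy_json, topic_arn, sid="My-statement-id"):
    """Return the policy JSON with a statement letting AWS Backup publish."""
    policy = json.loads(policy_json)
    statements = policy.setdefault("Statement", [])
    if not any(statement.get("Sid") == sid for statement in statements):
        statements.append({
            "Sid": sid,
            "Effect": "Allow",
            "Principal": {"Service": "backup.amazonaws.com"},
            "Action": "SNS:Publish",
            "Resource": topic_arn,
        })
    return json.dumps(policy)


def update_topic_policy(topic_arn):
    """Read the current topic policy, append the statement, and save it."""
    import boto3  # lazy import; needs AWS credentials

    sns = boto3.client("sns")
    current = sns.get_topic_attributes(TopicArn=topic_arn)["Attributes"]["Policy"]
    sns.set_topic_attributes(
        TopicArn=topic_arn,
        AttributeName="Policy",
        AttributeValue=add_backup_publish_statement(current, topic_arn),
    )


TOPIC_ARN = "arn:aws:sns:eu-west-1:111111111111:exampletopic"
existing = json.dumps({"Version": "2012-10-17", "Statement": []})  # made-up start
updated = add_backup_publish_statement(existing, TOPIC_ARN)
print(updated)
```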

Next, configure your backup vault to send notifications to the SNS topic:

1. Ensure you have installed and configured the AWS CLI. Using the AWS CLI, run the put-backup-vault-notifications command with --backup-vault-events set to BACKUP_JOB_COMPLETED. Replace the following values in the example command:

--endpoint-url: the endpoint for the AWS Region where you have the backup vault

--backup-vault-name: the name of your backup vault

--sns-topic-arn: the ARN of the SNS topic that you created

aws backup put-backup-vault-notifications --endpoint-url https://backup.eu-west-1.amazonaws.com --backup-vault-name examplevault --sns-topic-arn arn:aws:sns:eu-west-1:111111111111:exampletopic --backup-vault-events BACKUP_JOB_COMPLETED

2. Run the get-backup-vault-notifications command to confirm that notifications are configured:

aws backup get-backup-vault-notifications --backup-vault-name examplevault

The command returns output similar to the following:

{
    "BackupVaultName": "examplevault",
    "BackupVaultArn": "arn:aws:backup:eu-west-1:111111111111:backup-vault:examplevault",
    "SNSTopicArn": "arn:aws:sns:eu-west-1:111111111111:exampletopic",
    "BackupVaultEvents": [
        "BACKUP_JOB_COMPLETED"
    ]
}

Then, create an SNS subscription that filters notifications to backup jobs that are unsuccessful:

  1. Open the Amazon SNS console, choose Subscriptions and then select Create subscription.
  2. For Topic ARN, select the SNS topic that you created earlier.
  3. For Protocol, select Email-JSON.
  4. For Endpoint, enter the email address where you want to get email notifications about failed backup jobs.
  5. Expand Subscription filter policy and, in the JSON editor, enter the following:
  {
      "State": [
          {
              "anything-but": "COMPLETED"
          }
      ]
  }

6. Choose Create subscription.

7. The email address that you entered in step 4 receives a subscription confirmation email. Be sure to confirm the SNS subscription.

Once this has been set up, you can monitor your emails for notifications. When your vault has an unsuccessful backup job, you get an email notification similar to the following:

  "An AWS Backup job was stopped. Resource ARN : arn:aws:ec2:eu-west-1:111111111111:volume/vol-example56d7w92d4b. BackupJob ID : example4-3dd5-5678-b52d-90bd749355a5"

You can test notifications by creating two on-demand backups and then stopping one of the backups. You get an email notification for the stopped backup only.
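
To see why only unsuccessful jobs reach your inbox, the sketch below models how an "anything-but" filter policy is evaluated against a message's attributes. It is a simplified model written for illustration, not Amazon SNS's implementation, and it handles only exact-match and anything-but conditions.

```python
FILTER_POLICY = {"State": [{"anything-but": "COMPLETED"}]}


def _condition_matches(condition, value):
    """Evaluate one condition: an exact value or an anything-but rule."""
    if isinstance(condition, dict) and "anything-but" in condition:
        excluded = condition["anything-but"]
        if not isinstance(excluded, list):
            excluded = [excluded]
        return value not in excluded
    return condition == value


def matches_filter(filter_policy, message_attributes):
    """A message is delivered only if every policy key has a matching condition."""
    for key, conditions in filter_policy.items():
        if key not in message_attributes:
            return False
        value = message_attributes[key]
        if not any(_condition_matches(condition, value) for condition in conditions):
            return False
    return True


for state in ("COMPLETED", "ABORTED", "FAILED"):
    print(state, matches_filter(FILTER_POLICY, {"State": state}))
```

In this model a job whose State is COMPLETED is filtered out, while any other state, such as ABORTED for a stopped job, is delivered.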

Cleanup

To avoid incurring additional cost, you should delete the resources you do not intend to use.

Resources created for Glue and QuickSight Service

  1. Go to the AWS CloudFormation console, choose the stack you created, and then choose Delete.
  2. Navigate to the AWS Glue console, expand Data Catalog in the left pane, and delete the crawlers, tables and database created as part of this blog.
  3. Navigate to the Amazon QuickSight console and delete each analysis by selecting the three dots on the Joined-* analysis, selecting Delete, and confirming.
  4. Again in the Amazon QuickSight console, select Datasets in the left pane, select the three dots on the instances-inventory dataset, select Delete, and confirm.

Resources created for AWS Backup Service

  1. In the AWS console, navigate to AWS Organizations Policies and select Backup policies. Select Disable backup policies in the upper-right section of the page.
  2. In your AWS Organizations management account, navigate to the AWS Backup console. On the Settings page, under Cross-account management, select Turn Off next to Backup policies and Cross-account monitoring.
  3. In the AWS Backup console, in the left-hand menu, under My organization, select Backup policies. Select a Backup policy, remove its Targets, and select Delete in the upper-right section of the page. Repeat for each Backup policy.
  4. Delete any Backup policy tags that you applied to your resources, such as Amazon EC2 instances and Amazon EBS volumes.
  5. In the AWS Backup console, under Backup Audit Manager in the left-hand menu, select Reports. Select the Report plan you created, select Actions and then select Delete.
  6. Navigate to Amazon SNS Topics, select the topic you created and select Delete.

The backups you created will be deleted from their backup vault after the duration that you selected when creating your backup policy.

Conclusion

In this two-part blog post series, we showed you how to centralize operations and governance across multiple AWS accounts and regions within an AWS Organization. In Part 1, we covered the foundational steps to prepare the environments and enforce required tools and policies through AWS Cloud Operations services. In Part 2, we showed you how to build centralized management, visualization and reporting on operational tasks such as patching, mandatory software compliance and backups. For patching and mandatory software, we used Amazon S3, Amazon Athena and Amazon QuickSight. For backups, we used AWS Backup and Amazon SNS.

By implementing these patterns, you can automate and centralize operations across your multi-account environment. This can quickly deliver efficiency gains for your operations staff, and minimize the risk of not meeting compliance standards through human error or manual misconfiguration.

About the authors

Elias Bedmar is a Customer Solutions Manager at AWS. He is a technical and business program manager helping customers be successful on AWS. He supports large migration and modernization programs, cloud maturity initiatives and adoption of new services. Elias has experience in migration delivery, DevOps engineering and cloud

Ravindra Kori is a Solutions Architect at AWS. He has worked with multiple segments of customers from Enterprise, SMB and Startup, where he helped our customers architect their solutions and migrate to AWS. He specializes in Cloud Operations and Serverless in AWS.

Fred Hoskyns is a Senior Solutions Architect at AWS who specializes in helping companies of all sizes translate their business requirements into technical solutions, having spent half a decade in sales roles at Amazon.com and AWS. He is passionate about enabling customers to use AWS builder tools and Serverless technology to delight their end users.