Increase visibility and governance on cloud with AWS Cloud Operations services – Part 1
Many customers are migrating to AWS to reduce costs, boost staff productivity, improve operational resilience, and increase business agility. As they adopt AWS, they use multi-account architectures to meet business, governance, security, and operational requirements. Operations teams (some of which may be more used to on-premises environments) can use this opportunity to improve visibility and governance across their entire estate on AWS.
This two-part post provides foundational tooling that helps you centralize and automate operations, and improve governance and visibility through AWS Cloud Operations services. The patterns described in this post apply at the AWS Organizations organization or organizational unit (OU) level and across regions. This makes the solutions scalable and extensible to new member AWS accounts. The focus is on virtual server management tasks, but the patterns can be expanded to other operational tasks. The diagram below illustrates the architecture.
This post assumes the following prerequisites:
- You have a multi-account architecture in place managed with AWS Organizations.
- AWS Organizations trusted access is enabled for AWS Config, AWS Systems Manager and AWS Backup.
- You are familiar with the use of AWS Systems Manager to manage Amazon Elastic Compute Cloud (Amazon EC2) instances and the AWS Systems Manager Agent (SSM Agent).
Centralized Governance and Configuration Management
The following steps are the foundations for your operations and governance at scale, and underpin centralized tooling and automation tasks.
1 – Set up AWS Config across all accounts and regions using AWS Systems Manager
AWS Config allows you to keep track of your AWS resources (and any changes to their configuration) at a detailed level. This enables many operations use cases such as resource administration, audit and compliance, configuration management and security analysis. We recommend you enable AWS Config recording in all your accounts and regions by following these steps. We recommend you select the entire organization or OU as the target, and all relevant regions to be recorded. The configuration schedule can be set to ‘daily’, so any new accounts or regions added to the organization will be detected and included in the tooling. For the use cases in this post, we only selected EC2 instances and SSM resources to be recorded.
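To make the recording scope concrete, the following is a minimal Python sketch of the recorder settings for a single account and region. It is not the Quick Setup deployment recommended above. It only illustrates the underlying AWS Config calls (`put_configuration_recorder` and `start_configuration_recorder`) with a resource type list that approximates "EC2 instances and SSM resources". The function names, the recorder name `default`, and the exact list of SSM resource types are assumptions for illustration, and the sketch assumes a delivery channel already exists in the target region.

```python
"""Sketch: AWS Config recorder settings for one account and one region."""

# Assumed approximation of "EC2 instances and SSM resources".
# Adjust this list to the resource types your use cases need.
RECORDED_RESOURCE_TYPES = [
    "AWS::EC2::Instance",
    "AWS::SSM::ManagedInstanceInventory",
    "AWS::SSM::AssociationCompliance",
    "AWS::SSM::PatchCompliance",
]


def build_recorder(role_arn, name="default"):
    """Return the ConfigurationRecorder structure for put_configuration_recorder."""
    return {
        "name": name,
        "roleARN": role_arn,
        "recordingGroup": {
            # Record only the listed types rather than every supported type.
            "allSupported": False,
            "includeGlobalResourceTypes": False,
            "resourceTypes": list(RECORDED_RESOURCE_TYPES),
        },
    }


def enable_recording(config_client, role_arn):
    """Create or update the recorder, then start it, in the client's region.

    config_client is expected to behave like boto3.client("config").
    Starting the recorder requires an existing delivery channel.
    """
    recorder = build_recorder(role_arn)
    config_client.put_configuration_recorder(ConfigurationRecorder=recorder)
    config_client.start_configuration_recorder(
        ConfigurationRecorderName=recorder["name"]
    )
    return recorder
```

With boto3 installed and credentials configured, `enable_recording(boto3.client("config", region_name="eu-west-1"), role_arn)` would apply these settings to one region of one account. Quick Setup remains the simpler route for covering an entire organization.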
2 – Set up an AWS Config aggregator at the organization level
An AWS Config aggregator is a centralized view of your AWS Config information across multiple accounts and regions. You should set this up at the organization level to aggregate the configuration and compliance data from all accounts being recorded by AWS Config (which you configured in step 1). This will enable two key operational tools:
a. Centralized resource list: A centralized inventory of all resources across multiple accounts and regions, including their configuration data, with search and filtering functions. You can access this inventory in the Aggregated resources page.
b. Advanced queries: AWS Config Advanced Queries is a feature to query resource information and configuration states. It is a very powerful tool for operations teams, as you can quickly retrieve the current recorded state of your entire multi-account, multi-region AWS estate from one place. It uses a structured query language (SQL) syntax and allows exporting data to CSV. Example queries include: list all running/stopped EC2 instances, list all EC2s with AMI id-12345, list all EBS volumes that are not in use, and list all unencrypted EBS volumes. You can access the advanced queries tool by navigating to Advanced Queries in the AWS Config console. Ensure you select the aggregator set up in step 2 as the ‘Query scope’ for your queries. This set-up is useful for ad-hoc queries, but the query syntax has some limitations because it supports only a subset of SQL.
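The aggregator and advanced queries can also be driven through the API. The sketch below, in Python, shows one possible shape: creating an organization-wide aggregator with `put_configuration_aggregator` and running a query against it with `select_aggregate_resource_config`. The helper names, the aggregator name, and the role ARN are hypothetical, and the property paths in the two example queries (`configuration.state.name`, `configuration.imageId`) should be checked against the resource schema before relying on them.

```python
"""Sketch: organization aggregator plus aggregated advanced queries."""
import json

# Example expressions in the AWS Config query syntax (a subset of SQL).
RUNNING_INSTANCES = (
    "SELECT resourceId, accountId, awsRegion, configuration.instanceType "
    "WHERE resourceType = 'AWS::EC2::Instance' "
    "AND configuration.state.name = 'running'"
)
INSTANCES_BY_AMI = (
    "SELECT resourceId, accountId, awsRegion "
    "WHERE resourceType = 'AWS::EC2::Instance' "
    "AND configuration.imageId = 'ami-12345'"
)


def create_org_aggregator(config_client, name, role_arn):
    """Create (or update) an aggregator covering the organization and all regions."""
    return config_client.put_configuration_aggregator(
        ConfigurationAggregatorName=name,
        OrganizationAggregationSource={"RoleArn": role_arn, "AllAwsRegions": True},
    )


def run_aggregate_query(config_client, aggregator_name, expression):
    """Run a query against the aggregator and return all rows as dicts.

    Each result is returned by the API as a JSON string, so it is decoded
    here. Pagination is followed through NextToken until no token is left.
    """
    rows, token = [], None
    while True:
        kwargs = {
            "Expression": expression,
            "ConfigurationAggregatorName": aggregator_name,
        }
        if token:
            kwargs["NextToken"] = token
        page = config_client.select_aggregate_resource_config(**kwargs)
        rows.extend(json.loads(item) for item in page.get("Results", []))
        token = page.get("NextToken")
        if not token:
            return rows
```

Passing `boto3.client("config")` from the account that hosts the aggregator as `config_client` would run these calls for real. Injecting the client keeps the helpers easy to exercise with a stub.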
3 – Set up AWS Config rules across accounts and regions
AWS Config rules periodically evaluate the configuration settings of your AWS resources against operational and security best practices. AWS Config conformance packs are collections of rules. You should set up conformance packs at the organization level so that all new member accounts are automatically included in the evaluation scope. At present, two methods are available to deploy a conformance pack at the organization level:
| Method | AWS Service used | Technical considerations | Implementation |
| | AWS Systems Manager (Quick Setup feature) | Custom conformance packs are supported (through YAML templates stored in S3). Multi-region deployment is supported. | Follow these steps to Deploy AWS Conformance Packs (AWS Systems Manager Quick Setup). |
| AWS CLI or API | AWS Config | Custom conformance packs are supported (through YAML templates stored in S3). Multi-region deployment not available (API call is region-specific). | |
Note that the API call to deploy rules and conformance packs across accounts is region specific. At the organization level, you need to change the context of your API call to a different region if you want to deploy rules in other regions.
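Because the call is region-specific, one way to cover several regions through the API is to repeat it with a client bound to each region. The Python sketch below illustrates that loop using `put_organization_conformance_pack`. The function name, the `client_factory` parameter, the pack name, and the S3 URI are hypothetical placeholders, and the sketch assumes it runs from the management account or a delegated administrator account.

```python
"""Sketch: deploy one organization conformance pack to several regions."""


def deploy_pack_to_regions(client_factory, regions, pack_name, template_s3_uri):
    """Deploy the same conformance pack once per region.

    client_factory is any callable that returns an AWS Config client for a
    given region, for example:
        lambda region: boto3.client("config", region_name=region)
    Returns a mapping of region name to the deployed pack's ARN.
    """
    arns = {}
    for region in regions:
        client = client_factory(region)
        response = client.put_organization_conformance_pack(
            OrganizationConformancePackName=pack_name,
            TemplateS3Uri=template_s3_uri,
        )
        arns[region] = response["OrganizationConformancePackArn"]
    return arns
```

A call such as `deploy_pack_to_regions(factory, ["eu-west-1", "us-east-1"], "ops-baseline", "s3://example-bucket/pack.yaml")` would issue one request per region. Error handling and retries are left out to keep the sketch short.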
Centralized compliance reporting and remediation: Compliance status against all rules for resources across accounts and regions is available centrally in the aggregator dashboard (set up in step 2). You can also define remediation actions (manual or automated) for non-compliant rules and conformance packs (outside the scope of this post). This post focuses on just one rule within a custom conformance pack (enforcing the SSM Agent, described in step 4), but you can increase the number of rules you apply as your AWS adoption grows. You can also use the AWS Config conformance pack sample templates provided by AWS, which include operational best practices for standards such as CIS, PCI, CISA, FedRAMP, and HIPAA, as well as workload-specific security requirements for services such as Amazon Relational Database Service (Amazon RDS), AWS Lambda or various container services.
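The same aggregated compliance data shown in the dashboard can be read programmatically. As a rough Python sketch, the helper below calls `describe_aggregate_compliance_by_config_rules` on the aggregator and collects every non-compliant rule per account and region. The function name and the tuple layout of the result are choices made for this illustration only.

```python
"""Sketch: list non-compliant rules per account and region from the aggregator."""


def list_noncompliant_rules(config_client, aggregator_name):
    """Return sorted (account_id, region, rule_name) tuples for non-compliant rules.

    config_client is expected to behave like boto3.client("config") in the
    account that hosts the aggregator. Pagination follows NextToken.
    """
    findings, token = [], None
    while True:
        kwargs = {
            "ConfigurationAggregatorName": aggregator_name,
            "Filters": {"ComplianceType": "NON_COMPLIANT"},
        }
        if token:
            kwargs["NextToken"] = token
        page = config_client.describe_aggregate_compliance_by_config_rules(**kwargs)
        for item in page.get("AggregateComplianceByConfigRules", []):
            findings.append(
                (item["AccountId"], item["AwsRegion"], item["ConfigRuleName"])
            )
        token = page.get("NextToken")
        if not token:
            return sorted(findings)
```

The output could feed a report or a ticketing workflow. Remediation itself stays outside the scope of this sketch.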
4 – Enforcing the use of the SSM Agent on all EC2 instances
The SSM Agent is the ‘heartbeat’ of Amazon EC2 instance management and operations at scale. It provides periodic, detailed OS-level information about the instances, and provides the telemetry for many useful node management activities at scale through automation. We highly recommend you follow a managed node approach on AWS to maximize control and visibility (that is, configure all EC2 instances with the SSM Agent). You can enforce this across your organization (as explained in step 3) through the AWS Config managed rule ec2-instance-managed-by-systems-manager. You can deploy this rule as a standalone rule or within a custom conformance pack. You can add a remediation action to the rule that installs or configures the agent on non-compliant instances. You can view all non-compliant accounts and instances in the Aggregated Rules page.
Detailed list of individual EC2 instances not managed by SSM Agent.
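To illustrate the two deployment options for the rule in step 4, the Python sketch below renders a minimal custom conformance pack template containing only ec2-instance-managed-by-systems-manager, and shows the standalone alternative through `put_organization_config_rule`. The function names and the logical resource name `Ec2InstanceManagedBySsm` are assumptions made for this example. The managed rule identifier `EC2_INSTANCE_MANAGED_BY_SSM` and the resource type scope follow the AWS sample templates, but verify them against the current managed rule documentation.

```python
"""Sketch: enforce the SSM Agent rule as a conformance pack or a standalone rule."""

RULE_NAME = "ec2-instance-managed-by-systems-manager"
RULE_IDENTIFIER = "EC2_INSTANCE_MANAGED_BY_SSM"
RESOURCE_TYPES = ["AWS::EC2::Instance", "AWS::SSM::ManagedInstanceInventory"]


def render_ssm_agent_pack(rule_name=RULE_NAME):
    """Return a one-rule conformance pack template as YAML text.

    The text can be uploaded to S3 and referenced when deploying the pack.
    """
    lines = [
        "Resources:",
        "  Ec2InstanceManagedBySsm:",
        "    Type: AWS::Config::ConfigRule",
        "    Properties:",
        f"      ConfigRuleName: {rule_name}",
        "      Scope:",
        "        ComplianceResourceTypes:",
    ]
    lines += [f"          - {resource_type}" for resource_type in RESOURCE_TYPES]
    lines += [
        "      Source:",
        "        Owner: AWS",
        f"        SourceIdentifier: {RULE_IDENTIFIER}",
    ]
    return "\n".join(lines) + "\n"


def deploy_standalone_rule(config_client, rule_name=RULE_NAME):
    """Deploy the managed rule on its own across the organization.

    The call applies to the region of config_client, so it would be
    repeated per region for multi-region coverage.
    """
    return config_client.put_organization_config_rule(
        OrganizationConfigRuleName=rule_name,
        OrganizationManagedRuleMetadata={
            "RuleIdentifier": RULE_IDENTIFIER,
            "ResourceTypesScope": list(RESOURCE_TYPES),
        },
    )
```

Rendering the template as text keeps the example free of third-party YAML libraries. In practice the template would more likely live as a file in version control.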
In this blog post, we showed you how to prepare your multi-account, multi-region AWS Organizations environment for centralized management and visibility at scale using AWS Cloud Operations services. We covered the automated enforcement of AWS Config recording, AWS Config rules and conformance packs for all existing and new accounts, the creation of an AWS Config aggregator for a centralized view of resource inventory and configuration conformance, and the enforcement of the SSM Agent for proper management of all EC2 instances. We encourage you to adopt these foundational patterns to quickly improve visibility and centralize governance of your cloud resources.
In Part 2 of this series, we build on these steps and show how to centrally manage, visualize and report on operational tasks such as patching, mandatory software compliance and backups.