Guidance for Change Management on AWS

Change management enables you to deploy planned alterations to all configurable items in your environment within a defined scope, such as production and test. An approved change is an action that alters a resource configuration and is implemented with minimized, accepted risk to the existing IT infrastructure.

Architecture Diagram

[Architecture diagram description]

Guidance Architecture Diagram for Change Management on AWS

Step 1
Deploy an AWS Config recorder and delivery channel in the management account and all member accounts to track changes to all resources that AWS Config supports. Note: AWS Control Tower automatically deploys AWS Config and delivery channels in all AWS Control Tower managed member accounts and Regions.

Step 2
Use AWS Organizations to delegate the administration of AWS Systems Manager and AWS CloudFormation to the operations tooling account. This allows you to deploy changes through automation to accounts within your organization. For application-specific changes, we recommend using an application deployment account (not depicted here).

Step 3
Enable and configure an AWS Config aggregator in your operations tooling account to provide visibility to the resources, resource configurations, and changes in your organization.

Step 4
Configure all AWS Config delivery channels to send AWS Config snapshots and history to a centralized Amazon Simple Storage Service (Amazon S3) bucket in your log archive account to maintain a record of all historical changes. Note: If you are using AWS Control Tower, each AWS Config recorder has a delivery channel configured to send to the AWS Control Tower S3 log bucket.
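
Steps 3 and 4 can be sketched with the AWS SDK for Python (boto3). This is a minimal sketch under stated assumptions, not a definitive implementation: the aggregator name, role ARN, bucket name, and daily snapshot frequency are placeholders chosen for illustration. The helpers only build request parameters; the trailing comments show how they would be passed to the `put_configuration_aggregator` and `put_delivery_channel` calls of the boto3 `config` client.

```python
def aggregator_request(name: str, role_arn: str) -> dict:
    """Parameters for an organization-wide AWS Config aggregator (Step 3)."""
    return {
        "ConfigurationAggregatorName": name,
        "OrganizationAggregationSource": {
            "RoleArn": role_arn,    # role that lets AWS Config read AWS Organizations data
            "AllAwsRegions": True,  # aggregate from every Region
        },
    }


def delivery_channel_request(bucket: str, name: str = "default") -> dict:
    """Parameters for a delivery channel that writes to the central bucket (Step 4)."""
    return {
        "DeliveryChannel": {
            "name": name,
            "s3BucketName": bucket,
            # Assumed frequency; pick the value that matches your retention needs.
            "configSnapshotDeliveryProperties": {"deliveryFrequency": "TwentyFour_Hours"},
        }
    }


# Usage (requires boto3 and credentials for the relevant account):
#   import boto3
#   config = boto3.client("config")
#   config.put_configuration_aggregator(**aggregator_request("org-aggregator", role_arn))
#   config.put_delivery_channel(**delivery_channel_request("central-config-bucket"))
```

If you use AWS Control Tower, the delivery channel part is already handled for you, so only the aggregator call would apply.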

Implementation Resources

The purpose of change management is to control the lifecycles of all changes, allowing changes to be made with minimal interruption to IT services. Changes to an environment introduce risk, so it is important to ensure changes are prioritized, planned, tested, implemented, documented and reviewed to mitigate risk. The Change Management capability ensures that changes are understood, recorded, and evaluated prior to, during, and after implementation. It also complements the process and safety controls of your continuous integration (CI) practices and Continuous Delivery/Deployment (CD) methodology. The Change Management capability is not intended for changes made as part of an automated release process, such as a CI/CD pipeline, unless there is an exception or approval required for major or broad changes.

ITIL defines change management as “the process responsible for controlling the Lifecycle of all Changes. The primary objective of change management is to enable beneficial changes to be made, with minimum disruption to IT Services.” Every change should deliver business value and the change management processes should be geared towards enabling that delivery. ITIL states a number of benefits for effective change management, including “reducing failed changes and therefore service disruption, defects and re-work” and “delivering change promptly to meet business timescales”. (ITIL Service Transition, AXELOS, 2011, page 44)

The key concepts of change management remain the same in the cloud as on-premises. Change delivers business value, and it should be delivered efficiently. Agile methodologies and the automation capabilities of the cloud go hand in hand with the core principles of change management, as they are also designed to deliver business value quickly and efficiently. There are some key areas, such as the management of infrastructure with software defined solutions, that may require existing change processes to be modified to adapt to new methods of delivering change.

Establish a change management process

Scenario
Establish a change management process
- Define change management scope, categories, prioritization, and approval process
- Define the required information included in the change request
- Select a change management tool(s) to manage and track changes
- Establish a change scheduling and communication process
- Establish process for emergency change deployment and resolution
- Establish a change approval and communication process for changes that have inter-dependencies between services

Overview

Change management practices are designed to reduce incidents and meet regulatory standards. These practices ensure efficient and prompt handling of changes to IT infrastructure and code. Modern change management methodologies can include rolling out new services, managing current ones, resolving problems in code, breaking down silos, providing context and transparency, eliminating bottlenecks, and minimizing risk.

The change control practice ensures that risks are properly assessed, that changes are authorized to proceed, and that a change schedule is managed to maximize the number of successful service and product changes.

Change management scope

Change management is the practice of monitoring resource configurations to establish and manage a baseline. A baseline is a snapshot of a set of configurations at a given point in time. This snapshot enables you to identify different configuration states of a given configuration item (a snapshot of a particular configuration at a point in time). The purpose of your change management should be to control the changes made to the baseline in a safe and controlled manner. The scope of your environment baseline needs to be defined so it doesn’t conflict with any DevSecOps practices that deliver their own change management through a release management process. Any configurations that are not managed by a DevSecOps release management process should be tracked through the change management process. Common configuration items for change control include:

Enterprise-wide configurations completed one time via manual effort
Temporary suspension of enterprise-wide policies
User access or membership changes to groups
Centralized changes that impact multiple groups

Note: DevSecOps is the preferred method to implement change and should follow the general principles of change control. Change management as defined in this capability is inclusive of DevSecOps changes in regard to progressing change fulfillment to an automated process, using infrastructure as code.

Change Management categories and priorities

Categories
Changes can be grouped and categorized based on the perceived level of impact and urgency. Changes come in three categories: standard, normal, and emergency.

Standard change – This change is an established change that is low-risk and well understood; therefore, it does not need a formal review and approval process. These changes use a condensed version of the normal change procedures that simplifies the process to quickly satisfy the change requester’s need. An example is a user requesting that an internal storage resource be created within your environment. This change request has low risk, and can frequently be automated with a self-service change management fulfillment service.
Normal change – This change is unique, and has a higher risk of uncertainty of the anticipated outcome. This is typically defined as the default change, and warrants a formal review and approval process to initiate the change implementation. An example is a new service that requires a firewall rule to be updated to allow incoming traffic on a non-standard port. Normal changes are submitted using a request for change process and reviewed by an approving body, sometimes known as a Change Advisory Board (CAB).
Emergency change – This change addresses unforeseen operational issues, such as failures, security threats, and vulnerabilities. This type of change is a rapid change that is required to continue or restore business operations, to address significant risk, or to solve emergency business needs. Emergency changes should still follow documented procedures and use the organizational emergency change management process; however, the approval process is streamlined by defining a smaller approval body commonly known as the Emergency Change Advisory Board (ECAB). The ECAB is a special group convened to advise on approval/rejection, and planning for emergency changes. The membership to the ECAB includes people with experience and authority to assume risk to make rapid decisions.

Depending on the categories you choose, you will need to provide a process for approval. Normal changes require a formal review process, and approval from a change approving authority such as a CAB that you set up. Emergencies require a fast reaction; however, they still warrant an approving body. Depending on the risk your organization is willing to accept, change approval should be documented and have some type of approving authority for anything outside of day-to-day changes. However, not all changes require a formal review and approval; some merely require a request submission and automated approval. As customers move toward automation, change management can be done in a highly automated and scripted manner with minimal manual input.

Note: Recurring changes that are tasks with low risk should be categorized as standard and auto-approved. Change management should focus on enablement, and not become a bureaucratic roadblock to delivering value.
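
The relationship between category and approving body described above can be written down as a small lookup. The following is an illustrative sketch only; the `Category` enum, the `APPROVAL_ROUTE` table, and the route labels are hypothetical names, and your organization may route changes differently.

```python
from enum import Enum


class Category(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


# Approval route per category, following the category descriptions above.
APPROVAL_ROUTE = {
    Category.STANDARD: "auto-approve",  # low risk, well understood, pre-approved
    Category.NORMAL: "CAB",             # formal review and approval
    Category.EMERGENCY: "ECAB",         # streamlined approval by a smaller body
}


def approval_route(category: Category) -> str:
    """Return the approving body (or auto-approval) for a change category."""
    return APPROVAL_ROUTE[category]
```

Keeping the routing in one table makes it easy to review as recurring normal changes are reclassified as standard.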

Priority
Changes should also be defined with a priority level to assist in identifying the impact and urgency of the change.

Impact is a measure of the effect of an incident, issue, or change on a business process. Impact is often based on how service levels will be affected.
Urgency is a measure of how long it will be until an incident, problem, or change has a significant business impact. For example, a high impact incident may have low urgency if the impact will not affect the business until the end of the financial year.

Creating a matrix that maps impact and urgency to priority gives you a standardized way to prioritize each change request and schedule its implementation appropriately.

Initiating request for change

An essential part of change management is the request for change process: the documents and end-user requests that start the review and approval process. Changes that fall within scope for your change management process require a Request for Change (RFC) to be submitted for review and approval. A change request is a formal communication seeking a modification to one or more configuration items within scope of your change management system. A request for change can take the following forms:

RFC document
Service desk call
Project initiation document
Tickets generated from an IT support service

We recommend that you determine the applicable information to support your change control process. Depending on the category of change, a request for change can capture different levels of detail. Basic information that needs to be captured includes the service or configuration item that will be changed, who is submitting the change, and justification for the change. There needs to be enough information that the approving authority can approve or deny the change. The following table gives an example of basic information which should be captured on all RFCs:

RFC data point	Description
Submitted By	Point of contact for change initiation
Date Requested	Date the request for change was submitted
Title of Change	Title of the change
Description of Change	Brief description of the change
Justification of Change	Reason change is necessary
Identified Risks	A list of identified risks and their risk mitigation efforts
Impact Scope	The impact the change will have within the environment
Schedule	Schedule of the maintenance window for the change to be completed in
Suggested Implementation	Plan to implement the change
Rollback	Plan to rollback change if necessary

Change communication

The change management process fulfills the review, approval, and orchestration of changes. A cohesive communications plan is a critical component for success, to help further mitigate risks when making IT changes. When all the right people are involved (the technical manager, the business manager, the people making the change, and the business users), all the right tools are provided, and all the right fallback actions are tested, a communications plan helps ensure a comprehensive and successful change.

An important part of the process is to communicate the intended change and, if necessary, coordinate a maintenance window to implement the change. Change management requires that all stakeholders for the change are notified if the change will impact their service or operations.

Change management tools

Manual or automated tools can be used to facilitate configuration and change management. Tool selection should be based on the needs of the project stakeholders, including organizational and environmental considerations and requirements.

Change management tools can integrate with the operating environment, facilitating workflows that deliver communication, change orchestration, approval processes, and configuration item baseline tracking and monitoring. This type of change management tool often requires an advanced environment setup that supports workflow templates for automation, and can take a considerable amount of planning and setup. It also requires an understanding of the scope of change you want to manage and well-defined scripted steps for change implementation.

Implementation

Define change management scope, categories, prioritization, and their approval process

Your change management framework needs to clearly define scope, categorization, prioritization, and a process to approve changes. Defining these areas helps you control the introduction of change into your AWS environment. This is to ensure that it is clear to your organization what will be managed under the change management process, and how those changes are to be submitted and approved.

Scope
Within your AWS environment, you need to delineate what is managed via your change management process, and what is managed through your Continuous Integration/Continuous Delivery (CI/CD) processes that use infrastructure as code (IaC). Change management and CI/CD processes are both part of your fundamental change control, and your change management process should complement your CI/CD processes. However, change management and CI/CD should neither overlap nor hinder each other. The change management process uses a formally assembled Change Advisory Board (CAB) that approves changes on a routine, periodic basis. If your DevSecOps teams try to put every change through the CAB, their technical velocity will slow, which in turn slows growth. Regardless of delineation, all changes in your environment should be managed with a review, approve, and communicate methodology.

Changes that flow through your change management process should be viewed as changes that require formal approval and review. For your baseline foundational environment, many of the changes within your AWS environment will fall under change control. However, as your environment matures toward a more agile CI/CD process, you will find that it is mainly changes that are conducted one time or are broad in impact that need to go through your formal change management process.

When starting out with AWS, we recommend including the following areas in your change management scope:

AWS service control policy (SCP) deployments or alterations
AWS Organizations service integration enablement
Financial purchases such as Reserved Instances or Savings Plans
AWS Resource Access Manager (AWS RAM)
Any centralized networking configurations

There are many different services you can track in your AWS environment. AWS Config is a Region-based managed service that enables the tracking of AWS resource baselines over time. The service can be deployed to specific accounts and Regions within your environment. It can be further configured to monitor only specific AWS services. AWS Config can assist you in identifying what services you may want to monitor for change, and where you would like to monitor them. To view the latest AWS services which AWS Config tracks, refer to Supported Resource Types.
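
As a rough illustration of limiting AWS Config to specific resource types, the sketch below builds the parameters for the boto3 `config` client's `put_configuration_recorder` call. The recorder name, role ARN, and the chosen resource types are assumptions for illustration; check Supported Resource Types for the identifiers that apply to your scope.

```python
def recorder_request(role_arn: str, resource_types: list, name: str = "default") -> dict:
    """Parameters for a recorder that tracks only the listed resource types."""
    return {
        "ConfigurationRecorder": {
            "name": name,
            "roleARN": role_arn,
            "recordingGroup": {
                "allSupported": False,                # record only what is listed below
                "includeGlobalResourceTypes": False,  # must be False with an explicit list
                "resourceTypes": list(resource_types),
            },
        }
    }


# Example scope (assumed): security groups and IAM roles.
IN_SCOPE_TYPES = ["AWS::EC2::SecurityGroup", "AWS::IAM::Role"]

# Usage (requires boto3 and credentials), repeated per account and Region:
#   import boto3
#   boto3.client("config").put_configuration_recorder(
#       **recorder_request(role_arn, IN_SCOPE_TYPES))
```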

Categories
After defining the scope of what you will be managing in the change management framework, you will need to identify the categories you want to support for defining changes submitted in your change request fulfillment. Within AWS, you can adhere to the standard categories supported in the ITIL framework.

Standard change: Identified standard changes in AWS need to have the request fulfillment automated, to ensure that the change is done in a controlled and predictable manner. These tasks can include granting user access, creating internal low-risk AWS resources, or AWS account-level configuration changes that are low risk and low impact to your AWS environment. Standard changes are commonly automated via AWS Service Catalog or AWS Systems Manager (SSM) Automation.

Note: Recurring changes that are tasks with low risk and low impact should be categorized as standard and auto-approved. Change management should focus on enablement, and not become a bureaucratic roadblock to delivering value.

Normal change: Normal changes in AWS require a formal review and approval from your identified Change Advisory Board (CAB). These types of changes should define what AWS services they will impact, how the service will be impacted, and provide a plan to implement and roll back as needed. Normal changes in your AWS environment need to be communicated in advance to the necessary stakeholders, given the scope of the change within your AWS environment. Normal changes in AWS are considered to be broad one-time changes, which include activities such as purchasing Savings Plans and Reserved Instances, enabling AWS Organizations configurations, setting AWS account-level configurations that can impact multiple AWS accounts or services, and so on.
Emergency change: Emergency changes in AWS are changes whose implementation is expedited to remediate an ongoing operational or security issue. Implementing an emergency change still involves a documentation and approval process; however, the process should be streamlined so the change can be implemented in a timely manner. Using an emergency CAB (ECAB) for your AWS environment facilitates the necessary speed to implement. Emergency changes in AWS often require temporary elevated privileges in an AWS account.

Prioritization
Change prioritization is based on the urgency and impact of the change. The change requestor suggests the initial impact and urgency. To help change requestors identify impact and urgency appropriately, provide matrices that define each level. This assists in expediting and documenting changes to your environment. After further assessment, the CAB can modify or update the initial prioritization.

The following table lists the criteria for determining the impact of a change and the approval actions needed to address the impact:

Impact level	Definition
Extensive	There is significant business service impact based on the broad scope of the change. The change will impact multiple AWS accounts or workloads, which in turn will impact multiple customer-facing services. The RFC must be discussed in the CAB meeting and approved.
Significant	There is clear service impact, because the change would impact at least one customer-facing service. The RFC must be discussed in the CAB meeting and approved.
Moderate	There is little impact on current services that support customer-facing workloads, and the scope of the change is more than one AWS account.
Minor	The change is small in scope, impacting only one AWS account, and no customer-facing workloads.

The following table lists the criteria for determining the urgency of a change:

Urgency	Action Required
Critical	The change is immediately necessary to prevent severe continued or imminent business impact. The change is sent to the emergency CAB for approval.
High	The change is needed as soon as possible because of potentially damaging service impact. The change will be immediately implemented once approved.
Medium	The change will solve minor degraded performance or an irritating problem. This change can be scheduled, and is not immediately needed once approved by CAB.
Low	The change will lead to improvements, changes in workflow, or configuration. This change can be scheduled, and is not immediately needed once approved by CAB.

Based on impact and urgency, a priority can be established for any change request. The following matrix correlates impact and urgency results to define the priority for a change within your cloud environment.

Priority Criteria:

	Urgency
IMPACT	Critical	High	Medium	Low
Extensive	1	1	2	4
Significant	1	2	3	4
Moderate	2	2	3	4
Minor	2	3	3	4
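
The priority matrix above can be encoded as a lookup so that an RFC tool assigns priority consistently. This is a minimal sketch; the `PRIORITY` table and the `priority` function are illustrative names, and the values are copied directly from the matrix.

```python
URGENCIES = ("Critical", "High", "Medium", "Low")

# Rows are impact levels; columns follow the order of URGENCIES.
PRIORITY = {
    "Extensive":   (1, 1, 2, 4),
    "Significant": (1, 2, 3, 4),
    "Moderate":    (2, 2, 3, 4),
    "Minor":       (2, 3, 3, 4),
}


def priority(impact: str, urgency: str) -> int:
    """Look up the priority for an impact/urgency pair from the matrix."""
    return PRIORITY[impact][URGENCIES.index(urgency)]
```

For example, a change with Significant impact and High urgency resolves to priority 2. An unknown impact or urgency value raises an error rather than silently defaulting, which is usually the safer behavior for an intake form.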

Approval Process
Establishing a Change Advisory Board (CAB) gives you a group responsible for reviewing and approving normal or emergency changes. It is an advisory body that supports change authorization and assists change management with change assessment, prioritization, and scheduling. The CAB can also decide which standard changes can be implemented without a CAB review of each requested change. To define the membership of your CAB, you can use your cloud team, or a subset of that team.

A cloud team can take responsibility for establishing change management review and approval processes within your AWS environment. An effective cloud team starts small, develops an approach for implementing cloud technology at scale for your organization, and can become the fulcrum by which your organization transforms the way technology serves the business. This team can also serve as the governing body that convenes, reviews requested changes, coordinates, approves, and communicates change.

The cloud team drives the established standards across your organization, helping to deliver value-added change while identifying and reducing potentially restrictive and unnecessary bureaucratic processes. Additionally, the cloud team can identify changes that can be categorized as standard changes and developed into an automated process for self-service change request fulfillment.

Define the required information included in the change request

When first implementing change management in AWS, you may use your existing change management approach to manually record any and all changes made to your AWS environment. You may need to update your change management process to include new data points for your RFC process. Consider the following data points to collect from your RFC submission process that reflect key AWS specific details for change assessment.

RFC data point	Description	AWS environment considerations
Submitted By	Point of contact for change initiation	Map to business unit or workload points of contact
Date Requested	Date the request for change was submitted
Title of Change	Identifiable title of the change
Description of Change	Brief description of the change
Justification of Change	Reason the change is necessary	Business reason for changing AWS account(s) baseline
Identified Risks	A list of identified risks and their risk mitigation efforts	What are the known risks, and how are they mitigated or rolled back in the AWS environment?
Impact Scope	The impact the change will have within the environment	Consider defining the scope by AWS Organization, Account IDs, Workloads, or Workload environments. Define the impacted users, such as internal users or external customers.
Schedule	Schedule of the maintenance window for the change to be completed in	What other scheduled events operate within the impacted AWS environment
Suggested Implementation	Plan to implement the change	What configuration items in the AWS environment will change or be impacted
Rollback	Plan to roll back the change if necessary	All change requests should include a plan to roll back requested changes to the previous baseline. The plan should detail the rollback steps and the estimated timeline.
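
The data points in the table above could be captured in a simple record type so that incomplete submissions are caught before they reach the approving authority. This is a sketch only; the `RequestForChange` class, its field names, and the `missing_fields` helper are hypothetical and not part of any AWS service.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class RequestForChange:
    submitted_by: str
    date_requested: date
    title: str
    description: str
    justification: str
    identified_risks: list   # risks with their mitigation efforts
    impact_scope: list       # e.g. account IDs, workloads, or environments
    schedule: str            # maintenance window
    suggested_implementation: str
    rollback: str            # plan to return to the previous baseline

    def missing_fields(self) -> list:
        """Names of data points that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

A submission form could reject any RFC for which `missing_fields()` returns a non-empty list, for example one that omits the rollback plan.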

Identify a Change Management Tool(s) to manage and track changes

AWS Config
To manage configuration items in the AWS Cloud, AWS Config can be used to assess, audit, and evaluate the configuration of AWS resources, allowing you to continuously monitor and record AWS resource configurations. With AWS Config, you can track the relationships among resources, and review resource dependencies prior to making changes. Once a change occurs, you can quickly review the history of the resource's configuration, and determine what the resource’s configuration looked like at any point in the past. AWS Config provides you with information to assess how a change to a resource configuration would affect your other resources, which minimizes the impact of change-related incidents.

If you have not set up AWS Config already, you can set it up in one of the following ways:

Console or CLI – You can manually enable AWS Config using the AWS Config console or CLI. Refer to Getting started with AWS Config in the AWS Config Developer Guide.
AWS CloudFormation template – If you have integrated with AWS Organizations or want to enable AWS Config on a large set of accounts, you can easily enable AWS Config with the CloudFormation template Enable AWS Config. To access this template, refer to AWS CloudFormation StackSets sample templates in the AWS CloudFormation User Guide. For more details about using this template, refer to Managing AWS Organizations accounts using AWS Config and AWS CloudFormation StackSets.
GitHub script – Security Hub offers a script in GitHub that allows you to enable multiple accounts across Regions. This script is useful if you have not integrated with Organizations, or if you have accounts that are not part of your organization. When you use this script to enable Security Hub, it also automatically enables AWS Config for these accounts.

For detailed instructions on deploying AWS Config within your environment, refer to Enabling and configuring AWS Config. Based on the scope you defined regarding what configuration items you will monitor and what environments are part of your change management, deploy AWS Config to the appropriate AWS accounts and Regions.

Note: AWS Config is a Region-based service, which means you will need to deploy AWS Config to all applicable Regions you have enabled for your AWS environment. To deploy to multiple Regions, use the AWS CloudFormation template with a StackSet deployment from your operations tooling account, or from any AWS account to which you have delegated administration of CloudFormation through AWS Organizations.

Note: For AWS Control Tower users, the AWS Config deployment with change management tracking is already accomplished for you when Control Tower is deployed. Refer to the Control Tower Govern Resource Configurations with AWS Config documentation for the latest details on the deployment and its associated configurations.

Once AWS Config is deployed, you can view the history of your configuration items. Refer to Viewing Configuration History in the AWS Config documentation for the latest details. The configuration data recorded by AWS Config can also be queried programmatically.
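
As one way to review configuration history programmatically, the sketch below wraps the boto3 `config` client's `get_resource_config_history` call. The function name `config_history` and the default limit are assumptions for illustration; the client is passed in so the function can be exercised without AWS access, and pagination is omitted for brevity.

```python
def config_history(config_client, resource_type: str, resource_id: str, limit: int = 10) -> list:
    """Return (capture time, status) pairs, newest first, for one resource."""
    response = config_client.get_resource_config_history(
        resourceType=resource_type,
        resourceId=resource_id,
        chronologicalOrder="Reverse",  # most recent configuration item first
        limit=limit,
    )
    return [
        (item["configurationItemCaptureTime"], item["configurationItemStatus"])
        for item in response["configurationItems"]
    ]


# Usage (requires boto3 and credentials; the resource ID is a placeholder):
#   import boto3
#   history = config_history(boto3.client("config"),
#                            "AWS::EC2::SecurityGroup", "sg-0123456789abcdef0")
```

Comparing consecutive entries in the returned list is a quick way to confirm when an approved change actually took effect.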

Establish a change scheduling and communication process

Change scheduling and communication in AWS requires that your AWS operations and development teams are made aware of upcoming, in-progress, and completed changes. A principle of modern change management in AWS is to make many small incremental changes one at a time; this is core to Agile development. For your foundational environment, some of your changes will be broad and impact the entire AWS organization. These types of changes require that you schedule and communicate the change request and change implementation.

All your AWS accounts should establish email distribution lists that will be used to notify the account owners and stakeholders of change details, change status, and any change monitoring alerts configured. AWS provides management services and built-in contacts to help identify those contacts, and target your scheduling and communication practices appropriately. Your cloud team should establish a change communication plan that includes some of the following email distribution groups when applicable to your AWS environment:

Distribution group	Description
[Account-Alias]-ChangeNotification	Communications sent to this distribution group are sent to all stakeholders of the impacted AWS account.
[Workload]-ChangeNotification	Distribution groups created per AWS workload operating in your environment. This type of communication is only sent when the applicable workload is impacted.
Production-ChangeNotification	Communications sent to this distribution group are sent to all stakeholders who are impacted by changes to all production accounts.
Development-ChangeNotification	Communications sent to this distribution group are sent to all stakeholders who are impacted by changes to all development accounts.
Sandbox-ChangeNotification	Communications sent to this distribution group are for changes that impact all sandbox accounts in your AWS environment.
AwsOrganization-ChangeNotification	Communications sent to this distribution group are for when changes impact all accounts operating in an AWS Organization.

Amazon Simple Notification Service
Amazon Simple Notification Service (Amazon SNS) is a fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication. Use Amazon SNS to create and configure topics that can target specific stakeholders per account for any change notification. Subscribe the email addresses of your group distribution lists to these topics. Create an SNS topic for each AWS account to communicate any change details that impact the applicable AWS account, and use those SNS topics to send change monitoring alerts. To configure SNS topics, refer to Configuring Amazon SNS.
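
A per-account topic with an email subscription might be set up along the following lines. This is a minimal sketch using the boto3 `sns` client's `create_topic`, `subscribe`, and `publish` calls; the function names, the topic naming pattern (borrowed from the distribution group table above), and the message format are assumptions for illustration.

```python
def create_change_topic(sns_client, account_alias: str, distribution_list: str) -> str:
    """Create the per-account topic and subscribe a distribution list by email."""
    topic_name = f"{account_alias}-ChangeNotification"
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # Email subscriptions stay pending until the recipient confirms them.
    sns_client.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint=distribution_list)
    return topic_arn


def notify_change(sns_client, topic_arn: str, title: str, status: str, details: str) -> None:
    """Publish a change status update to the topic."""
    subject = f"[{status}] {title}"[:100]  # SNS subjects are limited to 100 characters
    sns_client.publish(TopicArn=topic_arn, Subject=subject, Message=details)


# Usage (requires boto3 and credentials; names are placeholders):
#   import boto3
#   sns = boto3.client("sns")
#   arn = create_change_topic(sns, "prod-payments", "prod-payments-changes@example.com")
#   notify_change(sns, arn, "Update SCP", "Scheduled", "Window: Saturday 02:00 UTC")
```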

AWS Account Alternate Contacts
AWS provides standardized alternate contacts per account, which can reflect the direct stakeholders that need to be informed when a change impacts the AWS account. Ensuring this information is filled in per AWS account in your environment helps facilitate the change communication process, as well as targeting a specific group of stakeholders by impacted AWS accounts.

Info: AWS account alternate contacts are used by AWS to send business-related communications to the point of contact. Ensure you provide the appropriate point of contact, as AWS will reach out to each contact based on the business functionality.

The available alternate contacts are based on general business functions of the AWS account, which include Billing, Operations, and Security. AWS recommends using email distribution lists that enable multiple recipients to receive notifications from a single email address. Follow the instructions at Accessing or updating the alternate contacts to add, update, or remove alternate contacts from the individual account, or from all accounts using AWS Organizations.
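
Alternate contacts can also be set through the API, which helps when many accounts need the same distribution list. The sketch below uses the boto3 `account` client's `put_alternate_contact` call; the function name and all contact values are placeholders, and passing `AccountId` assumes the call is made from the management account or a delegated administrator.

```python
def set_operations_contact(account_client, account_id: str, email: str,
                           name: str, phone: str, title: str) -> None:
    """Set the Operations alternate contact for one member account."""
    account_client.put_alternate_contact(
        AccountId=account_id,
        AlternateContactType="OPERATIONS",  # other types: BILLING, SECURITY
        EmailAddress=email,                 # ideally a distribution list
        Name=name,
        PhoneNumber=phone,
        Title=title,
    )


# Usage (requires boto3 and credentials; values are placeholders):
#   import boto3
#   set_operations_contact(boto3.client("account"), "111111111111",
#                          "prod-ops@example.com", "Production Operations",
#                          "+1-555-0100", "Operations Team")
```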

Define change management request fulfillment process
- Scenario
- Overview
- Implementation
- Scenario
- Define change management request fulfillment process
  
  Define the standard process for each change management category to be implemented
  
  Deploy changes via automation using standardized processes
  
  Develop common change management runbooks
  
  Develop pre-approved change management templates for automation
  
  Integrate the change management process with the change management fulfillment system
- Overview
- Approved RFCs need to go through a process of fulfillment, which includes scheduling and implementing the approved change. A key consideration is understanding the risk of deploying the change to the cloud. The goal of the change management fulfillment process is to understand the risk and provide risk migrating efforts. A proven strategy to mitigating change risk is to standardize the method to implement types of change. Organizations can create their own runbooks to implement a standardized or coded series of steps that include change validation and potential rollback plans. This ensures repeatability and consistency across multiple environments, as well as enabling automation of software testing, compliance testing, security testing, and functional testing.
  
  The change management fulfillment process should always be working to improve itself, and add to its runbooks to facilitate faster change with mitigated risks. As you develop your own runbooks, review your categories of change and attempt to move any normal change to an automated standard change that can be self-service, using an integrated change management service or solution.
- Implementation
- Define the standard process for each change management category to be implemented
  
  Standard Change in AWS
  Identified standard changes in AWS need to have the change request fulfillment automated to ensure that the change is done in a controlled and predictable method. AWS provides AWS Systems Manager Automation runbooks that allow your team to create workflows based on scripts, API actions, AWS Lambda functions, AWS Step Functions, and other various AWS automation services.
  
  Normal Change in AWS
  Normal changes in your foundational environment are often manually implemented during a coordinated and communicated change or maintenance window. The change should go through the RFC process you establish, and be reviewed and approved by the CAB. Normal changes should be communicated prior to change to the appropriate AWS account stakeholders within your environment. Upon successful implementation of the change, communication of change closure to the same stakeholders should follow. As you identify types of recurring normal changes, you should work to create AWS SSM Automation runbooks that can safely fulfill the change request.
  
  Emergency Change in AWS
  Within AWS, emergency changes are defined as those that need to be implemented as soon as possible to remediate an impacted service. These types of changes are implemented manually and often require elevated access within your environment. They can be completed with a properly scoped break glass role within the impacted AWS account or environment. Changes implemented as emergency changes should be reviewed and approved by an emergency CAB (ECAB) team member (the ECAB is a subset of the CAB group membership). The approving ECAB team member must be capable of taking on the responsibility of approving a change without seeking to convene the CAB for approval.
  
  Deploy changes via automation using standardized processes
  
  Systems Manager Automation
  Using AWS Systems Manager, you can automate operational tasks to help make your teams more efficient and create a standardized process to implement changes categorized as standard. With automated approval workflows and runbooks with rich text descriptions, you can reduce human error, impose risk mitigating operations, and simplify the process to impose change on AWS resources. You can use predefined automation runbooks, or build your own to share for common operational tasks such as stopping and restarting an EC2 instance. Systems Manager also has built-in safety controls, enabling you to incrementally roll out changes and automatically halt the roll-out if errors occur.
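The incremental roll-out and automatic halt described above correspond to the rate-control settings of an Automation execution. The following is a minimal sketch, not a definitive implementation: it only builds the request arguments, and the tag key, tag value, and default thresholds are illustrative assumptions. It assumes the target runbook accepts an `InstanceId` parameter, as the AWS-supplied `AWS-RestartEC2Instance` runbook does.

```python
def build_rate_controlled_execution(document_name, tag_key, tag_value,
                                    max_concurrency="10%", max_errors="1"):
    """Build keyword arguments for a rate-controlled SSM Automation execution.

    MaxConcurrency limits how many targets are changed at the same time.
    MaxErrors stops the roll-out to further targets once that many fail.
    Both are passed as strings (a count or a percentage).
    """
    return {
        "DocumentName": document_name,
        # Name of the runbook parameter that receives each target (assumed).
        "TargetParameterName": "InstanceId",
        # Select targets by tag; the tag key and value are hypothetical.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "MaxConcurrency": max_concurrency,
        "MaxErrors": max_errors,
    }


kwargs = build_rate_controlled_execution(
    "AWS-RestartEC2Instance", "Environment", "test")
# With boto3 installed and credentials configured, these arguments could be
# passed on as: boto3.client("ssm").start_automation_execution(**kwargs)
```

Starting with a low concurrency and a low error threshold keeps the blast radius of a faulty standard change small.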
  
  Using the predefined AWS-supplied SSM Automation runbooks will assist you in initially qualifying what is a normal change in your environment, and in identifying the tasks you can implement using a self-service change fulfillment process via SSM Automation. Start by reviewing the list of available AWS SSM documents: log in to the SSM web console (login required) and search under the shared resources link in the left-side vertical menu. To find other ways to search through the predefined AWS-supplied SSM documents, refer to Searching for SSM documents.
  
  SSM documents promote management as code for remotely managing instances, ensuring desired states for your resources, and automating IT operations. You can use predefined documents, or you can author your own in JSON or YAML. Your change management process should initially start out using the supplied SSM documents to assist in deploying your normal changes.
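To illustrate what authoring your own document might look like, here is a hedged sketch of a small Automation runbook assembled in Python and serialized to JSON. The step names and description are hypothetical. The structure (schema version `0.3`, `mainSteps`, the `aws:changeInstanceState` action) follows the Automation document schema, but verify it against the current Systems Manager documentation before use.

```python
import json


def build_restart_runbook():
    """Return a minimal Automation runbook that stops, then starts, an instance."""
    instance_ids = ["{{ InstanceId }}"]  # resolved by SSM at execution time
    return {
        "schemaVersion": "0.3",
        "description": "Restart one EC2 instance as a standard change.",
        "parameters": {
            "InstanceId": {
                "type": "String",
                "description": "ID of the instance to restart.",
            },
        },
        "mainSteps": [
            {
                "name": "stopInstance",  # hypothetical step name
                "action": "aws:changeInstanceState",
                "inputs": {"InstanceIds": instance_ids,
                           "DesiredState": "stopped"},
            },
            {
                "name": "startInstance",  # hypothetical step name
                "action": "aws:changeInstanceState",
                "inputs": {"InstanceIds": instance_ids,
                           "DesiredState": "running"},
            },
        ],
    }


runbook_json = json.dumps(build_restart_runbook(), indent=2)
# The JSON string could then be registered with SSM, for example through the
# CreateDocument API with DocumentType "Automation" and DocumentFormat "JSON".
```

Keeping the runbook in source control lets changes to the change procedure itself be reviewed like any other code.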
Recover from failed change operations
- Scenario
- Recover from failed change operations
  
  Integrate contingency planning for change management process for failed changes
  
  Define standards for remediation actions taken to recover after a failed change
  
  Automate the detection and remediation of failed changes
- Overview
- Changes should not be approved without considering the consequences of a failure or degraded performance outcome. There should be a back-out or rollback plan which will restore the resource to its initial baseline configuration prior to the implemented change. The cloud enables rollback plans to be fully automated using repeatable processes. Not all changes are reversible. However, the process to recover from failed changes should be documented, and the risk accepted when approving the change. If failed changes have a lower impact due to the speed and consistency of rollback, activating rollbacks should be considered as part of the normal process. This is particularly true if it’s possible to quickly remediate the issue and push it through the same automated pipelines, to quickly deliver the original intended business value of the change.
- Implementation
- Integrate contingency planning for change management process for failed changes
  
  Changes to your AWS resources should have a beneficial impact on your environment, and meet the desired outcome defined in your RFC. Changes that have an adverse impact on your AWS environment should be rolled back to return the impacted resource to its previous baseline.
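The rollback-to-baseline idea can be sketched generically: capture the baseline, apply the change, validate the outcome, and restore the baseline if validation fails. The function below is an illustration only. It models a resource as a plain dictionary, and the `change` and `validate` callables are hypothetical stand-ins for your own runbook steps.

```python
import copy


def apply_change_with_rollback(resource, change, validate):
    """Apply `change` to `resource`; restore the baseline if it fails.

    Returns True when the change was applied and validated, and False when
    the resource was rolled back to its baseline configuration.
    """
    baseline = copy.deepcopy(resource)  # snapshot taken before the change
    try:
        change(resource)
        succeeded = bool(validate(resource))
    except Exception:
        # Any error while applying or validating counts as a failed change.
        succeeded = False
    if not succeeded:
        # Roll back in place so every holder of `resource` sees the baseline.
        resource.clear()
        resource.update(baseline)
    return succeeded


config = {"instance_type": "t3.micro"}
failed = apply_change_with_rollback(
    config,
    change=lambda r: r.update(instance_type="t3.large"),
    validate=lambda r: False,  # simulate a failed post-change check
)
```

After the simulated failure, `config` holds its original value again, which is the behavior a documented back-out plan should guarantee.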
Establish mechanisms to assess, review, and monitor change
- Scenario
- Establish mechanisms to assess, review, and monitor change
  
  Assess and evaluate the impact of a change
  
  Track environment baseline changes
  
  Monitor environment baselines for change
  
  Audit all change management activity and timelines
- Overview
- All changes within the scope of your change management should be monitored and tracked for approved and out-of-band changes within the environment. Every change that is approved and implemented should have a positive impact on the overall service, such as a security gain or performance gain within the environment. Changes that are tracked and have adverse effects on the environment should be rolled back immediately, which will restore the baseline to its previous state. The goal of continuously monitoring your baseline and alerting on configuration item changes is to ensure that every change introduced into the environment is controlled. Changes that happen outside of the management process need to be tracked down and reverted to their previous baseline, or documented in your change management system.
- Implementation
- Assess and evaluate the impact of a change
  
  With every change in your AWS environment, you need to assess the change to ensure that it has been done correctly, and then monitor to ensure that change does not happen to your resources outside of the change management process.
  
  AWS Config Aggregator Advanced Queries
  AWS Config provides the ability to query the current configuration state of AWS resources, based on configuration properties for a single account and AWS Region, or across multiple AWS accounts and Regions. You can perform as-needed, property-based queries against current AWS resource state metadata, across a list of resources that AWS Config supports.
  
  To centralize the assessment of changes made within your AWS environment, use the AWS Organizations integration with AWS Config, and set up an AWS Config Aggregator in a Security Tooling account. For more information, refer to Multi-Account Multi-Region Data Aggregation.
  
  Important: If you are using AWS Control Tower, the AWS Config aggregator is already deployed for you when Control Tower is deployed. Refer to the audit account services described in Control Tower’s How AWS Control Tower Works documentation for the latest details on the deployment and its associated configurations.
  
  Once the aggregator is delegated and configured, all AWS Config data is returned to the centralized delegated account. If the AWS Config service is not set up in a member account, the aggregator will not collect details for that account. The delegation of the AWS Config aggregator does not set up AWS Config in the member accounts; it only reads the AWS Config data if it is present.
  
  Using the AWS Config Aggregator, you can write custom queries against the reported data, allowing you to see the current state of all configurations. The AWS Config Aggregator data lets you assess and verify the current state of a resource configuration, to validate that a change was correctly executed. Refer to Querying the Current Configuration State of AWS Resources for details on how to create your own AWS Config advanced queries in the AWS Management Console or AWS Command Line Interface (AWS CLI).
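As a hedged example of such a query, the sketch below builds an advanced-query expression and decodes results. AWS Config advanced queries use a SQL-like `SELECT ... WHERE ...` form without a `FROM` clause. The helper names and the chosen fields are illustrative assumptions.

```python
import json


def build_config_query(resource_type,
                       fields=("resourceId", "accountId", "awsRegion")):
    """Build an AWS Config advanced-query expression for one resource type."""
    return f"SELECT {', '.join(fields)} WHERE resourceType = '{resource_type}'"


def parse_results(response):
    """Decode the JSON strings returned in the 'Results' list of a response."""
    return [json.loads(row) for row in response.get("Results", [])]


query = build_config_query("AWS::EC2::Instance")
# With boto3, the expression could be run against an aggregator, for example:
#   boto3.client("config").select_aggregate_resource_config(
#       Expression=query, ConfigurationAggregatorName="<your-aggregator>")
# The aggregator name is a placeholder you must supply.
```

Comparing the returned properties with the values stated in the RFC gives a quick post-change verification across accounts and Regions.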
  
  CloudTrail
  AWS CloudTrail is an AWS service that helps you enable operational and risk auditing, governance, and compliance of your AWS account. Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Events include actions taken in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs. These events can be filtered to identify changes or modifications that are made to your AWS resources under change control.
  
  CloudTrail is enabled on your AWS account by default, and tracks all management events. When activity occurs in your AWS account, that activity is recorded in a CloudTrail event. Without creating a trail, CloudTrail stores events for 90 days. You can view recent events in the CloudTrail console by going to Event history. To retain CloudTrail events beyond the default 90 days in your AWS account and to allow other services such as CloudWatch Logs to be integrated, refer to create a trail in the AWS CloudTrail documentation.
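Filtering CloudTrail events for changes can be sketched as follows. This example works on already-parsed CloudTrail records (dictionaries) and keeps only non-read-only events from the services under change control. The sample records and the helper name are hypothetical. The field names (`eventSource`, `eventName`, `readOnly`, `userIdentity`) follow the CloudTrail record format.

```python
def find_change_events(records, watched_sources):
    """Return a summary of mutating events for the watched AWS services."""
    changes = []
    for record in records:
        if record.get("readOnly") is True:
            continue  # Describe/List/Get calls do not change resources
        if record.get("eventSource") not in watched_sources:
            continue  # service is outside the change-control scope
        changes.append({
            "time": record.get("eventTime"),
            "action": record.get("eventName"),
            "actor": record.get("userIdentity", {}).get("arn"),
        })
    return changes


# Hypothetical sample records, shaped like CloudTrail events.
sample = [
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances",
     "readOnly": True, "eventTime": "2024-01-01T10:00:00Z",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "StopInstances",
     "readOnly": False, "eventTime": "2024-01-01T10:05:00Z",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
]
changes = find_change_events(sample, {"ec2.amazonaws.com"})
```

The resulting list can be matched against approved RFCs to confirm who made each change and when.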
  
  Important: If you are using AWS Control Tower, the CloudTrail deployment can be accomplished for you on deployment of Control Tower. Refer to the Control Tower Monitoring Events with CloudTrail documentation for the latest details on the deployment and its associated configurations.
  
  Amazon CloudWatch
  Amazon CloudWatch is a monitoring and observability service built for DevOps engineers, developers, site reliability engineers (SREs), IT managers, and product owners. CloudWatch provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, and optimize resource utilization. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events. This data can be used to monitor resources for change or for adverse impacts from a change.
  
  A CloudTrail trail can send the event data to CloudWatch Logs for analyzing and detecting events that would change your AWS resources. For detailed guidance on setting up AWS CloudTrail to send events to Amazon CloudWatch Logs, refer to the AWS CloudTrail documentation for Sending events to CloudWatch Logs. Then create alarms that look for create, update, or delete activity on AWS resources such as VPCs, AWS IAM entities, Network ACLs, Security Groups, CloudTrail changes, and so on.
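One way to detect such activity is a CloudWatch Logs metric filter on the CloudTrail log group, with an alarm on the resulting metric. The sketch below only builds the filter pattern and the request arguments. The log group, filter, metric, and namespace names are placeholders, and the event list covers security group changes as one example.

```python
SECURITY_GROUP_EVENTS = [
    "AuthorizeSecurityGroupIngress",
    "AuthorizeSecurityGroupEgress",
    "RevokeSecurityGroupIngress",
    "RevokeSecurityGroupEgress",
    "CreateSecurityGroup",
    "DeleteSecurityGroup",
]


def build_event_name_pattern(event_names):
    """Build a CloudWatch Logs JSON filter pattern matching any event name."""
    clauses = [f'($.eventName = "{name}")' for name in event_names]
    return "{ " + " || ".join(clauses) + " }"


def build_metric_filter(log_group, event_names):
    """Build arguments for a metric filter that counts matching events."""
    return {
        "logGroupName": log_group,
        "filterName": "SecurityGroupChanges",       # placeholder name
        "filterPattern": build_event_name_pattern(event_names),
        "metricTransformations": [{
            "metricName": "SecurityGroupChangeCount",  # placeholder name
            "metricNamespace": "ChangeManagement",     # placeholder name
            "metricValue": "1",  # each matching log event adds 1
        }],
    }


metric_filter = build_metric_filter("my-cloudtrail-log-group",
                                    SECURITY_GROUP_EVENTS)
# With boto3, this could be applied with:
#   boto3.client("logs").put_metric_filter(**metric_filter)
```

An alarm with a threshold of one or more events per period on this metric would then notify your team whenever a security group is modified.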
  
  Track environment baseline changes
  
  All changes within the scope of your defined Change Management process should introduce planned and controlled change. Changes that are implemented outside of the Change Control process need to be identified, assessed, and reintroduced to the Change Management process to control and mitigate risks. AWS Config provides the ability to track changes and centrally report them to your stakeholders.
  
  AWS Config Delivery Channel
  
  Important: If you are using AWS Control Tower, the AWS Config deployment with change notification is already accomplished for you on deployment of Control Tower. Using the SNS topic aws-controltower-AllConfigNotifications allows any changes within the environment to be reported to a centralized source. Refer to the Control Tower Guardrail compliance notifications by SNS documentation for the latest details and configurations.
  
  As AWS Config continuously records the changes that occur to your AWS resources, it sends notifications and updated configuration states through the delivery channel, which can target your change management distribution groups. You can manage the delivery channel to control where AWS Config sends configuration updates. Notifications are sent only when AWS Config detects a change to a resource’s baseline. When using the AWS Config Delivery Channel, you should line up controlled changes being deployed with notifications of the resources changed. Additionally, when no change is scheduled to happen, the notification delivery can inform your team that changes are being made outside of the change management process. These types of out-of-band changes, if within scope and not categorized as normal change, need to be tracked down and assessed to see what actor made the change, the reason for the change, and how to include that change in the documented change process.
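Lining up notifications with scheduled changes can be sketched as a simple check: a configuration change is treated as out-of-band when no approved change window covers that resource at the capture time. This is an illustration under assumptions. The change-window structure is hypothetical, and the message fields (`messageType`, `configurationItem`, `configurationItemCaptureTime`) follow the AWS Config notification format as understood here, so confirm them against a real notification.

```python
from datetime import datetime, timezone


def is_out_of_band(message, change_windows):
    """Return True if a Config change notification matches no change window."""
    if message.get("messageType") != "ConfigurationItemChangeNotification":
        return False  # not a resource change, nothing to assess
    item = message["configurationItem"]
    captured = datetime.fromisoformat(
        item["configurationItemCaptureTime"].replace("Z", "+00:00"))
    for window in change_windows:
        same_resource = window["resourceId"] == item["resourceId"]
        if same_resource and window["start"] <= captured <= window["end"]:
            return False  # change happened inside an approved window
    return True


# Hypothetical approved change window for one resource.
windows = [{
    "resourceId": "sg-0123456789abcdef0",
    "start": datetime(2024, 1, 1, 22, 0, tzinfo=timezone.utc),
    "end": datetime(2024, 1, 1, 23, 0, tzinfo=timezone.utc),
}]
notification = {
    "messageType": "ConfigurationItemChangeNotification",
    "configurationItem": {
        "resourceId": "sg-0123456789abcdef0",
        "configurationItemCaptureTime": "2024-01-02T03:15:00.000Z",
    },
}
flagged = is_out_of_band(notification, windows)
```

Here the change was captured hours after the window closed, so it would be flagged for follow-up.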
  
  Note: AWS Config is a Region-based service, which means you will need to deploy AWS Config to all Regions you have enabled for your AWS environment.
  
  When setting up AWS Config, you must create a configuration recorder and a delivery channel. The delivery channel requires an S3 bucket to which it can write historic data for persistent log storage. Instead of setting up an S3 bucket per delivery channel, deploy a centralized logging solution within your AWS environment. Refer to the Log Storage capability. The delivery channel also lets you assign an SNS topic, and changes will be emailed to any email address subscribed to that SNS topic. Apply the [Account-Alias]-ChangeNotification created in section CF30 - S1: Establish a change scheduling and communication process to the delivery channel. All changes to any resource monitored by the AWS Config recorder will then be emailed to members of the email distribution group.
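A delivery channel definition that combines the centralized S3 bucket and the SNS topic might look like the sketch below. The bucket name and topic ARN are placeholders, and the helper is hypothetical. The field names and delivery frequency values follow the AWS Config PutDeliveryChannel API.

```python
VALID_FREQUENCIES = {
    "One_Hour", "Three_Hours", "Six_Hours", "Twelve_Hours", "TwentyFour_Hours",
}


def build_delivery_channel(bucket, topic_arn, name="default",
                           frequency="TwentyFour_Hours"):
    """Build a delivery channel definition for AWS Config."""
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"unsupported delivery frequency: {frequency}")
    return {
        "name": name,
        "s3BucketName": bucket,    # centralized log archive bucket
        "snsTopicARN": topic_arn,  # topic your change stakeholders subscribe to
        "configSnapshotDeliveryProperties": {"deliveryFrequency": frequency},
    }


channel = build_delivery_channel(
    "example-central-config-logs",  # placeholder bucket name
    "arn:aws:sns:us-east-1:111122223333:example-ChangeNotification",  # placeholder
)
# With boto3, this could be applied in each Region with:
#   boto3.client("config").put_delivery_channel(DeliveryChannel=channel)
```

Because AWS Config is a Region-based service, the same definition would need to be applied in every enabled Region.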
