Often when working with customers, we guide them by using AWS Budgets and related tools in the AWS platform in order to create cost and utilization guardrails. These tools can be used to conduct advanced, automated, and hands-free actions within your AWS environment – even across multiple accounts. This post will walk you through a fully automated approach to create a forecast-based mechanism in order to alert your developers when their spend is approaching a warning threshold. It will then automatically shut down their EC2 instances if their forecasted spend for the month will exceed a defined value.
This solution utilizes integrations with AWS Organizations and AWS CloudFormation in order to deploy a budget to every account in a specific organizational unit in your organization. In turn, this budget will send notifications through Amazon Simple Notification Service (SNS) when forecasted thresholds are exceeded. Then, we will utilize these SNS notifications to execute an AWS Lambda function that will shut down every EC2 instance that is not tagged as critical in a single region.
Some important notes about this solution:
- We use a CloudFormation stack as part of a multi-account organization. However, you can also use the stack in a single-account context.
- The stack presented here is not safe for production environment deployment as-is, and it is intended only for use in a development or test environment. As such, you must be careful and certain of where you deploy it.
- Utilizing a budget notification with a Lambda function creates an extensible solution that allows nearly limitless possibilities for you to create your own cost-control measures. While you can use this stack as-is, we consider it a good starting-place for far more creative solutions.
Prerequisites
There are two prerequisites for this automated solution to be deployed in accounts within an organization:
- In order for AWS Budgets to be created, use the management account of your organization to enable Cost Explorer (see this page for guidance).
- Trusted access for AWS CloudFormation StackSets must be enabled for your organization (see this page for guidance).
About AWS Budgets
AWS Budgets lets you set custom budgets to track your cost and usage from the simplest to the most complex use cases. AWS Budgets also supports email or SNS notification when actual or forecasted cost and usage exceed your budget threshold, or when your actual Reserved Instance and Savings Plans’ utilization or coverage drops below your desired threshold.
AWS Budgets is also integrated with AWS Cost Explorer, so that you can easily view and analyze your cost and usage drivers; with AWS Chatbot, so that you can receive Budget alerts in your designated Slack channel or Amazon Chime room; and with AWS Service Catalog, so that you can track cost on your approved AWS portfolios and products.
Overview of a standard organization
Many customers’ AWS organizations will be similar to the diagram below, with development and production accounts split into discrete organizational units (OUs). Placing accounts into OUs that are mapped to their function lets customers create guardrails around the functionality of these accounts. Typically, these include security controls, such as blocking the provisioning of certain EC2 instance types or blocking the creation of resources in specific regions. In our example, we will utilize the Sandbox OU as the root for a budget and associated automation.
Figure 1: A typical AWS organization
Your organization will vary from this example in many ways. However, you can easily substitute a Sandbox OU for one of your own choosing.
Solution overview
AWS Budgets has two features that we will be using:
- Multiple budget alerts and thresholds can be created, up to a limit of five alerts per budget.
- These alerts can be delivered to an SNS topic, as well as directly to an email address.
As a first step, a warning alert will be delivered to an email address when the forecast spend for an account exceeds a threshold of 80% of its budget. Then, if an account is forecast to spend more than 100% of its budget, another email will be delivered and a Lambda function will be executed. In turn, this function will shut down every EC2 instance in the account that is not tagged as critical (in the same region where you deploy the solution).
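The two-stage behavior described above can be modeled in a few lines of Python. This is an illustrative sketch only, not part of the deployed stack: the function name actions_for_forecast and the action labels are hypothetical, and AWS Budgets performs the real comparison for you.

```python
def actions_for_forecast(forecast, budget, warning_pct=80, critical_pct=100):
    """Return the actions the solution would take for a forecasted monthly spend.

    Hypothetical helper for illustration; thresholds are percentages of the budget.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    actions = []
    # The budget in this post uses GREATER_THAN, so a forecast that lands
    # exactly on a threshold does not trigger the alert.
    if forecast * 100 > warning_pct * budget:
        actions.append("email-warning")
    if forecast * 100 > critical_pct * budget:
        actions.append("email-critical")
        actions.append("stop-noncritical-ec2")
    return actions


# Example: a 1,000 USD budget with the default 80% / 100% thresholds.
warning_only = actions_for_forecast(forecast=850, budget=1000)
over_budget = actions_for_forecast(forecast=1200, budget=1000)
```

With the default thresholds, a forecast of 850 USD against a 1,000 USD budget only produces the warning email, while a forecast of 1,200 USD also triggers the critical email and the shutdown.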
Figure 2: Architecture diagram of our solution
Step 1: Determine your budget and thresholds
Before proceeding, you must determine the total permissible spend per month for each AWS account. As presented in this blog, the CloudFormation stack will apply the same budget to every account in the same OU. However, this is only a starting point, and you can also adapt the solution to have per-account budgets. See Extending the solution below for more details.
You must also decide what your threshold percentages will be for warnings and budgets. You can select your threshold values, though the stack below has default values of 80% for warnings and 100% for critical values. Having a critical threshold of 200% of forecast budget is a valid approach as well, and many customers will routinely allow their teams to exceed their budgets.
Step 2: Create a service control policy
Before creating our budgets and automation, we will create a Service Control Policy (SCP) that will protect them from modification. The four parts of this policy each enforce that only the account that deployed the stack set can modify it.
- Statement1 blocks all roles except for the stack set execution role from modifying a budget.
- Statement2 blocks changes to the Lambda functions that are called by the critical budget threshold.
- Statement3 blocks changes to the SNS topics for the solution.
- Statement4 prevents a user in the account from creating, deleting, or updating an IAM role whose name matches the stack set execution role. Without it, someone with broad IAM privileges could spoof the stack set execution role and bypass the previous three statements.
Note that Statement4 is tailored to using CloudFormation stack sets with service-managed permissions. If you wish to proceed with self-managed permissions, then adjust this stanza accordingly. Details regarding using service-managed permissions for CloudFormation are available on this page.
This SCP is applied to the OUs that you wish to attach your budgets to. You must replace the following values in it before deployment:
- Replace ACCOUNTNUMBER with the account number of the account that deploys the stack set. This can be either the management account or a delegated administrator account. See Register a delegated administrator for more information regarding delegated accounts for CloudFormation.
- Replace STACKNAME with the name of the stack set that you will create in CloudFormation.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Deny",
      "Action": [
        "budgets:ModifyBudget",
        "budgets:UpdateBudgetAction"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement2",
      "Effect": "Deny",
      "Action": [
        "lambda:DeleteFunction",
        "lambda:RemovePermission",
        "lambda:UpdateFunctionCode",
        "lambda:UpdateFunctionConfiguration",
        "lambda:UpdateFunctionEventInvokeConfig"
      ],
      "Resource": [
        "arn:aws:lambda:*:*:function:StackSet-STACKNAME-*-BudgetLambdaFunction-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement3",
      "Effect": "Deny",
      "Action": [
        "sns:DeleteTopic",
        "sns:AddPermission",
        "sns:DeleteEndpoint",
        "sns:RemovePermission",
        "sns:Unsubscribe"
      ],
      "Resource": [
        "arn:aws:sns:*:*:StackSet-STACKNAME-*-CriticalTopic-*",
        "arn:aws:sns:*:*:StackSet-STACKNAME-*-WarningTopic-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::*:role/stacksets-exec-*"
          ]
        }
      }
    },
    {
      "Sid": "Statement4",
      "Effect": "Deny",
      "Action": [
        "iam:CreateRole",
        "iam:DeleteRole",
        "iam:UpdateRole"
      ],
      "Resource": [
        "arn:aws:iam::*:role/stacksets-exec-*"
      ],
      "Condition": {
        "StringNotLike": {
          "aws:PrincipalARN": [
            "arn:aws:iam::ACCOUNTNUMBER:role/aws-service-role/stacksets.cloudformation.amazonaws.com/AWSServiceRoleForCloudFormationStackSetsOrgAdmin"
          ]
        }
      }
    }
  ]
}
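If you prefer to script the placeholder replacement rather than edit the policy by hand, a minimal sketch might look like the following. The function name render_policy, the sample fragment, and the example account number and stack set name are all assumptions made for illustration; in practice you would pass in the full policy text shown above.

```python
import json


def render_policy(template_text, account_number, stack_name):
    """Fill in the ACCOUNTNUMBER and STACKNAME placeholders and validate the result."""
    if not (account_number.isdigit() and len(account_number) == 12):
        raise ValueError("AWS account numbers are 12 digits")
    rendered = (
        template_text
        .replace("ACCOUNTNUMBER", account_number)
        .replace("STACKNAME", stack_name)
    )
    # json.loads raises ValueError if the substitution left invalid JSON behind.
    json.loads(rendered)
    return rendered


# Shortened, hypothetical fragment used only to demonstrate the substitution.
SAMPLE_FRAGMENT = json.dumps({
    "Resource": ["arn:aws:lambda:*:*:function:StackSet-STACKNAME-*-BudgetLambdaFunction-*"],
    "Principal": "arn:aws:iam::ACCOUNTNUMBER:role/example",
})

rendered_fragment = render_policy(SAMPLE_FRAGMENT, "123456789012", "budget-guardrail")
```

Validating the rendered text as JSON before pasting it into the console catches a stray quote or comma early, since the policy editor will otherwise reject the document.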
To create the service control policy, navigate to AWS Organizations, and select Policies from the left-navigation menu. Under the Supported policy types, select service control policies.
Figure 3: Selecting Service control policies
On the service control policy console, click the Create policy button to create a new service control policy.
Figure 4: Creating new policy
Enter a name and description, and paste the policy statements above into the policy editor. Remember to replace ACCOUNTNUMBER and STACKNAME with the values gathered earlier.
Figure 5: Entering policy details
Click the Create Policy button in order to complete the SCP creation.
Next, we will attach the newly created SCP to the target Development organizational unit where we want the policy statements to be in effect. From the available policies screen, select the newly created policy by clicking the check-box on the left-hand side of the policy name. From the Actions list, select Attach policy.
Figure 6: Attaching the policy
In the following screen, we will select the Development organizational unit that is the target for the policy by clicking the radio-button next to the OU name.
Figure 7: Specifying the OU to attach the policy
With this SCP created in advance, your budget, notifications, and Lambda function are protected from the moment that they are provisioned.
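The console steps above can also be scripted. The following is a minimal sketch that assumes you pass in a boto3 Organizations client created in the management account; the function name create_and_attach_scp and the example policy name are hypothetical, while create_policy and attach_policy are the Organizations API calls that correspond to the Create policy and Attach policy console actions.

```python
def create_and_attach_scp(org_client, policy_json, ou_id,
                          name="budget-guardrail-protection",
                          description="Protects the budget, SNS topics, and Lambda function"):
    """Create a service control policy and attach it to one OU.

    org_client is expected to behave like boto3.client("organizations").
    The default name and description are placeholders for illustration.
    """
    created = org_client.create_policy(
        Content=policy_json,
        Description=description,
        Name=name,
        Type="SERVICE_CONTROL_POLICY",
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    # TargetId accepts an OU ID such as the one shown in the Organizations console.
    org_client.attach_policy(PolicyId=policy_id, TargetId=ou_id)
    return policy_id


# Usage (requires boto3 and credentials for the management account):
#   import boto3
#   create_and_attach_scp(boto3.client("organizations"), policy_text, "ou-xxxx-xxxxxxxx")
```

Passing the client in as an argument keeps the function easy to exercise against a stub before you run it against a real organization.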
Step 3: Create your CloudFormation stack set
Now we can create our stack set. The actual CloudFormation stack is below. Review it carefully before deploying, and note these sections:
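- The WarningTopic and CriticalTopic resources, together with their topic policies, create the SNS topics for warnings and alerts.
- The Budget resource creates the actual budget and thresholds.
- The remaining resources (BudgetLambdaExecutionRole through BudgetLambdaFunctionPermission) create the Lambda function and its subscription to the critical notification topic.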
---
AWSTemplateFormatVersion: '2010-09-09'
Description: Stack that creates an AWS budget, notifications, and a Lambda function that will shut down EC2 instances
Parameters:
  BudgetAmount:
    Type: Number
    Description: Maximum permissible spend for the month
  Email:
    Type: String
    Description: Email address to deliver notifications to
  WarningThreshold:
    Type: Number
    Description: Percentage of forecast monthly spend for the warning notification
    Default: 80
  CriticalThreshold:
    Type: Number
    Description: Percentage of forecast monthly spend for the critical notification
    Default: 100
  ShutdownExemptionTagKey:
    Type: String
    Description: Key name to exempt from auto-shutdown
    Default: "instance-class"
  ShutdownExemptionTagValue:
    Type: String
    Description: Value of key name tag to exempt from auto-shutdown
    Default: "critical"
Outputs:
  BudgetId:
    Value: !Ref Budget
Resources:
  WarningTopic:
    Type: AWS::SNS::Topic
  WarningTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sns:Publish
            Resource: "*"
            Principal:
              Service: budgets.amazonaws.com
      Topics:
        - !Ref WarningTopic
  CriticalTopic:
    Type: AWS::SNS::Topic
  CriticalTopicPolicy:
    Type: AWS::SNS::TopicPolicy
    Properties:
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Action: sns:Publish
            Resource: "*"
            Principal:
              Service: budgets.amazonaws.com
      Topics:
        - !Ref CriticalTopic
  Budget:
    Type: AWS::Budgets::Budget
    Properties:
      Budget:
        BudgetLimit:
          Amount: !Ref BudgetAmount
          Unit: USD
        TimeUnit: MONTHLY
        BudgetType: COST
      NotificationsWithSubscribers:
        - Notification:
            NotificationType: FORECASTED
            ComparisonOperator: GREATER_THAN
            Threshold: !Ref WarningThreshold
          Subscribers:
            - SubscriptionType: EMAIL
              Address: !Ref Email
            - SubscriptionType: SNS
              Address: !Ref WarningTopic
        - Notification:
            NotificationType: FORECASTED
            ComparisonOperator: GREATER_THAN
            Threshold: !Ref CriticalThreshold
          Subscribers:
            - SubscriptionType: EMAIL
              Address: !Ref Email
            - SubscriptionType: SNS
              Address: !Ref CriticalTopic
  BudgetLambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: BudgetLambdaExecutionRolePolicy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - logs:CreateLogGroup
                  - logs:CreateLogStream
                  - logs:PutLogEvents
                Resource: arn:aws:logs:*:*:log-group:/aws/lambda/*-BudgetLambdaFunction-*:*
              # ec2:DescribeInstances does not support resource-level permissions, so it needs "*"
              - Effect: Allow
                Action:
                  - ec2:DescribeInstances
                Resource: "*"
              - Effect: Allow
                Action:
                  - ec2:StopInstances
                Resource: arn:aws:ec2:*:*:instance/*
  BudgetLambdaFunction:
    Type: AWS::Lambda::Function
    Properties:
      Description: Lambda function to be called after a critical budget threshold has been exceeded
      Handler: index.lambda_handler
      ReservedConcurrentExecutions: 1
      Role: !GetAtt BudgetLambdaExecutionRole.Arn
      Runtime: python3.12
      Timeout: 20
      Environment:
        Variables:
          ShutdownExemptionTagKey: !Ref ShutdownExemptionTagKey
          ShutdownExemptionTagValue: !Ref ShutdownExemptionTagValue
      Code:
        ZipFile: |
          import boto3
          import os

          def lambda_handler(event, context):
              ec2 = boto3.resource('ec2')
              exemption_tag_key = os.environ.get("ShutdownExemptionTagKey")
              # Tag values are lowercased below, so lowercase the configured value too
              exemption_tag_value = os.environ.get("ShutdownExemptionTagValue", "").lower()
              # Build the list per invocation; a module-level list would persist across warm starts
              instances = []
              # Only running instances need to be stopped
              running = ec2.instances.filter(
                  Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
              for instance in running:
                  print('Found instance: {}, checking exemption tag value...'.format(instance.id))
                  instance_class = 'undefined'
                  # instance.tags is None when an instance has no tags
                  for t in instance.tags or []:
                      if t["Key"] == exemption_tag_key:
                          instance_class = t["Value"].lower()
                  print(f"Exemption tag value is: {instance_class}")
                  if instance_class != exemption_tag_value:
                      instances.append(instance.id)
              if len(instances) > 0:
                  print('Calling shutdown API for all discovered instance IDs')
                  response = ec2.instances.filter(InstanceIds=instances).stop()
                  print('Raw response from shutdown API:')
                  print(str(response))
              return True
  CriticalTopicSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      Protocol: lambda
      TopicArn: !Ref CriticalTopic
      Endpoint: !GetAtt BudgetLambdaFunction.Arn
  BudgetLambdaFunctionPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref BudgetLambdaFunction
      Principal: sns.amazonaws.com
      SourceArn: !Ref CriticalTopic
Deployment of this stack set is best managed using CloudFormation service-managed permissions, as this enables the automatic deployment and removal of stacks as accounts are added to OUs in AWS Organizations. Many of the options, as well as the use of features such as delegated administrator accounts, are at your discretion.
Figure 8: Selecting Service-managed permission as Stackset permission
Note that the required OU ID is available within the Organizations console. You must copy this value into the AWS OU ID field when deploying your stack set. Up to 10 OUs can be specified per stack set, and accounts in OUs contained therein will inherit the deployment from the parent OU.
Figure 9: Finding the OU ID
Figure 10: Specifying target OU for deployment
Clicking the Next button will show the review screen. You must acknowledge that AWS CloudFormation will create IAM resources. Clicking Submit will create the stack set and deploy the stack components to the accounts under the target OU. While an AWS budget is global, the actual stack can only be deployed to a single region.
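For teams that prefer to script the deployment, the console flow above maps onto two CloudFormation API calls, create_stack_set and create_stack_instances. The sketch below assumes a boto3 CloudFormation client created in the management account; the helper names stack_set_request and deploy_stack_set are hypothetical, and the argument values you pass (stack set name, OU ID, region) are your own.

```python
def stack_set_request(name, template_body, budget_amount, email):
    """Build the create_stack_set arguments for a service-managed stack set."""
    return {
        "StackSetName": name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "BudgetAmount", "ParameterValue": str(budget_amount)},
            {"ParameterKey": "Email", "ParameterValue": email},
        ],
        # The template creates an IAM role, which must be acknowledged explicitly.
        "Capabilities": ["CAPABILITY_IAM"],
        "PermissionModel": "SERVICE_MANAGED",
        # Deploy to (and remove from) accounts automatically as they join or leave the OU.
        "AutoDeployment": {"Enabled": True, "RetainStacksOnAccountRemoval": False},
    }


def deploy_stack_set(cfn_client, request, ou_ids, region):
    """Create the stack set, then deploy stack instances to the given OUs in one region."""
    if not 1 <= len(ou_ids) <= 10:
        raise ValueError("specify between 1 and 10 OU IDs")
    cfn_client.create_stack_set(**request)
    response = cfn_client.create_stack_instances(
        StackSetName=request["StackSetName"],
        DeploymentTargets={"OrganizationalUnitIds": list(ou_ids)},
        Regions=[region],
    )
    return response["OperationId"]


# Usage (requires boto3 and credentials for the management account):
#   import boto3
#   req = stack_set_request("budget-guardrail", open("template.yaml").read(), 500, "team@example.com")
#   deploy_stack_set(boto3.client("cloudformation"), req, ["ou-xxxx-xxxxxxxx"], "us-east-1")
```

Only BudgetAmount and Email are supplied here because the other template parameters have defaults; add further ParameterKey entries if you want to override the thresholds or the exemption tag.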
Operating without shutting-down EC2 instances
This solution works well as a notification and monitoring tool, so you can easily deploy it without the automatic shutdown of EC2 instances. This can be achieved in two ways. You can comment out every resource in the CloudFormation stack that follows the Budget resource (BudgetLambdaExecutionRole onward), which leaves the budget capability in place as-is, but no Lambda execution will take place. Alternatively, the CloudFormation template lets you set a tag for EC2 instances that must be exempted from shutdown. EC2 instances that carry the tag key and tag value specified as the CloudFormation template parameters will be exempted from shutdown. The defaults are set by the ShutdownExemptionTagKey and ShutdownExemptionTagValue parameters as instance-class/critical. This can be modified to a key/value pair that your organization follows in order to tag critical instances.
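The exemption rule can be expressed as a small, testable function. This is a sketch of the rule as described above rather than the exact code in the template: the name should_stop is hypothetical, and the tag list uses the Key/Value shape that the EC2 API returns.

```python
def should_stop(tags, exemption_key="instance-class", exemption_value="critical"):
    """Return True when an instance is not exempt from shutdown.

    tags is a list of {"Key": ..., "Value": ...} dicts, or None for an untagged instance.
    Tag keys are matched exactly; tag values are compared case-insensitively.
    """
    for tag in tags or []:
        if tag["Key"] == exemption_key and tag["Value"].lower() == exemption_value.lower():
            return False
    return True


# Examples using the default instance-class/critical pair.
exempt = should_stop([{"Key": "instance-class", "Value": "Critical"}])
untagged = should_stop(None)
other_tag = should_stop([{"Key": "team", "Value": "critical"}])
```

Note that an untagged instance, or one that only carries unrelated tags, is treated as safe to stop, so any instance that must survive a budget breach needs the exemption tag applied in advance.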
Note: The provided CloudFormation stack utilizes the “Forecasted” value to trigger the notification. If the target account is new, then it generally takes some time (typically a few days to a week) for the cost management tool to generate a “forecast” value.
Extending the solution and next steps
This CloudFormation stack is a good starting place for many customers, and it can be extended to perform any number of actions in your environment based on your needs. Below are some common alterations that may be useful as you implement this solution.
First, the EC2 shutdown script can be easily extended to include Amazon RDS instances, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon SageMaker notebooks, or any number of other running resources. We presented EC2 here, as it is ubiquitous, and it is a good reference for your controls. Any actions you can script with Lambda are available to you.
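As one example of such an extension, the sketch below stops Amazon RDS instances that do not carry the exemption tag. It is an assumption-laden illustration, not part of the published stack: the function name stop_noncritical_rds is hypothetical, it assumes standalone (non-Aurora) DB instances, and the Lambda execution role would additionally need the rds:DescribeDBInstances and rds:StopDBInstance permissions.

```python
def stop_noncritical_rds(rds_client, exemption_key="instance-class", exemption_value="critical"):
    """Stop every available RDS instance that is not tagged as exempt.

    rds_client is expected to behave like boto3.client("rds").
    Returns the identifiers of the instances for which a stop was requested.
    """
    stopped = []
    paginator = rds_client.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for db in page["DBInstances"]:
            # Only instances in the "available" state can be stopped.
            if db.get("DBInstanceStatus") != "available":
                continue
            tags = {t["Key"]: t["Value"] for t in db.get("TagList", [])}
            if tags.get(exemption_key, "").lower() == exemption_value.lower():
                continue
            rds_client.stop_db_instance(DBInstanceIdentifier=db["DBInstanceIdentifier"])
            stopped.append(db["DBInstanceIdentifier"])
    return stopped


# Usage inside the Lambda handler (requires boto3):
#   import boto3
#   stop_noncritical_rds(boto3.client("rds"))
```

Keep in mind that a stopped RDS instance is started again automatically after seven days, so this is a pause rather than a permanent cost control.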
You may wish to have different budget thresholds for each AWS account. You can accomplish this in two ways. One is to modify Statement1 of the SCP to permit another IAM user or role to change the thresholds, and then have that person or role change the threshold after the stack has been created. Another option is to use Systems Manager Parameter Store to keep new threshold values, reference them in the stack set, and then update the stack. This page details embedding parameters from the Systems Manager Parameter Store.
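For the Parameter Store option, the values first have to be written into each account. A minimal sketch of that step follows, assuming a boto3 Systems Manager client for the target account; the function name publish_thresholds and the /budget-guardrail parameter prefix are hypothetical choices for illustration, and how you then reference the parameters from the stack set is left to the guidance linked above.

```python
def publish_thresholds(ssm_client, thresholds, prefix="/budget-guardrail"):
    """Write threshold percentages to Parameter Store and return the parameter names.

    ssm_client is expected to behave like boto3.client("ssm").
    thresholds maps a short name (for example "warning") to a percentage.
    """
    names = []
    for key, value in sorted(thresholds.items()):
        if value <= 0:
            raise ValueError(f"threshold {key!r} must be a positive percentage")
        name = f"{prefix}/{key}"
        # Overwrite=True lets the same call update an existing value later.
        ssm_client.put_parameter(Name=name, Value=str(value), Type="String", Overwrite=True)
        names.append(name)
    return names


# Usage (requires boto3 and credentials for the target account):
#   import boto3
#   publish_thresholds(boto3.client("ssm"), {"warning": 80, "critical": 150})
```

Storing the thresholds as plain String parameters keeps them readable in the console and easy to change per account without editing the template itself.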
The approach utilized here is fully compatible with AWS Control Tower and the Customizations for Control Tower solution. This solution provides a convenient way to manage the deployment of the service control policy and the stack, all in one place. Likewise, updating the stack and SCP is conducted easily through the pipeline provided by this solution.
In conclusion, a programmatic approach to controlling developer account costs is straightforward and requires little effort to manage. We recommend that all customers use AWS Budgets wherever possible in order to maintain observability of their cloud consumption, and consider an automatic shutdown mechanism as an evolved way to enforce their own cost control measures.