AWS Cloud Operations Blog
Identifying resources with the most configuration changes using AWS Config
AWS Config tracks changes made to supported resources and records them as configuration items (CIs), which are JSON files delivered to an Amazon S3 bucket. These are delivered in 6-hour intervals, as configuration history files. Each file contains details about the resources that changed in that 6-hour period, for the respective resource types, such as AWS::EC2::Instance or AWS::EC2::Volume. If no configuration changes occur, no file is delivered.
With AWS Config, you’re charged based on the number of CIs recorded, and the number of active AWS Config rule evaluations or the number of conformance pack evaluations in your account. If you notice a sudden spike in the number of CIs recorded, you may want to identify the cause behind the spike. You may also want to know the number of resources that underwent the configuration changes for a desired time period.
This post provides you with a solution to extract information to audit the configuration changes in your account. The Amazon Athena integration with AWS Config can then be used to get the count of the CIs, per resource type and resource ID, for your desired time period. The following diagram illustrates this architecture.
Figure 1. Overview
For this post, you use Athena to identify the number of CIs, per month.
- The user saves the provided JSON template in a file.
- The user launches the CloudFormation (CFN) stack using the provided template. The user credentials should have permissions to create and run Athena queries and to create CFN stacks. The CFN stack creates two Athena queries as saved queries.
- The user navigates to the Athena console and runs the first query to create a table in the default database, or a database of their choice, in the primary workgroup. This table is created from the AWS Config history files in the S3 bucket configured for the delivery channel of the configuration recorder.
- Once the table is created, the user runs the second Athena query, which counts the number of CIs per resource type and resource ID in the given time period.
- A table is displayed with three columns: “resourcetype”, “resourceid”, and “NumberOfChanges”.
The user can further modify this second query from the Athena Query execution console to extract any other details according to the use case.
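To make the counting step concrete, the following is a minimal Python sketch of what the second query computes, run locally on already-parsed history files instead of in Athena. The function name count_changes and the sample resource IDs are hypothetical; the field names are the ones used in the table definition in this post. It mirrors the query's inclusive string comparison on the capture time.

```python
from collections import Counter


def count_changes(history_files, start, end):
    """Count CIs per (resourceType, resourceId) whose capture time is in [start, end]."""
    counts = Counter()
    for history in history_files:
        # Equivalent of CROSS JOIN UNNEST(configurationitems): visit every CI in every file.
        for item in history.get("configurationItems", []):
            captured = item["configurationItemCaptureTime"]
            if start <= captured <= end:
                counts[(item["resourceType"], item["resourceId"])] += 1
    # most_common() orders by count, highest first, like ORDER BY NumberOfChanges DESC.
    return [(rtype, rid, n) for (rtype, rid), n in counts.most_common()]


# Hypothetical sample data shaped like two configuration history files.
sample_files = [
    {"fileVersion": "1.0", "configurationItems": [
        {"resourceType": "AWS::EC2::Instance", "resourceId": "i-0example1",
         "configurationItemCaptureTime": "2020-06-02T10:00:00.000Z"},
        {"resourceType": "AWS::EC2::Volume", "resourceId": "vol-0example1",
         "configurationItemCaptureTime": "2020-06-02T10:05:00.000Z"},
    ]},
    {"fileVersion": "1.0", "configurationItems": [
        {"resourceType": "AWS::EC2::Instance", "resourceId": "i-0example1",
         "configurationItemCaptureTime": "2020-06-15T16:30:00.000Z"},
        {"resourceType": "AWS::EC2::Instance", "resourceId": "i-0example1",
         "configurationItemCaptureTime": "2020-07-03T09:00:00.000Z"},
    ]},
]

changes = count_changes(sample_files, "2020-06-01T00:00:00.000Z", "2020-06-30T23:59:59.999Z")
for resource_type, resource_id, number_of_changes in changes:
    print(resource_type, resource_id, number_of_changes)
```

Because the capture times are ISO 8601 strings in the same format, comparing them as strings orders them chronologically, which is why the sketch (like the query) needs no date parsing.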
Prerequisites
Before following along with this blog post, make sure that you have the query result location set up with an S3 bucket. This bucket stores the query results for your default workgroup. For instructions, see Create a Database and steps 1–5 of Getting Started with Amazon Athena.
To avoid potential issues, add a trailing slash (/) at the end of the S3 bucket name you provide.
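If you set the query result location from a script, a small helper can apply the trailing slash consistently. This is only a sketch: the function name normalize_result_location and the bucket names are hypothetical.

```python
def normalize_result_location(location: str) -> str:
    """Return an s3:// URI for the query result location with exactly one trailing slash."""
    # Accept either a bare bucket name or a full s3:// URI.
    path = location[len("s3://"):] if location.startswith("s3://") else location
    path = path.strip("/")
    if not path:
        raise ValueError("an S3 bucket name is required")
    return f"s3://{path}/"


print(normalize_result_location("my-athena-results"))
print(normalize_result_location("s3://my-athena-results/queries//"))
```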
Creating your resources
To prepare to run Athena queries, you use an AWS CloudFormation template.
- Create a file with the following JSON template:
{
  "Parameters": {
    "ConfigS3Bucket": {
      "Description": "The S3 bucket that stores AWS Config information and files.",
      "Type": "String"
    },
    "DataBase": {
      "Description": "The database in Athena in which you would like to create the table",
      "Type": "String",
      "Default": "default"
    },
    "CreateQueryName": {
      "Description": "Name of the table creation query",
      "Type": "String",
      "Default": "ConfigTableCreation"
    },
    "SelectQueryName": {
      "Description": "Name of the information extraction query",
      "Type": "String",
      "Default": "ConfigItemCountQuery"
    },
    "ConfigCaptureStartTime": {
      "Description": "The start of the time range for which configuration item capture times are checked",
      "Type": "String",
      "Default": "2020-06-01T%"
    },
    "ConfigCaptureEndTime": {
      "Description": "The end of the time range for which configuration item capture times are checked",
      "Type": "String",
      "Default": "2020-06-30T%"
    }
  },
  "Resources": {
    "AthenaNamedQuery": {
      "Type": "AWS::Athena::NamedQuery",
      "Properties": {
        "Database": { "Ref": "DataBase" },
        "Description": "A query that creates the awsconfig table from the AWS Config files in S3",
        "Name": { "Ref": "CreateQueryName" },
        "QueryString": {"Fn::Sub": "CREATE EXTERNAL TABLE awsconfig ( fileversion string, configSnapshotId string, configurationitems ARRAY < STRUCT < configurationItemVersion : STRING, configurationItemCaptureTime : STRING, configurationStateId : BIGINT, awsAccountId : STRING, configurationItemStatus : STRING, resourceType : STRING, resourceId : STRING, resourceName : STRING, ARN : STRING, awsRegion : STRING, availabilityZone : STRING, configurationStateMd5Hash : STRING, resourceCreationTime : STRING > > ) ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe' LOCATION 's3://${ConfigS3Bucket}/AWSLogs/${AWS::AccountId}/Config/${AWS::Region}/';"}
      }
    },
    "AthenaSelectQuery": {
      "Type": "AWS::Athena::NamedQuery",
      "Properties": {
        "Database": { "Ref": "DataBase" },
        "Description": "A query that counts configuration items per resource type and resource ID",
        "Name": { "Ref": "SelectQueryName" },
        "QueryString": {"Fn::Sub": "SELECT configurationItem.resourceType, configurationItem.resourceId, COUNT(configurationItem.resourceId) AS NumberOfChanges FROM ${DataBase}.awsconfig CROSS JOIN UNNEST(configurationitems) AS t(configurationItem) WHERE \"$path\" LIKE '%ConfigHistory%' AND configurationItem.configurationItemCaptureTime >= '${ConfigCaptureStartTime}' AND configurationItem.configurationItemCaptureTime <= '${ConfigCaptureEndTime}' GROUP BY configurationItem.resourceType, configurationItem.resourceId ORDER BY NumberOfChanges DESC"}
      }
    }
  }
}
The CloudFormation template only creates the two Athena queries as saved queries in your Athena console. To view the results, you must run both queries manually from the console: one for table creation and the other for the CI count per resource type and resource ID.
- On the AWS CloudFormation console, choose Create stack in the Region where your Athena and AWS Config setup exists.
- Choose With new resources (standard).
- Choose Upload a template file and choose the file you saved.
For more information about using stack templates, see Selecting a stack template.
- Choose Next.
- Provide the following parameters (you can revise them depending on your use case):
- ConfigCaptureEndTime – The end time of the period for which you’re capturing CI information. The default value is 2020-06-30T% (June 30, 2020). The Athena query compares this value against the configurationItemCaptureTime field of each CI.
- ConfigCaptureStartTime – The start time of the period for which you’re capturing CI information. The default value is 2020-06-01T% (June 1, 2020). The Athena query compares this value against the configurationItemCaptureTime field of each CI.
- ConfigS3Bucket – The S3 bucket that stores the AWS Config snapshot and AWS Config history files (the delivery channel). For example, aws-config-bucket.
- CreateQueryName – The name of the table creation query. The default value is ConfigTableCreation.
- DataBase – The database in Athena in which you create the table awsconfig to run the queries. The default value is default. It uses the default database to create the table, if not specified otherwise.
- SelectQueryName – The name of the information extraction query. The default value is ConfigItemCountQuery.
- Choose Next.
- In the next section, provide an AWS Identity and Access Management (IAM) role with the permissions to create Athena resources in the stack.
Make sure the IAM entity used to create the stack has the appropriate permissions for:
- Athena – see AmazonAthenaFullAccess Managed Policy.
- AWS CloudFormation – see Controlling access with AWS Identity and Access Management.
- Amazon S3 (for the AWS Config delivery channel) – see Bucket owner granting its users bucket permissions.
- If you’re using cross-account S3 buckets as your delivery channel, make sure to provide cross-account Amazon S3 permissions for your IAM entity. For more information, see Create a bucket, a IAM user, and add a bucket policy granting IAM user permissions.
- Choose Next.
- Choose Create stack.
After stack creation, you should see two resources, AthenaNamedQuery and AthenaSelectQuery, with the stack status CREATE_COMPLETE.
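If you prefer to script the console steps above, the following sketch shows one way to do it with the AWS SDK for Python (boto3). It assumes boto3 is installed and credentials are configured; the helper names, the stack name, and the file name in the usage note are hypothetical. Only the parameter-building helper runs here; create_stack is defined but not called.

```python
import json


def build_stack_parameters(values: dict) -> list:
    """Convert {name: value} into the list-of-dicts shape CloudFormation expects."""
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in sorted(values.items())
    ]


def create_stack(stack_name: str, template_path: str, values: dict) -> str:
    """Create the stack and return its stack ID (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    with open(template_path) as handle:
        template_body = handle.read()
    json.loads(template_body)  # fail early if the saved file is not valid JSON

    client = boto3.client("cloudformation")
    response = client.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=build_stack_parameters(values),
    )
    return response["StackId"]


# Parameters that are left out fall back to the defaults in the template.
params = build_stack_parameters({
    "ConfigS3Bucket": "aws-config-bucket",
    "ConfigCaptureStartTime": "2020-06-01T%",
    "ConfigCaptureEndTime": "2020-06-30T%",
})
print(params)
```

A call such as create_stack("config-ci-count", "template.json", {"ConfigS3Bucket": "aws-config-bucket"}) would then launch the stack in the Region of your configured session.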
Running queries
After you launch the CloudFormation stack, you can run queries to get the count results.
- Open the Athena console in the same Region (for example, us-east-1) where you created the CloudFormation stack.
- On the Saved queries page, select ConfigTableCreation.
- In the query editor, choose Run query.
This creates a table named “awsconfig” in your default database.
For this post, the S3 bucket path in the table creation query is:
s3://ConfigS3Bucket/AWSLogs/Current_stack_Account_ID/Config/current_stack_region/
You can change the account ID and Region according to your use case.
For example:
s3://aws-config-bucket/AWSLogs/123456789012/Config/us-east-1/
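The path pattern above can be expressed as a small helper, which may be handy if you edit the table location for several accounts or Regions. This is a sketch only; the function name config_history_location is hypothetical and the values are the example ones from this post.

```python
def config_history_location(bucket: str, account_id: str, region: str) -> str:
    """Build the S3 location used in the table creation query."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("an AWS account ID is a 12-digit number")
    return f"s3://{bucket}/AWSLogs/{account_id}/Config/{region}/"


location = config_history_location("aws-config-bucket", "123456789012", "us-east-1")
print(location)
```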
- On the Saved queries page, select ConfigItemCountQuery.
- In the query editor, choose Run query.
This provides the CI counts per resource type and resource ID.
You can modify the value for configurationItemCaptureTime in the query as per your requirements.
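When you adjust these values, keep in mind that the query compares configurationItemCaptureTime to the bounds with >= and <=, that is, as plain strings, so the % character is not treated as a wildcard there. The sketch below reproduces that comparison in Python, on the assumption that Athena orders these ASCII strings character by character as Python does. The function name in_range and the sample timestamps are hypothetical.

```python
def in_range(captured: str, start: str, end: str) -> bool:
    """Mirror the query's WHERE clause: inclusive string comparison on both bounds."""
    return start <= captured <= end


start, end = "2020-06-01T%", "2020-06-30T%"

# '%' sorts before every digit, so a time on the start date is at or after the start bound...
first_day = in_range("2020-06-01T00:00:00.000Z", start, end)
# ...but a time on the end date sorts after the end bound and is filtered out.
last_day = in_range("2020-06-30T08:15:00.000Z", start, end)

# One alternative: an exclusive upper bound on the first day of the next month.
whole_month = "2020-06-01" <= "2020-06-30T08:15:00.000Z" < "2020-07-01"

print(first_day, last_day, whole_month)
```

Under that assumption, CIs captured on the end date itself fall outside the default range. If you want the whole month, one option is to set the end value to the first day of the following month and change the operator to <.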
When you compare the total number of CIs between the Athena query results and AWS billing data for the same month and Region, a discrepancy can occur. The data that Athena queries can cross day boundaries and include CIs billed in adjacent months. AWS Config CIs are metered based on when the configurationItemCaptureTime was initiated.
Setting up AWS billing alarms
In addition, you can monitor your estimated AWS charges using Amazon CloudWatch and create a billing alarm. The billing alarm triggers if your account billing exceeds the threshold you specify. For instructions, see Creating a Billing Alarm to monitor your estimated AWS charges.
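As a rough sketch of the same alarm created from code, the following builds the settings for a CloudWatch alarm on the EstimatedCharges metric in the AWS/Billing namespace. The function names, the alarm name, the threshold, and the SNS topic ARN are hypothetical. It assumes boto3, configured credentials, and that billing alerts are enabled for the account. Only the settings helper runs here; create_billing_alarm is defined but not called.

```python
def billing_alarm_settings(threshold_usd: float, sns_topic_arn: str) -> dict:
    """Return keyword arguments for CloudWatch put_metric_alarm on estimated charges."""
    return {
        "AlarmName": "estimated-charges-above-threshold",  # hypothetical name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_billing_alarm(threshold_usd: float, sns_topic_arn: str) -> None:
    """Create the alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the helper above works without boto3 installed

    # Billing metric data is stored in the US East (N. Virginia) Region.
    client = boto3.client("cloudwatch", region_name="us-east-1")
    client.put_metric_alarm(**billing_alarm_settings(threshold_usd, sns_topic_arn))


settings = billing_alarm_settings(50.0, "arn:aws:sns:us-east-1:123456789012:billing-alerts")
print(settings["MetricName"], settings["Threshold"])
```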
Conclusion
You can run the preceding queries to get the CI count per resource type and per resource ID, and use it to identify the top talkers behind your AWS Config service cost. You can also add more customizations to your Athena query according to your use case. For more information, see How to query your AWS resource configuration states using AWS Config and Amazon Athena.
Combining the benefits of Athena with AWS Config provides a way to identify and manage configuration changes for your AWS resources. We hope this post has demonstrated this to you and helps you manage your AWS resources.
About the Authors
Sushma is a Cloud Support Engineer at AWS. She is a multi-domain enthusiast with vast experience in the field of cybersecurity and is passionate about coding solutions to automate processes. She is motivated to come up with ideas that simplify solutions and to develop articles with gotchas for customers. Outside of work, she is a painter and dancer who enjoys a good Netflix binge but can also be found on long bike rides on hilly country roads.
Attalla is a Cloud Support Engineer. He specializes in AWS Config and AWS logging and monitoring services. Attalla is passionate about building secure and effective solutions for AWS customers. He enjoys deep dives into compliance-related challenges, writing articles, and building tutorials. Outside of work, Attalla’s favorite activities include snorkeling and reading. He loves spending time with his daughters.
Shreejesh is a Cloud Support Engineer. Learning is his passion, and he holds various AWS certifications and a Master’s in Business Administration. He provides customers with technical guidance to implement networking and security best practices. Outside of work, Shreejesh plays badminton and spends time with his family.
Deren is a System Engineer for AWS Managed Services (AMS). He likes tackling difficult problems, developing scalable and automated solutions, and analyzing datasets to improve processes. Outside of work, he enjoys playing board games with friends and family. He also loves pizza, even if there is pineapple on it.