AWS Cloud Operations & Migrations Blog

Automate insights for your EC2 fleets across AWS accounts and regions

Introduction

Gaining insights into and managing a large Amazon Elastic Compute Cloud (Amazon EC2) fleet that is spread across multiple accounts and regions can be challenging. It’s crucial to have a quick and efficient method to identify which instances are managed by AWS Systems Manager (SSM) and to gather detailed information about the instances that are not under SSM’s management. Gathering this detail manually is tedious.

In this blog post, we will discuss how Delhivery (a fully-integrated logistics provider in India) collaborated with AWS Enterprise Support to co-create a cost-effective, scalable and robust automated solution for discovering EC2 instance details. We will walk you through the process of setting up AWS Lambda and Systems Manager based automation to gain visibility into your EC2 instances. By leveraging this combination, you can quickly and easily generate comprehensive reports on your EC2 instances, including essential details such as instance ID, operating system (OS) name, OS version, and whether the instance is managed by SSM. The procedures outlined here are designed to support customers whose instances span multiple regions and accounts, including accounts within an organization in AWS Organizations.

Customer Summary

Delhivery is India’s largest fully integrated logistics provider. Their mission is to help customers operate flexible, reliable, and resilient supply chains at the lowest cost. They are also building an operating system for e-commerce by combining world-class infrastructure, high-quality logistics operations, cutting-edge engineering and technology capabilities.

Business Challenge

As an enterprise customer with numerous AWS accounts, Delhivery faced the following challenges:

  1. Efficiently managing a large fleet of EC2 instances across multiple AWS accounts and regions to ensure timely patching and upgrades of underlying OS to mitigate security and operational risks.
  2. Reducing manual efforts required to identify EC2 instances not managed by SSM and enabling them for upgrades, patching, and other maintenance activities.

To address these challenges, Delhivery engaged their Technical Account Manager (TAM) and premium support subject matter experts to develop a tailored solution. The goal was to automate the process of obtaining operating system details across their AWS Organizations, facilitating efficient patch management, OS upgrades, and identification of unmanaged EC2 instances. Through collaboration with enterprise support, they successfully built an automated solution that met their requirements.

Prerequisites

  • An Amazon Simple Storage Service (Amazon S3) bucket to store the report. The S3 bucket should be created in the same account where the AWS CloudFormation template is deployed (Step 1).
  • We recommend that you create the S3 bucket ahead of implementation.

Solution overview

The implementation of this automated solution has provided Delhivery with several key benefits, as outlined below.

  1. It simplified the process of obtaining operating system details across their entire organization, enabling streamlined patch management and timely OS upgrades. This not only improved system security but also ensured compliance with end-of-life support for operating systems.
  2. Identifying the number of instances that are not managed via Systems Manager helped Delhivery determine the actions required to bring these instances under Systems Manager management. Once managed, they can leverage the various capabilities of Systems Manager such as Patch Manager, Run Command, Session Manager, and more.
  3. Delhivery created an organization-wide EC2 inventory, enabling them to upgrade old instances, manage spot instances, and utilize tag-based filters. The inventory improved operational efficiency and decision-making.
  4. Delhivery mapped the End-of-Life (EOL) dates for the different operating systems. This analysis provided Delhivery with insights into the support timelines for each operating system, enabling them to plan and schedule timely upgrades accordingly.

You can use this solution to obtain a list of all your EC2 instance details within a few minutes, spread across multiple regions and accounts. This allows customers to conveniently find and verify the details of their instances.

This process involves an AWS Lambda function in the delegated account or management account assuming a role named “ssmLambdaRole” in each account of the organization. The Lambda function executes the necessary AWS API calls, including ec2:DescribeInstances, ec2:DescribeImages, ec2:DescribeRegions and ssm:DescribeInstanceInformation, to gather the required details of the instances. Once the details are collected, the Lambda function uploads a CSV report to an S3 bucket. Any member account of an organization can be designated as the delegated account.
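The collection flow described above can be sketched roughly as follows. This is a minimal illustration and not the code shipped in the CloudFormation template: the helper names (`assume_member_role`, `merge_instance_details`, `rows_to_csv`), the CSV column names, and the session name are assumptions. Only the role name and the API calls come from the description above.

```python
import csv
import io

# Hypothetical column layout; the real report may use different names.
FIELDS = ["AccountId", "Region", "InstanceId", "OsName", "OsVersion", "Source"]


def assume_member_role(account_id, role_name="ssmLambdaRole"):
    """Return temporary credentials for the role in a member account."""
    import boto3  # imported lazily so the pure helpers below work without it

    response = boto3.client("sts").assume_role(
        RoleArn=f"arn:aws:iam::{account_id}:role/{role_name}",
        RoleSessionName="ec2-fleet-report",  # hypothetical session name
    )
    return response["Credentials"]


def merge_instance_details(account_id, region, reservations, ssm_infos):
    """Combine DescribeInstances and DescribeInstanceInformation output.

    `reservations` is the "Reservations" list from ec2:DescribeInstances and
    `ssm_infos` is the "InstanceInformationList" from
    ssm:DescribeInstanceInformation.
    """
    managed = {info["InstanceId"]: info for info in ssm_infos}
    rows = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            info = managed.get(instance["InstanceId"])
            rows.append({
                "AccountId": account_id,
                "Region": region,
                "InstanceId": instance["InstanceId"],
                # SSM reports the OS name and version; EC2 alone only
                # exposes a coarse platform string.
                "OsName": info.get("PlatformName", "") if info
                else instance.get("PlatformDetails", ""),
                "OsVersion": info.get("PlatformVersion", "") if info else "",
                "Source": "SSM" if info else "EC2",
            })
    return rows


def rows_to_csv(rows):
    """Serialize the merged rows into CSV text ready for upload to S3."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Keeping the merge step free of AWS calls makes it easy to check against recorded API responses before running it across a whole organization.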

Architecture diagram showing how the Lambda function collects data from multiple accounts and stores it in an S3 bucket, where it can later be processed by QuickSight and Athena

Figure 1. Architectural design

Step 1: Create an AWS Identity and Access Management (IAM) role named LambdaSsmOsDetailRole and a Lambda function named LambdaSsmOsFunction in the delegated account or management account

Deploy the following CloudFormation template Step1.yaml. See Creating a stack on the AWS CloudFormation console for more details.

Download the Step1.yaml CloudFormation template. This stack will create an IAM role “LambdaSsmOsDetailRole” and policy “LambdaSsmOsDetailPolicy” with the specified permissions to list AWS accounts in the organization, perform S3 PutObject operations, and assume the ssmLambdaRole role. Please make sure to specify the S3 bucket where you want to upload the final report.

If you have deployed the Step 1 template into the management account, proceed to Step 3. Otherwise, continue to Step 2.

Step 2: (Optional) Create IAM role for delegated account

Note: This step is optional and is needed only when the report is collected from an account in the organization that is not the management account.

If you want to collect the report in the management account of the organization, proceed to Step 3.

Deploy the following CloudFormation template Step2.yaml. For details on deploying a CloudFormation template, see Creating a stack on the AWS CloudFormation console. Download the Step2.yaml CloudFormation template.

The stack will create an IAM role “ManagementOrganizationRole” and policy “ManagementOrganizationPolicy” with the specified permissions to list AWS accounts in the organization. This role is assumed by the Lambda function in the delegated account, which collects the data and generates the report.
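As a rough sketch of how a function running in a delegated account could use this role, the snippet below assumes ManagementOrganizationRole and pages through the organization's accounts. The helper names, the session name, and the choice to keep only ACTIVE accounts are assumptions made for illustration; the deployed Lambda function may behave differently.

```python
def management_role_arn(management_account_id,
                        role_name="ManagementOrganizationRole"):
    """Build the ARN of the role created by Step2.yaml."""
    return f"arn:aws:iam::{management_account_id}:role/{role_name}"


def active_account_ids(accounts):
    """Keep only ACTIVE accounts from organizations:ListAccounts output.

    Skipping suspended accounts is an assumption of this sketch.
    """
    return [a["Id"] for a in accounts if a.get("Status") == "ACTIVE"]


def list_organization_accounts(management_account_id):
    """Assume the management role and return the active account IDs."""
    import boto3  # imported lazily; only needed when calling AWS

    creds = boto3.client("sts").assume_role(
        RoleArn=management_role_arn(management_account_id),
        RoleSessionName="list-org-accounts",  # hypothetical session name
    )["Credentials"]
    org = boto3.client(
        "organizations",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    accounts = []
    for page in org.get_paginator("list_accounts").paginate():
        accounts.extend(page["Accounts"])
    return active_account_ids(accounts)
```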

Step 3: Create an IAM role named ssmLambdaRole in each account of the organization for Lambda

Download the Step3.yaml CloudFormation template. This CloudFormation template will create an IAM policy named “MyPolicy” and an IAM role named “MyRole” with the specified permissions and trust policy, respectively. Please make sure to replace <Parent/Payer Account id> with the actual ID of the parent/payer account. Please note that the AWS::AccountId pseudo parameter is only available within the AWS CloudFormation service. If you’re deploying the stack using an external tool or framework, you may need to find an equivalent method to retrieve and substitute the account ID.
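To make the shape of this role more concrete, here is a hedged sketch of the two policy documents such a role would typically need. It is not a copy of Step3.yaml: the trusted principal (the LambdaSsmOsDetailRole from Step 1) and the exact statement layout are assumptions, while the four actions are the API calls listed in the solution overview.

```python
import json

# API calls named in the solution overview.
READ_ONLY_ACTIONS = [
    "ec2:DescribeInstances",
    "ec2:DescribeImages",
    "ec2:DescribeRegions",
    "ssm:DescribeInstanceInformation",
]


def trust_policy(deployment_account_id,
                 lambda_role_name="LambdaSsmOsDetailRole"):
    """Trust policy letting the Lambda role from Step 1 assume this role."""
    principal = f"arn:aws:iam::{deployment_account_id}:role/{lambda_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal},
            "Action": "sts:AssumeRole",
        }],
    }


def permissions_policy():
    """Read-only permissions used to describe instances in the account."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": READ_ONLY_ACTIONS,
            "Resource": "*",
        }],
    }


if __name__ == "__main__":
    # 999999999999 is a placeholder account ID.
    print(json.dumps(trust_policy("999999999999"), indent=2))
    print(json.dumps(permissions_policy(), indent=2))
```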

Use this CloudFormation template to create a stack set with service-managed permissions, which creates the role in all target accounts. You can use stack sets to create the stack for your entire organization or only for the OUs that you specify.

To create a stack set for your organization

  1. Sign in to the AWS Management Console as the management account for your organization.
  2. Open the AWS CloudFormation console.
  3. If you haven’t already, in the Region selector, choose the same AWS Region that you used in the previous procedure.
  4. In the navigation pane, choose StackSets.
  5. Choose Create StackSet.
  6. On the Choose a template page, keep the defaults for the following options:
    • For Permissions, keep Service-managed permissions.
    • For Prerequisite – Prepare template, keep Template is ready.
  7. Under Specify template, choose Upload a template file, and then select Choose file.
  8. Choose the file and then choose Next.
  9. On the Specify StackSet details page:
    • Enter a stack name such as LamdaSSMRole, and enter a description.
    • For parameter SsmLambdaRoleName, specify the same name that you specified in Step1.yaml for parameter SsmLambdaRoleName.
    • For parameter DeploymentAccountID, specify the ID of the account where you deployed Step1.yaml, and then choose Next.
  10. On the Configure StackSet options page, keep the default options and then choose Next.
  11. On the Set deployment options page, for Add stacks to stack set, keep the default Deploy new stacks option.
  12. For Deployment targets, choose if you want to create the stack for the entire organization or specific OUs. If you choose an OU, enter the OU ID.
  13. For Specify regions, choose only one region. IAM is a global service, so deploying the same role into multiple regions will result in a failure.
  14. For Deployment options, for Failure tolerance – optional, enter the number of accounts where the stacks can fail before CloudFormation stops the operation. We recommend that you enter the number of accounts that you want to add, minus one. For example, if your specified OU has 10 member accounts, enter 9. This means that even if CloudFormation fails the operation 9 times, at least one account will succeed.
  15. Choose Next.
  16. On the Review page, review your options, and then choose Submit. You can check the status of your stack on the Stack instances tab.
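For readers who prefer scripting over the console, the same procedure can be approximated with the CloudFormation API. The sketch below only builds the request parameters and then passes them to `create_stack_set` and `create_stack_instances`. The function names are hypothetical, and enabling automatic deployment and passing `CAPABILITY_NAMED_IAM` are assumptions about what the template needs.

```python
def stack_set_request(name, template_body, role_name, deployment_account_id):
    """Parameters for cloudformation:CreateStackSet (service-managed)."""
    return {
        "StackSetName": name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "SsmLambdaRoleName",
             "ParameterValue": role_name},
            {"ParameterKey": "DeploymentAccountID",
             "ParameterValue": deployment_account_id},
        ],
        # Assumed: the template creates a named IAM role.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
        "PermissionModel": "SERVICE_MANAGED",
        # Assumed: deploy automatically to accounts that join later.
        "AutoDeployment": {"Enabled": True,
                           "RetainStacksOnAccountRemoval": False},
    }


def stack_instances_request(name, ou_ids, region, account_count):
    """Parameters for cloudformation:CreateStackInstances.

    Uses a single region (IAM is global) and a failure tolerance of
    "number of accounts minus one", as recommended in the steps above.
    """
    return {
        "StackSetName": name,
        "DeploymentTargets": {"OrganizationalUnitIds": list(ou_ids)},
        "Regions": [region],
        "OperationPreferences": {
            "FailureToleranceCount": max(account_count - 1, 0),
        },
    }


def deploy(name, template_body, role_name, deployment_account_id,
           ou_ids, region, account_count):
    """Create the stack set and its stack instances from the management account."""
    import boto3  # imported lazily; only needed when calling AWS

    cfn = boto3.client("cloudformation", region_name=region)
    cfn.create_stack_set(**stack_set_request(
        name, template_body, role_name, deployment_account_id))
    cfn.create_stack_instances(**stack_instances_request(
        name, ou_ids, region, account_count))
```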

After CloudFormation creates the stacks, each member account can sign in to the IAM console and verify that the role has been created.

If you are using AWS Organizations to deploy the stack set, you will also need to deploy the same CloudFormation template as a regular stack in the same region, so that a similar IAM role is created in the management account as well.

If you do not have an organization, you need to deploy the CloudFormation template in each account to gather details of EC2 instances from each account.

Since stack sets do not deploy the CloudFormation template in the management account, please deploy the Step3.yaml CloudFormation template there as a normal stack. For details on deploying a CloudFormation template, see Creating a stack on the AWS CloudFormation console.

Step 4: Configure test event for Lambda function

A Lambda function named LambdaSsmOsFunction is created in the account where the CloudFormation template from Step 1 is deployed. Using this function, we will get the details of all the EC2 instances across different regions and accounts, based on the test event details.

The format required for the test event is:

{
    "region": [],
    "accountids": [],
    "bucket": "bucket-name",
    "managementAccountId": ""
}

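For region, specify the list of regions from which you want to get the details. To get details from all regions, keep it as an empty array.

For accountids, specify the list of accounts from which you want to get the details. To get details from all accounts in the organization, keep it as an empty array.

For bucket, specify the name of the S3 bucket where the report should be saved. The report will be saved as ‘ssm/<YYYY-MM-DD>/instanceReport.csv’ in the S3 bucket. Please make sure to specify the same bucket that was specified when deploying the CloudFormation template in Step 1.

For managementAccountId, specify the account ID of the management account when executing the function from a member account. Keep it blank when executing the function from the management account.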
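Before saving a test event, it can help to check it against the format described above. The following sketch is a hypothetical helper, not part of the deployed function: `validate_event` and `report_key` are names introduced here, and using the local calendar date for the key is an assumption.

```python
from datetime import date


def report_key(day=None):
    """Return the S3 key of the report for a given day.

    Follows the ssm/<YYYY-MM-DD>/instanceReport.csv pattern; defaulting to
    today's local date is an assumption of this sketch.
    """
    day = day or date.today()
    return f"ssm/{day.isoformat()}/instanceReport.csv"


def validate_event(event):
    """Return a list of problems found in a test event (empty if none)."""
    errors = []
    for field in ("region", "accountids"):
        value = event.get(field, [])
        is_string_list = isinstance(value, list) and all(
            isinstance(item, str) for item in value)
        if not is_string_list:
            errors.append(f"{field} must be a list of strings")
    bucket = event.get("bucket")
    if not isinstance(bucket, str) or not bucket:
        errors.append("bucket must be a non-empty string")
    if not isinstance(event.get("managementAccountId", ""), str):
        errors.append("managementAccountId must be a string")
    return errors
```

For example, `validate_event({"region": [], "accountids": [], "bucket": "bucket-name", "managementAccountId": ""})` returns an empty list, while an event without a bucket returns one error message.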

Here are some examples that demonstrate how to get the required details from specific accounts or regions using the Lambda function.

Example 1: To get details from all the accounts belonging to the organization from all regions, when executing the function from a sub-account, we need to pass blank values for accountids and region, and the account ID of the management account under managementAccountId, such as

{
  "region": [],
  "accountids": [],
  "bucket": "bucket-name",
  "managementAccountId": "9999999999"
}

Example 2: To get details from accounts 1111111111 and 2222222222 from the Mumbai and Singapore regions, when executing the function from the management account, we can set the following values in the event

{
  "region": ["ap-south-1","ap-southeast-1"], 
  "accountids": ["1111111111","2222222222"],
  "bucket": "bucket-name",
  "managementAccountId": ""
}

Example 3: To get details from accounts 1111111111 and 2222222222 from all regions, when executing the function from the management account, we can set the following values in the event

{
  "region": [],  
  "accountids": ["1111111111","2222222222"],
  "bucket": "bucket-name",
  "managementAccountId": ""
}

Example 4: To get details from all the accounts belonging to the organization from the Mumbai and Singapore regions, when executing the function from the management account, we need to pass blank values for accountids and managementAccountId, such as

{
  "region": ["ap-south-1","ap-southeast-1"], 
  "accountids": [],
  "bucket": "bucket-name",
  "managementAccountId": ""
}

Example 5: To get details from all the accounts belonging to the organization from the Mumbai and Singapore regions, when executing the function from a sub-account, we need to pass a blank value for accountids and the account ID of the management account under managementAccountId, such as the following

{
  "region": ["ap-south-1","ap-southeast-1"], 
  "accountids": [],
  "bucket": "bucket-name",
  "managementAccountId": "9999999999"
}

Save the test event as per your requirement and execute the Lambda function. The report will be stored in the S3 bucket passed in the test event as ssm/<YYYY-MM-DD>/instanceReport.csv.
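The same events can also be built and sent programmatically instead of through the console test feature. The sketch below is illustrative: `build_event` and `invoke_report` are hypothetical helpers, and a synchronous invocation is assumed.

```python
import json


def build_event(bucket, regions=None, account_ids=None,
                management_account_id=""):
    """Build an event in the format shown in the examples above.

    Empty lists mean "all regions" or "all accounts" respectively.
    """
    return {
        "region": list(regions or []),
        "accountids": list(account_ids or []),
        "bucket": bucket,
        "managementAccountId": management_account_id,
    }


def invoke_report(event, function_name="LambdaSsmOsFunction"):
    """Invoke the report function synchronously and return the status code."""
    import boto3  # imported lazily; only needed when calling AWS

    response = boto3.client("lambda").invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    return response["StatusCode"]


if __name__ == "__main__":
    # Equivalent to Example 2: two accounts, Mumbai and Singapore regions.
    event = build_event(
        "bucket-name",
        regions=["ap-south-1", "ap-southeast-1"],
        account_ids=["1111111111", "2222222222"],
    )
    print(json.dumps(event, indent=2))
```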

Step 5: (Optional) Set the run schedule for the Lambda function.

You can configure this Lambda function to run weekly or at your desired frequency using EventBridge and cron syntax. Refer to the steps below:

  1. Open the AWS Management Console and go to the EventBridge service.
  2. Click on “Rules” and select the rule that triggers your Lambda function or create a new rule.
  3. Configure the rule with a name, description, and state (enabled).
  4. Select “Schedule” as the rule type and use a cron expression for the desired interval.
  5. Add your existing Lambda function as the target.
  6. Configure the target with the necessary settings, including an empty JSON object as the input.
  7. Assign the appropriate IAM role to the rule to ensure the Lambda function has the necessary permissions.
  8. Save the rule, and your existing Lambda function will now run according to the specified cron expression.
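The console steps above can be approximated with the EventBridge and Lambda APIs. In this sketch the rule name, target ID, and statement ID are hypothetical, the example cron expression `0 6 ? * MON *` (Mondays at 06:00 UTC) is only an illustration, and invoke permission is granted through the function's resource policy with `add_permission`, which is an assumption about how the permission requirement is met.

```python
import json


def rule_request(name, cron_expression):
    """Parameters for events:PutRule with a cron schedule."""
    return {
        "Name": name,
        "ScheduleExpression": f"cron({cron_expression})",
        "State": "ENABLED",
    }


def target_request(rule_name, function_arn, event=None):
    """Parameters for events:PutTargets.

    The input defaults to an empty JSON object, as in the steps above.
    """
    return {
        "Rule": rule_name,
        "Targets": [{
            "Id": "fleet-report-lambda",  # hypothetical target ID
            "Arn": function_arn,
            "Input": json.dumps(event or {}),
        }],
    }


def schedule_report(function_arn, rule_name="weekly-fleet-report",
                    cron_expression="0 6 ? * MON *", event=None):
    """Create the rule, allow it to invoke the function, and attach the target."""
    import boto3  # imported lazily; only needed when calling AWS

    events = boto3.client("events")
    rule_arn = events.put_rule(
        **rule_request(rule_name, cron_expression))["RuleArn"]
    boto3.client("lambda").add_permission(
        FunctionName=function_arn,
        StatementId="allow-eventbridge-fleet-report",  # hypothetical ID
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    events.put_targets(**target_request(rule_name, function_arn, event))
    return rule_arn
```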

(Optional) Extend capabilities for improved analysis and visualizations

By combining Amazon Athena (serverless query service that enables ad hoc analysis) and Amazon QuickSight (business intelligence tool that offers data visualization and reporting capabilities), you can effortlessly analyze CSV reports in S3 using Athena’s querying capabilities and then create interactive visualizations and dashboards in QuickSight, allowing for data-driven insights.

Ad-hoc analysis using Amazon Athena

You can perform ad-hoc analysis on the data in the CSV file created in Amazon S3. To configure Athena for ad-hoc queries on the CSV data, follow the steps below:

  1. Open the AWS Management Console and navigate to the Athena service.
  2. In the Athena Query Editor, switch to the SQL Editor view.
  3. Execute the following SQL statement to create a new database:

CREATE DATABASE your_database_name;

Replace your_database_name with the desired name for your database.
Once the SQL statement is executed successfully, the database will be created.

  4. To verify the creation of the database, you can execute the following SQL statement to list all the databases:

SHOW DATABASES;

  5. In Athena, navigate to the “Query Editor” and execute the following SQL statement to create an external table that points to your CSV file in S3:

CREATE EXTERNAL TABLE IF NOT EXISTS your_table_name (
column1 datatype1,
column2 datatype2,
...
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3://your-bucket-name/your-folder-path/';

Replace your_table_name with the desired name for your table, specify the column names and their corresponding data types, and provide the S3 location of your CSV file in place of ‘s3://your-bucket-name/your-folder-path/’.

  6. Once the table is created, you can run ad hoc queries on the data in Athena using SQL. For example:

SELECT * FROM your_table_name WHERE column1 = 'some_value';

Replace your_table_name with the actual name of your table and adjust the query as needed to filter, aggregate, or perform other operations on the data.

Execute the query and view the results in the Athena query results view. You can export the results or integrate Athena with other AWS services for further analysis if desired.

Remember to configure the necessary permissions for S3 and Athena, and ensure that you have the required access to perform these steps successfully.
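If you would rather script these statements than type them into the query editor, the following sketch generates the table definition and submits it through the Athena API. The table name, the column list, and the output location used in the example are hypothetical, since the real columns depend on the CSV produced by the Lambda function.

```python
def create_table_ddl(table, columns, location):
    """Render a CREATE EXTERNAL TABLE statement for a comma-delimited file.

    `columns` is a list of (name, athena_type) pairs.
    """
    column_sql = ",\n".join(f"  {name} {dtype}" for name, dtype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"{column_sql}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        f"LOCATION '{location}'"
    )


def run_query(sql, database, output_location):
    """Submit a statement to Athena and return its query execution ID."""
    import boto3  # imported lazily; only needed when calling AWS

    response = boto3.client("athena").start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical columns; adjust them to match the generated report.
    ddl = create_table_ddl(
        "fleet_report",
        [("account_id", "string"), ("region", "string"),
         ("instance_id", "string"), ("source", "string")],
        "s3://your-bucket-name/your-folder-path/",
    )
    print(ddl)
```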

Visualization using BI tools – QuickSight with Athena table

Here are the steps to create a QuickSight analysis and dashboard using an Athena table:

  1. Ensure you have set up and configured Amazon Athena and QuickSight in your AWS account.
  2. In the QuickSight home page, click on “New analysis” to start creating a new analysis.
  3. In the “Choose a data source” step, select “Athena.”
  4. Select the Athena database that contains the table you want to use for analysis.
  5. Choose the specific table from the database that you want to analyze.
  6. QuickSight will detect the table schema and provide a preview of the data.
  7. Click on “Visualize” to proceed to the analysis builder.
  8. In the analysis builder, you can drag and drop fields from the table onto the canvas to create visualizations. Choose appropriate chart types, filters, and aggregations based on your data and analysis requirements.
  9. Customize the visualizations by modifying colors, labels, legends, and other settings.
  10. Add additional visualizations as needed to build a comprehensive analysis.
  11. Once you’re satisfied with the analysis, save it.
  12. To create a dashboard, go to the QuickSight home page and click on “New dashboard”.
  13. In the dashboard builder, drag and drop the saved visualizations from the analysis onto the canvas.
  14. Customize the layout, add text boxes, images, and other elements to create an interactive and informative dashboard.
  15. Configure filters and other interactivity options to enable dynamic data exploration.
  16. Save and publish the dashboard to make it accessible to other QuickSight users.

The example visual report shown in Figure 2 helps you visualize the distribution of managed and unmanaged instances across regions and accounts, providing valuable information for decision-making and further actions.

Athena sample image

Figure 2. Count of Records by Region, AccountId, and Information Source.

Column SSM means that the instance is managed via Systems Manager and EC2 means that it is not managed via Systems Manager.
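The same breakdown can be computed directly from the CSV report without a BI tool. The sketch below is a small illustration using only the standard library. The column names `Region`, `AccountId`, and `Source`, and the values `SSM` and `EC2` in the sample data, are assumptions about the report layout.

```python
import csv
import io
from collections import Counter


def count_by_source(csv_text, keys=("Region", "AccountId", "Source")):
    """Count report rows per (region, account, information source)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(tuple(row[key] for key in keys) for row in reader)


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    sample = (
        "AccountId,Region,InstanceId,Source\n"
        "1111111111,ap-south-1,i-aaa,SSM\n"
        "1111111111,ap-south-1,i-bbb,EC2\n"
        "1111111111,ap-south-1,i-ccc,SSM\n"
    )
    for group, count in sorted(count_by_source(sample).items()):
        print(group, count)
```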

Cost Estimation

You will be charged only for running the Lambda function and for storing the output file in the S3 bucket.

For details on pricing of using Lambda, please see AWS Lambda Pricing.

For details on pricing of storing an object in S3, please see Storage Pricing.

Cleanup

In this blog, we created multiple resources via AWS CloudFormation. It is easy to clean up everything in the same way it was created.

In Step 3, an AWS CloudFormation stack set was created. To delete the resources belonging to the CloudFormation stack set, please see Delete a stack set using the AWS Management Console.

After the AWS CloudFormation stack set is cleaned up, you can proceed with the clean up of resources created as AWS CloudFormation stacks. Please delete the CloudFormation stacks in following sequence:

  1. Step 3
  2. Step 2
  3. And finally, Step 1.

For more details, please see Deleting a stack on the AWS CloudFormation console.
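The deletion sequence above can also be scripted. This is a minimal sketch that covers only the regular stacks, assuming the stack set has already been removed. The function name and the stack names in the usage example are hypothetical, and the client is passed in so that the ordering logic can be exercised without touching AWS.

```python
def delete_stacks_in_order(cfn, stack_names):
    """Delete stacks one at a time, waiting for each deletion to finish.

    `cfn` is a CloudFormation client (for example boto3.client("cloudformation"))
    and `stack_names` lists the stacks in the order they should be deleted.
    """
    deleted = []
    for name in stack_names:
        cfn.delete_stack(StackName=name)
        # Block until CloudFormation reports the stack as deleted.
        cfn.get_waiter("stack_delete_complete").wait(StackName=name)
        deleted.append(name)
    return deleted


if __name__ == "__main__":
    import boto3

    # Hypothetical stack names for the Step 3, Step 2, and Step 1 templates.
    # The stacks may live in different accounts, in which case each call
    # needs a client created with credentials for the right account.
    client = boto3.client("cloudformation")
    delete_stacks_in_order(
        client, ["step3-stack", "step2-stack", "step1-stack"])
```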

Conclusion

To summarize, the automation eliminated the need for manual data collection and analysis, enabling Delhivery’s teams to focus on more strategic initiatives. Moreover, the reusable nature of the automation meant that the solution could be applied across multiple AWS accounts, providing consistent and efficient management of virtual instances. This scalability and reusability further contributed to time and cost savings for Delhivery.

For more details on enterprise support services, visit our public documentation to learn and unlock consultative guidance tailored to your applications and use-cases, maximizing value from AWS.

AWS SERVICES: AWS Enterprise Support, AWS Systems Manager, AWS Lambda, Amazon Simple Storage Service (Amazon S3), Amazon Elastic Compute Cloud (Amazon EC2), AWS Identity and Access Management (IAM), AWS Organizations

About the authors:

Ankit Agrawal

Ankit Agrawal is a Senior Technical Account Manager at AWS. He specializes in cloud solutions. He provides customers with in-depth guidance on AWS services, leveraging his knowledge of cloud architecture and best practices. Ankit collaborates with cross-functional teams to ensure seamless project execution with operational excellence.

Vinay Srivastava

Vinay Srivastava is a Cloud Support Engineer on the Linux team at AWS Premium Support. He specializes in automation. In his role, he enjoys helping customers by providing simple, automated solutions to complex scenarios. In his free time, he enjoys spending time with friends and family.

Vinay Mishra

Vinay Mishra is an Engineering Manager for DevOps at Delhivery. He is passionate about automation and applies what he learns to make data readily available in a multi-account, multi-region setup, which saves the team’s bandwidth and ultimately optimizes regular cloud operations tasks.