How can I automatically evaluate and extend the volumes on an Amazon EC2 instance when free disk space is low?

Last updated: 2022-05-17

I want to see if the volumes attached to my Amazon Elastic Compute Cloud (Amazon EC2) instances need to be extended. Also, extending partitions and file systems at the operating system (OS) level is a time-consuming operation. How can I automate the whole process?

Short description

You can use a set of AWS Systems Manager Automation documents to evaluate and extend Amazon Elastic Block Store (Amazon EBS) volumes. The Automation documents work in unison, allowing you to investigate and optionally remediate low free disk space on an Amazon EC2 instance.

The AWSPremiumSupport-TroubleshootEC2DiskUsage Automation document orchestrates the run of the other Systems Manager documents, based on the OS type.

The first set of documents performs basic diagnostics and evaluates whether it's possible to mitigate the issue by expanding the volume size:

  • AWSPremiumSupport-DiagnoseDiskUsageOnWindows
  • AWSPremiumSupport-DiagnoseDiskUsageOnLinux

The second set of documents takes the output of the first document and runs Python code to perform the volume modification. Then, the automation accesses the instance and extends the partition and file system of the volumes:

  • AWSPremiumSupport-ExtendVolumesOnWindows
  • AWSPremiumSupport-ExtendVolumesOnLinux

Use the following steps to set up the required permissions and run the Automation document.

Resolution

Grant permissions

You must grant the following permissions to use the Automation documents.

If you haven't already done so, create an AWS Identity and Access Management (IAM) instance profile for Systems Manager. Then, attach it to the target instance.

To set up the assume role, whose ARN you specify in the AutomationAssumeRole parameter when you configure the Automation document, follow these steps:

1.    Create an IAM policy. On the JSON tab, enter the following JSON policy document:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeVolumesModifications",
        "ec2:ModifyVolume",
        "ec2:DescribeInstances",
        "ec2:CreateImage",
        "ec2:DescribeImages",
        "ec2:DescribeTags",
        "ec2:CreateTags",
        "ec2:DeleteTags"
      ],
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "iam:PassRole"
      ],
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "ssm:StartAutomationExecution",
        "ssm:GetAutomationExecution",
        "ssm:DescribeAutomationStepExecutions",
        "ssm:DescribeAutomationExecutions"
      ],
      "Resource": "*",
      "Effect": "Allow"
    },
    {
      "Action": [
        "ssm:SendCommand",
        "ssm:DescribeInstanceInformation",
        "ssm:ListCommands",
        "ssm:ListCommandInvocations"
      ],
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}

2.    Create the assume role and attach the policy created in the previous step.

3.    In the policy, modify the following iam:PassRole statement. Replace the "*" value of "Resource" with the ARN of the assume role.

{
  "Action": [
    "iam:PassRole"
  ],
  "Resource": "*",
  "Effect": "Allow"
},
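The change in step 3 can also be applied programmatically before you create the policy. The following is a minimal Python sketch, using only the standard library, that scopes the iam:PassRole statement to a single role. The function name restrict_pass_role and the example role ARN are hypothetical placeholders and aren't part of the Automation documents.

```python
import copy
import json


def restrict_pass_role(policy: dict, role_arn: str) -> dict:
    """Return a copy of the policy with iam:PassRole limited to role_arn."""
    scoped = copy.deepcopy(policy)  # leave the caller's policy untouched
    for statement in scoped["Statement"]:
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # "Action" can be a single string
            actions = [actions]
        # Assumes iam:PassRole sits in its own statement, as in the policy
        # above; otherwise the new Resource applies to the other actions too.
        if "iam:PassRole" in actions:
            statement["Resource"] = role_arn
    return scoped


# Abbreviated version of the policy from step 1, for illustration only.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Action": ["ec2:ModifyVolume"], "Resource": "*", "Effect": "Allow"},
        {"Action": ["iam:PassRole"], "Resource": "*", "Effect": "Allow"},
    ],
}

# Hypothetical role ARN; replace it with the ARN of your assume role.
example_arn = "arn:aws:iam::111122223333:role/ExampleAutomationRole"
scoped_policy = restrict_pass_role(example_policy, example_arn)
print(json.dumps(scoped_policy, indent=2))
```

Only the iam:PassRole statement changes. The other statements keep "Resource": "*".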

Run the Automation document

To use the set of Systems Manager Automation documents, you need to run only the initial AWSPremiumSupport-TroubleshootEC2DiskUsage document. Follow these steps:

1.    Open the Systems Manager console, and then choose Automation from the navigation pane.

2.    Choose Execute automation.

3.    Select the radio button for AWSPremiumSupport-TroubleshootEC2DiskUsage, and then choose Next.

4.    For Execute automation document, select Simple execution.

5.    Under Input parameters:

For InstanceId, enter your Amazon EC2 instance ID.

For AutomationAssumeRole, enter the ARN of the role that allows the Automation to perform the actions on your behalf. This is the assume role that you created when granting permissions.

6.    (Optional) Under Input parameters, specify the following inputs if your requirements differ from the default values:

VolumeExpansionEnabled: Controls whether the document will extend the affected volumes and partitions (default: True)

VolumeExpansionUsageTrigger: Minimum percentage of used partition space required to trigger expansion (default: 85)

VolumeExpansionCapSize: Maximum size in GiB that the EBS volume will increase to (default: 2048)

VolumeExpansionGibIncrease: Volume increase in GiB (default: 20)

VolumeExpansionPercentageIncrease: Volume increase in percentage (default: 20)

7.    Choose Execute.

The console displays the Automation status.
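The console steps above can also be scripted. The following sketch assumes that the AWS SDK for Python (Boto3) is installed and that credentials and a Region are configured. It calls the Systems Manager StartAutomationExecution API, which expects each parameter value as a list of strings. The helper names, the instance ID, and the role ARN are hypothetical placeholders.

```python
def build_parameters(instance_id, assume_role_arn, **overrides):
    """Build the Parameters map for StartAutomationExecution."""
    params = {
        "InstanceId": [instance_id],
        "AutomationAssumeRole": [assume_role_arn],
    }
    # Optional inputs such as VolumeExpansionUsageTrigger=90.
    # Every value is sent as a list containing one string.
    for name, value in overrides.items():
        params[name] = [str(value)]
    return params


def start_disk_usage_automation(instance_id, assume_role_arn, **overrides):
    """Start the Automation and return its execution ID."""
    import boto3  # imported here so build_parameters works without Boto3

    ssm = boto3.client("ssm")
    response = ssm.start_automation_execution(
        DocumentName="AWSPremiumSupport-TroubleshootEC2DiskUsage",
        Parameters=build_parameters(instance_id, assume_role_arn, **overrides),
    )
    return response["AutomationExecutionId"]


# Hypothetical identifiers; replace them with your own values.
example_parameters = build_parameters(
    "i-0123456789abcdef0",
    "arn:aws:iam::111122223333:role/ExampleAutomationRole",
    VolumeExpansionUsageTrigger=90,
)
print(example_parameters)
```

To start the Automation, call start_disk_usage_automation with the same arguments. You can then pass the returned execution ID to the GetAutomationExecution API to check the status.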

Example

Your current volume is 30 GB and has 4 GB free, which means that you have 26 GB of used space. You specify the following input parameters:

  • VolumeExpansionUsageTrigger: 85
  • VolumeExpansionGibIncrease: 10
  • VolumeExpansionPercentageIncrease: 15
  • VolumeExpansionCapSize: 2048

Outcome:

The expansion triggers because 26 GB of used space is about 87% of the 30 GB volume, which is above the 85% threshold specified for VolumeExpansionUsageTrigger.

The volume increases by 10 GB. This is because you specified that the volume should increase by either 10 GB or by 15% of the current volume size (15% of 30 GB is 4.5 GB). The Automation document uses the larger of the two increases specified by VolumeExpansionGibIncrease and VolumeExpansionPercentageIncrease.

The new volume size is 40 GB, which is within the VolumeExpansionCapSize of 2048 GiB.
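The arithmetic in this example can be reproduced with a short Python sketch. This isn't the code that the Automation documents run. It only mirrors the rules described above, and it assumes that a result above VolumeExpansionCapSize is clamped to the cap. The function name plan_expansion is hypothetical.

```python
def plan_expansion(size_gib, used_gib, usage_trigger=85, gib_increase=20,
                   percentage_increase=20, cap_size=2048):
    """Return the planned volume size in GiB under the rules described above."""
    used_percent = used_gib / size_gib * 100
    if used_percent < usage_trigger:
        return size_gib  # usage is below the trigger, so nothing changes

    # Use the larger of the fixed increase and the percentage increase.
    increase = max(gib_increase, size_gib * percentage_increase / 100)

    # Assumption: the new size is clamped to the cap and never shrinks.
    return max(size_gib, min(size_gib + increase, cap_size))


# Values from the example: 30 GB volume with 26 GB used.
print(plan_expansion(30, 26, usage_trigger=85, gib_increase=10,
                     percentage_increase=15, cap_size=2048))  # prints 40
```

With the same inputs but only 20 GB used, usage is about 67%, which is below the trigger, so the function returns the original size of 30.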