AWS Storage Blog

Configuring the auto-expansion of Amazon FSx for OpenZFS with Amazon CloudWatch and AWS Lambda

Today’s database, rendering farm, analytics, and ML workloads have increasingly demanding IO requirements. These workloads need a reliable storage infrastructure that provides sufficient storage capacity, IOPS, and throughput. As customers move more workloads to the cloud, they want to benefit from the agility and performance capabilities of the cloud as their workloads and business grow. To realize the full benefits of the cloud, storage needs to expand elastically as data grows. Otherwise, your storage system might reach full capacity, resulting in performance bottlenecks and production interruptions.

To support these workloads, many users rely on OpenZFS to manage large quantities of data. Amazon’s implementation of the popular OpenZFS file system delivers additional benefits, including automation tools to expand the file system as your needs grow. Amazon FSx for OpenZFS is a fully managed file storage service that makes it easy to quickly provision shared file storage or re-platform data storage residing on on-premises ZFS. FSx for OpenZFS enables customers to onboard to AWS without changing how their applications access files or how they manage data access for those applications. It also lets you scale capacity, throughput, and IOPS independently of one another, when and where you need it. FSx for OpenZFS is monitored by Amazon CloudWatch, which provides you with data and actionable insights to monitor your applications, respond to system-wide performance changes, and optimize resource utilization. AWS Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or back-end service without provisioning or managing servers.

In this post, I start by discussing the benefits of Amazon FSx for OpenZFS. Then I show you how to monitor your file system with Amazon CloudWatch. Finally, I take you through an example of how to configure an AWS CloudFormation template and use AWS Lambda to automate the expansion of your file system based on user-configured conditions. This auto-expansion solution helps your OpenZFS file system run at optimized performance, without issues caused by a lack of storage headroom.

Amazon FSx for OpenZFS benefits

Amazon FSx for OpenZFS offers a cost-effective solution for ZFS or other Linux file servers that need NFS accessibility, and it provides several features that help you optimize costs. Strategies that you can use to manage the costs of Amazon FSx for OpenZFS include:

  • Amazon FSx for OpenZFS enables you to independently set your storage and performance capacity. This provides you with the ability to provision only the storage capacity, throughput, and SSD IOPS that you need, and scale these at any time to adapt as your needs evolve.
  • Amazon FSx for OpenZFS supports Z-Standard and LZ4 compression technologies. Enabling compression on workloads that benefit from data compression helps you reduce your storage and backup costs.
  • Amazon FSx for OpenZFS supports multiple data containers (volumes) per file system, thin provisioning, and user or group storage quotas, enabling you to efficiently support multiple teams, applications, and use cases within a single file system.
  • You can set user storage quotas on your file systems to limit the data storage that users can consume. For more information, see the documentation on updating a volume configuration.
  • Amazon FSx for OpenZFS provides storage-efficient, near instant, point-in-time volume snapshots that are stored directly within your file system.

Monitoring Amazon FSx for OpenZFS

File systems running IO-intensive workloads should be continuously monitored to ensure that they have sufficient capacity at all times. You can define a mechanism to monitor the utilization of your file system and automatically expand the storage capacity as it gets close to full. I will take you through a process to automate storage capacity growth using common AWS services such as Amazon CloudWatch and AWS Lambda. The benefits include automating the elastic expansion of your file system to ensure the performance and reliability of your workload as it grows to accommodate your business.

To help you manage the file system, Amazon CloudWatch alarms can be set up to notify storage administrators when file system storage utilization reaches a specific threshold. However, storage administrators might overlook these alarms, which can result in a file system reaching full capacity or in performance bottlenecks that cause production interruptions.

The solution proposed in this blog automates the process of expanding the file system when a threshold has been reached using Amazon CloudWatch Alarms, Amazon Simple Notification Service (Amazon SNS), and AWS Lambda.

When updating the storage of an FSx for OpenZFS file system, please consider the following:

  • A file system cannot be scaled until 6 hours have elapsed since the last IOPS and/or capacity scaling event. This is sometimes referred to as a cooldown period.
  • If the file system is using “User-provisioned IOPS mode”, then the number of IOPS must be greater than or equal to 3 IOPS per GiB of storage capacity – The Lambda function addresses this requirement.
  • The file system must be increased by a minimum of 10% of the file system’s current storage capacity.
  • The storage capacity can only be increased, but it cannot be decreased.
  • At the time of this writing, the maximum size of an FSx for OpenZFS file system is 512 TiB.

Keep the preceding considerations in mind when defining the percentage by which to increase the storage capacity, so that enough storage is provisioned between cooldown periods. Please see this documentation for further storage scaling considerations.

Amazon FSx for OpenZFS auto-expansion solution architecture

The following diagram, “Amazon FSx for OpenZFS auto-expansion solution architecture,” shows the solution architecture and the components used for the automation.

The architecture diagram presents the following flow:

  1. An Amazon FSx for OpenZFS file system reports UsedStorageCapacity and StorageCapacity metrics to Amazon CloudWatch.
  2. CloudWatch uses these reported metrics to calculate the percentage of utilized file system storage using the following formula: 100 * (UsedStorageCapacity / StorageCapacity). If the result of this formula exceeds the defined used storage threshold, a CloudWatch alarm is triggered.
  3. The CloudWatch alarm will trigger an action that will send a notification using Amazon SNS to AWS Lambda.
  4. The AWS Lambda function increases the storage capacity of the Amazon FSx for OpenZFS file system by the pre-defined percentage. If the file system IOPS mode is User Provisioned, the function also checks that the provisioned IOPS are greater than or equal to 3 IOPS per GiB of storage capacity. Finally, the function sets the alarm state to OK.
  5. Amazon SNS sends a notification to the customer that a storage capacity upgrade has been processed.
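Steps 2 and 3 of this flow can be sketched in Python. The sketch below only illustrates the shape of a metric math alarm and of the SNS event that Lambda receives; it is not taken from the solution's template. The function names, the alarm name, the 60-second period, and the Maximum statistic are assumptions. The metric names and the formula come from the flow above, and the dictionary layout follows the CloudWatch PutMetricAlarm request format.

```python
import json


def build_alarm_definition(file_system_id, threshold_percent, topic_arn):
    """Keyword arguments for a CloudWatch metric math alarm (illustrative)."""

    def fsx_metric(query_id, metric_name):
        # One raw input metric; it feeds the expression and is not alarmed on.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/FSx",
                    "MetricName": metric_name,
                    "Dimensions": [
                        {"Name": "FileSystemId", "Value": file_system_id}
                    ],
                },
                "Period": 60,       # assumption
                "Stat": "Maximum",  # assumption
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"fsx-used-storage-{file_system_id}",  # hypothetical name
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": threshold_percent,
        "EvaluationPeriods": 1,
        "AlarmActions": [topic_arn],
        "Metrics": [
            fsx_metric("used", "UsedStorageCapacity"),
            fsx_metric("total", "StorageCapacity"),
            {
                "Id": "pct",
                "Expression": "100 * (used / total)",
                "Label": "Used storage (%)",
                "ReturnData": True,  # the alarm evaluates this expression
            },
        ],
    }


def alarm_from_sns_event(event):
    """Extract the alarm name and new state from an SNS-delivered Lambda event."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message["AlarmName"], message["NewStateValue"]
```

A definition like this could be passed to boto3 as cloudwatch.put_metric_alarm(**definition). Inside the Lambda handler, alarm_from_sns_event(event) would tell the function which alarm fired before it resizes the file system and resets the alarm state.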

Amazon FSx for OpenZFS auto-expansion solution architecture

Scenario and automation workflow

Let’s take a look at how the solution works in a real-world environment. For the purposes of our example, the following conditions show the workflow and outcome that we expect to deliver with the solution:

  1. A user has provisioned an Amazon FSx for OpenZFS file system with 1 TiB of storage capacity.
  2. The user then deploys the provided CloudFormation template to manage the storage capacity of the file system with the following values:
    • FileSystemId: File system ID of the Amazon FSx for OpenZFS file system provisioned in step 1.
    • UsedStorageCapacity: 80%
    • EmailAddress: exampleuser@example.com
    • PercentIncrease: 15%
    • MaxFSxSizeinGiB: 524,288 GiB
  3. When the file system reaches approximately 819 GiB of used storage capacity (80% of 1024 GiB), a CloudWatch alarm is triggered, and an email is sent to exampleuser@example.com informing them that the used storage capacity has reached the 80% utilized threshold.
  4. In addition, the CloudWatch alarm triggers a Lambda function to increase the storage capacity by 15% from 1024 GiB to 1178 GiB and subsequently sends an email to exampleuser@example.com informing them that the storage capacity of the file system has been increased.
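The arithmetic behind this scenario can be checked with a few lines of Python. This is a worked example only: the function name scenario and the choice to round the new capacity up to a whole GiB are assumptions, chosen because rounding up reproduces the 1178 GiB figure above. It also shows where the alarm would next trigger after the expansion, which is useful when judging whether the increase leaves enough headroom for the 6-hour cooldown.

```python
import math


def scenario(capacity_gib, threshold_percent, percent_increase):
    """Work through the alarm threshold and the capacity after one expansion."""
    # Used storage at which the alarm triggers before the expansion.
    alarm_at_gib = capacity_gib * threshold_percent / 100

    # New capacity, rounded up to a whole GiB (rounding is an assumption).
    new_capacity_gib = math.ceil(capacity_gib * (100 + percent_increase) / 100)

    # Used storage at which the alarm triggers again after the expansion.
    next_alarm_at_gib = new_capacity_gib * threshold_percent / 100

    return {
        "alarm_at_gib": alarm_at_gib,
        "new_capacity_gib": new_capacity_gib,
        "next_alarm_at_gib": next_alarm_at_gib,
        # Data that can still be written before the alarm triggers again.
        "growth_before_next_alarm_gib": next_alarm_at_gib - alarm_at_gib,
    }


for name, value in scenario(1024, 80, 15).items():
    print(f"{name}: {value:.1f}")
```

For the values in this scenario, the alarm triggers at 819.2 GiB used, the capacity grows to 1178 GiB, and the alarm triggers again at 942.4 GiB used. If the workload can write more than roughly 123 GiB within a cooldown period, a larger PercentIncrease may be worth considering.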

Automating Amazon FSx for OpenZFS expansion with the AWS CloudFormation template

To automate the deployment of the components that automatically increase the storage capacity of an FSx for OpenZFS file system, we use an AWS CloudFormation template. Once you complete the template, CloudFormation provisions and configures the resources for you, so you don’t have to create and configure them individually or determine resource dependencies. Note that this solution only manages the storage capacity of the FSx for OpenZFS file system provided as an input in the “File system ID” field. The solution works with file systems in both automatic and user-provisioned IOPS modes.

Please ensure that the following requirements are met before deploying the CloudFormation template:

  • You must be logged into the AWS Management Console with a user with sufficient permissions.
  • You must deploy the CloudFormation template in the same Region as the file system for which you would like to automate storage capacity increases.

Launch the automatic storage capacity increase solution stack

Now let’s configure and deploy an AWS CloudFormation stack to automatically increase the storage capacity of an FSx for OpenZFS file system. It takes a few minutes to deploy. For more information about creating a CloudFormation stack, see the documentation on creating a stack on the AWS CloudFormation console.

  1. Download the FSxOpenZFSDynamicStorageScaling AWS CloudFormation template.
  2. In Specify stack details, enter the values for your automatic storage capacity increase solution.
  3. Enter a Stack name.
  4. For Parameters, review the following parameters for the template and modify them for the needs of your file system. Then choose Next.

    File system ID

    Default Value: No default value.

    The ID of the file system whose storage capacity you want to increase automatically.

    Threshold

    Default Value: No default value.

    Specifies the used storage capacity at which to trigger an alarm and automatically increase the file system’s storage capacity, specified as a percentage (%) of the file system’s current storage capacity. The file system is considered to have low free storage capacity when the used storage exceeds this threshold.

    Percent Capacity Increase

    Default Value: 20%.

    Specifies the amount by which to increase the storage capacity, expressed as a percentage of the current storage capacity.

    Note: Do not specify a value lower than 10%.

    Email address

    Default Value: No default value.

    Specifies the email address that is used for the SNS subscription and that receives the storage capacity threshold alerts.

    Maximum supported file system storage capacity (DO NOT MODIFY)

    Default Value: 524,288.

    Specifies the maximum supported storage capacity, in GiB, for the file system.

    Note: Do not change this value.

  5. Enter any Options settings that you want for your custom solution, and then choose Next.
  6. For Review, review and confirm the solution settings. You must select the check box acknowledging that the template creates IAM resources.
  7. Choose Create to deploy the stack.
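If you prefer to script the deployment instead of using the console, the same parameters can be assembled in Python. This is a hedged sketch: the parameter keys are taken from the scenario list earlier in this post (FileSystemId, UsedStorageCapacity, EmailAddress, PercentIncrease, MaxFSxSizeinGiB) and are assumed to match the template, and the helper name build_stack_parameters is hypothetical. Check the keys and value formats against the template you downloaded before using it.

```python
def build_stack_parameters(file_system_id, used_storage_threshold,
                           email_address, percent_increase=20,
                           max_size_gib=524288):
    """Build a CloudFormation Parameters list (keys assumed from this post)."""
    if percent_increase < 10:
        # The template notes warn against values lower than 10%.
        raise ValueError("percent_increase must be at least 10")
    if not 0 < used_storage_threshold < 100:
        raise ValueError("used_storage_threshold must be a percentage")

    values = {
        "FileSystemId": file_system_id,
        "UsedStorageCapacity": used_storage_threshold,
        "EmailAddress": email_address,
        "PercentIncrease": percent_increase,
        "MaxFSxSizeinGiB": max_size_gib,  # do not modify
    }
    # CloudFormation expects every parameter value as a string.
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in values.items()
    ]


params = build_stack_parameters(
    "fs-0123456789abcdef0",  # placeholder file system ID
    80,
    "exampleuser@example.com",
    percent_increase=15,
)
print(params)
```

The resulting list could be passed as the Parameters argument of boto3's CloudFormation create_stack call, together with the template body and a Capabilities value that acknowledges the IAM resources the template creates (the equivalent of the check box in step 6).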

To receive notifications about the actions that are performed in response to the CloudWatch alarm, you must confirm the Amazon SNS topic subscription by following the link provided in the Subscription Confirmation email.

You can view the status of the stack in the AWS CloudFormation console in the Status column. You should see a status of CREATE_COMPLETE in a few minutes.

Updating the stack

After the stack is created, you can update it by using the same template and providing new values for the parameters. For more information, see Updating stacks directly in the AWS CloudFormation User Guide.

Cleaning up

When you no longer need the solution, delete the stack to remove the resources that the template created. The file system itself was provisioned separately and is not part of the stack. For more information, see Deleting a stack on the AWS CloudFormation console in the AWS CloudFormation User Guide.

Conclusion

In this blog post, we walked you through setting up a solution to monitor and automate the expansion of an Amazon FSx for OpenZFS file system using Amazon CloudWatch. We also showed you how to detect when your file system has reached a storage utilization threshold, and how to automatically increase the file system’s storage capacity by a defined percentage using an AWS Lambda function when that happens. Similar concepts can be used to automatically scale FSx for OpenZFS IOPS or throughput. Benefits include automating the elastic expansion of your file system to ensure the performance and reliability of your workload as it grows to accommodate your business.

If you arrived at this post while searching for another member of the FSx family, similar auto-expansion solutions are also available for FSx for Windows File Server and FSx for ONTAP file systems.

Ran Pergamin

Ran Pergamin is a Senior Solutions Architect Specialist for Storage in the EMEA team. He likes helping customers make good decisions when picking the right storage platform for their workload.

Chuma Dyasi

Chuma Dyasi is an Enterprise On-Ramp Technical Account Manager in the EMEA team. He spends his days helping his customers achieve operational excellence while operating in the cloud.