Amazon FSx for Lustre FAQs

General

Amazon FSx for Lustre makes it easy and cost-effective to launch, run, and scale the world’s most popular high-performance file system.

The open source Lustre file system is designed for applications that require fast storage – where you want your storage to keep up with your compute. Lustre was built to solve the problem of quickly and cheaply processing the world’s ever-growing data sets, and it’s the most widely used file system for the 500 fastest computers in the world.

As a fully managed service, Amazon FSx brings Lustre to the masses, allowing you to use it for any workload where storage speed matters. Amazon FSx eliminates the traditional complexity of setting up and managing high-performance Lustre file systems, allowing you to spin up, run, and scale a battle-tested high-performance file system in minutes. It also provides multiple deployment options so you can optimize cost for your needs.

Amazon FSx also integrates with Amazon S3, making it easy for you to process cloud data sets with the Lustre high-performance file system. When linked to an S3 bucket, an FSx for Lustre file system transparently presents S3 objects as files and can automatically update the contents of the linked S3 bucket as files are added to, changed in, or deleted from the file system.

Use Amazon FSx for Lustre for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, financial modeling, genome sequencing, and electronic design automation (EDA).

To use Amazon FSx for Lustre, you must have an AWS account. If you do not have one, sign up on the Sign up for AWS page.

With an AWS account, you can easily create a file system from the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the Amazon FSx API (and various language-specific SDKs). Within minutes, your file system is running and accessible to your compute instances. Learn more about getting started with FSx for Lustre.

Amazon FSx for Lustre provides two deployment options: scratch and persistent.

Scratch file systems are designed for temporary storage and shorter-term processing of data. Data is not replicated and does not persist if a file server fails.

Persistent file systems are designed for longer-term storage and workloads. The file servers are highly available, and data is automatically replicated within the AWS Availability Zone (AZ) that is associated with the file system. The data volumes attached to the file servers are replicated independently from the file servers to which they are attached.

Choose SSD storage for latency-sensitive workloads or workloads requiring the highest levels of IOPS/throughput. Choose HDD storage for throughput-focused workloads that aren’t latency-sensitive. For HDD-based file systems, the optional SSD cache improves performance by automatically placing your most frequently read data on SSD (the cache size is 20% of your file system size).

FSx for Lustre is compatible with the most popular Linux-based AMIs, including Amazon Linux, Amazon Linux 2, Red Hat Enterprise Linux (RHEL), CentOS, SUSE Linux and Ubuntu. FSx for Lustre is also compatible with both x86-based EC2 instances and Arm-based EC2 instances powered by the AWS Graviton2 processor. With FSx for Lustre, you can mix and match the instance types and Linux AMIs that are connected to a single file system.

To access your file system from a Linux instance, you first install the open-source Lustre client on that instance. Once it’s installed, you can mount your file system using standard Linux commands. Once mounted, you can work with the files and directories in your file system just like you would with a local file system.
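As a rough illustration of the mount step, the sketch below assembles a standard Linux `mount` invocation for a Lustre file system. The DNS name, mount name, mount point, and the `relatime,flock` options are placeholder assumptions; take the real values from your file system's details and the FSx for Lustre documentation.

```python
from typing import List


def build_mount_command(dns_name: str, mount_name: str, mount_point: str) -> List[str]:
    """Return an argv list that mounts a Lustre file system at mount_point."""
    # Lustre mount sources have the form <server>@tcp:/<mount name>.
    source = f"{dns_name}@tcp:/{mount_name}"
    # The mount options are an assumption; adjust them for your workload.
    return ["sudo", "mount", "-t", "lustre", "-o", "relatime,flock", source, mount_point]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    cmd = build_mount_command(
        "fs-0123456789abcdef0.fsx.us-east-1.amazonaws.com", "abcdefgh", "/fsx"
    )
    print(" ".join(cmd))
```

The function only builds the command; running it requires the Lustre client to be installed and the mount point directory to exist.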

The Lustre client is included with Amazon Linux 2 and Amazon Linux. For Red Hat Enterprise Linux, CentOS, and Ubuntu, AWS provides a Lustre client repository with clients compatible with these operating systems. See the FSx for Lustre documentation for details.

You can use persistent storage volumes backed by FSx for Lustre using the FSx for Lustre CSI driver from Amazon EKS or your self-managed Kubernetes on AWS. See the Amazon EKS documentation for details.

Amazon FSx is a fully managed service, so all of the file storage infrastructure is managed for you. When you use Amazon FSx, you avoid the complexity of deploying and maintaining complex file system infrastructure.

You can administer a file system via the AWS Management Console, the AWS Command Line Interface (AWS CLI), or the Amazon FSx API (and various language-specific SDKs). The Console, API, and SDK provide the ability to create, scale, and delete file systems; create and edit file system tags; and display detailed information about file systems.

You can set and enforce storage limits based on the number of files or storage capacity consumed by a specific user, group, or project. You can choose to set a hard limit that prevents users, groups, or projects from consuming additional storage after exceeding their quota, or set a soft limit that provides users with a grace period to complete their workloads before converting into a hard limit. To simplify file system administration, you can also monitor user-, group-, and project-level storage usage on FSx for Lustre file systems. To learn more, visit the FSx for Lustre Storage Quotas documentation.
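For illustration, quotas on Lustre are typically set from a client with the `lfs setquota` command. The sketch below builds such a command for a user's block (capacity) limits; the user name, limit values, and mount point are hypothetical, and the limits are assumed to be in kilobytes, the default unit for `lfs setquota`.

```python
from typing import List


def build_user_quota_command(
    user: str, soft_kb: int, hard_kb: int, mount_point: str
) -> List[str]:
    """Return an `lfs setquota` argv list with block soft and hard limits for a user."""
    if soft_kb <= 0 or hard_kb <= 0:
        raise ValueError("limits must be positive in this sketch")
    if soft_kb > hard_kb:
        raise ValueError("soft limit must not exceed hard limit")
    # -b sets the block soft limit, -B the block hard limit.
    return ["lfs", "setquota", "-u", user, "-b", str(soft_kb), "-B", str(hard_kb), mount_point]


if __name__ == "__main__":
    # Hypothetical user and mount point.
    print(" ".join(build_user_quota_command("alice", 307200, 309200, "/fsx")))
```

Group and project quotas follow the same pattern with `-g` or `-p` in place of `-u`.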

You can enable data compression on your file system by clicking Update in the Amazon FSx Console, or by calling “UpdateFileSystem” in the AWS CLI/API and specifying “LZ4” as the Data Compression Type. Once the feature is enabled, all newly-written files will be automatically compressed on FSx for Lustre before they are written to disk and uncompressed when they are read. Since the LZ4 data compression algorithm is lossless, the original data can be fully reconstructed from the compressed data. Files loaded on the file system prior to enabling the data compression feature can also be compressed using the “lfs_migrate” command.
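As a minimal sketch of the API route, the code below builds the `UpdateFileSystem` parameters that set the data compression type and, separately, sends them with boto3. The file system ID is a placeholder, and the helper names are illustrative rather than part of any AWS SDK; boto3 and AWS credentials are assumed to be available when `apply_update` is called.

```python
from typing import Any, Dict


def build_compression_update(file_system_id: str, enabled: bool = True) -> Dict[str, Any]:
    """Build UpdateFileSystem parameters that turn LZ4 compression on or off."""
    return {
        "FileSystemId": file_system_id,
        "LustreConfiguration": {"DataCompressionType": "LZ4" if enabled else "NONE"},
    }


def apply_update(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Requires boto3 and AWS credentials (assumed, not shown)."""
    import boto3  # imported lazily so the builder above works without it

    return boto3.client("fsx").update_file_system(**params)


if __name__ == "__main__":
    # "fs-0123456789abcdef0" is a hypothetical file system ID.
    print(build_compression_update("fs-0123456789abcdef0"))
```

Keeping the parameter builder separate from the network call makes the request easy to inspect before anything is changed.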

You can use CloudWatch Metrics to see the total logical disk usage (without compression) and total physical disk usage (with compression) of your file system. See Amazon FSx for Lustre Data Compression documentation for additional information.

The data compression feature on FSx for Lustre uses the LZ4 compression algorithm. Since the LZ4 compression algorithm is optimized for compression speed, enabling data compression will not adversely impact file system performance.

To provide fast reads and writes from RAM cache, FSx for Lustre file servers are equipped with higher levels of network bandwidth on the front-end network interface cards (NICs) than is available between the file servers and storage disks. Since data compression reduces the amount of data sent between file servers and storage disks, you will see an increase in overall file system throughput capacity when using data compression. Increases in throughput capacity related to data compression will be capped once you saturate the front-end NIC of your file system. See FSx for Lustre documentation for more details on throughput performance when using data compression.
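The capping behavior described above can be approximated with a simple model: compression multiplies the throughput the disks can effectively deliver, up to the front-end NIC limit. This is a simplified illustration, not an AWS formula, and all three inputs are hypothetical figures you would replace with values for your own file system.

```python
def effective_throughput_mbps(
    disk_mbps: float, compression_ratio: float, nic_mbps: float
) -> float:
    """Estimate throughput with compression, capped at the front-end NIC.

    compression_ratio is logical size divided by physical size (>= 1.0).
    """
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be at least 1.0")
    # Less data crosses the server-to-disk link, so the disk side scales with
    # the ratio; the front-end NIC still bounds what clients can see.
    return min(disk_mbps * compression_ratio, nic_mbps)


if __name__ == "__main__":
    # Hypothetical numbers: 200 MB/s of disk throughput, 2:1 compression.
    print(effective_throughput_mbps(200.0, 2.0, 1000.0))
```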

Amazon FSx for Lustre provides native CloudWatch integration, allowing you to monitor file system health and performance metrics in real time. Example metrics include storage consumed, number of compute instance connections, throughput, and number of file operations per second. You can log all Amazon FSx API calls using AWS CloudTrail.

Amazon FSx for Lustre can be an input data source for Amazon SageMaker. When you use FSx for Lustre as an input data source, Amazon SageMaker ML training jobs can be accelerated by eliminating the initial S3 download step. SageMaker jobs are started as soon as the FSx for Lustre file system is linked with the S3 bucket without needing to download the full machine learning training dataset from S3. Data is lazy loaded as needed from Amazon S3 for processing jobs. FSx for Lustre can also help you reduce total cost of ownership (TCO) by avoiding the repeated download of common objects (so you can save S3 request costs) for iterative jobs on the same dataset.

Amazon FSx for Lustre integrates with AWS Batch through EC2 Launch Templates. AWS Batch is a cloud-native batch scheduler for HPC, ML, and other asynchronous workloads. AWS Batch will automatically and dynamically size instances to job resource requirements, and use existing FSx for Lustre file systems when launching instances and running jobs.

AWS ParallelCluster is an AWS-supported open-source cluster management tool that helps you to deploy and manage High Performance Computing (HPC) clusters on AWS. AWS ParallelCluster supports automatic creation of a new Amazon FSx for Lustre file system or the ability to use an existing Amazon FSx for Lustre file system as part of the cluster creation process.

Please refer to Regional Products and Services for details of Amazon FSx for Lustre service availability by region.

If you have high-performance or data processing workloads running on-premises and demand for computing capacity spikes, you can cloud burst your workloads to Amazon FSx for Lustre by using AWS Direct Connect or VPN.

S3 integration

You can link your Amazon FSx for Lustre file system to your Amazon S3 buckets, and FSx for Lustre makes your S3 data transparently accessible in your file system.

You can link your file system to S3 buckets by creating data repository associations using the Amazon FSx console, the AWS CLI, or the Amazon FSx API. The names of objects in your S3 buckets will appear as file and directory listings on the file system. The actual content of a given object is imported automatically from S3 only when you access the associated file on the file system for the first time—meaning an object’s data doesn’t consume space on your file system unless it’s accessed at least once on the file system.

When linked to an S3 bucket, FSx for Lustre can update the file system's contents automatically as objects are added to, changed in, or deleted from your S3 bucket. FSx for Lustre can also automatically update the contents of the linked S3 bucket as files are added to, changed in, or deleted from the file system.

Amazon FSx for Lustre uses parallel data transfer techniques to transfer data to and from S3 at up to hundreds of GB/s.

You can configure FSx for Lustre to keep content synchronized in both directions between the file system and the linked S3 buckets. As you make changes to objects in your S3 bucket, FSx for Lustre automatically detects and imports the changes to your file system. As you make changes to files in your file system, FSx for Lustre automatically exports the changes to your S3 bucket.

For both automatic import and automatic export, you can choose to import or export additions, changes, and deletions of files. You can set these preferences when you link an S3 bucket or at a later time by updating your data repository associations using the AWS CLI, the Amazon FSx API, or the Amazon FSx console. For more information about monitoring both automatic import and export, please see Monitoring with Amazon CloudWatch.
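As one possible sketch of setting these preferences through the API, the code below builds parameters for `CreateDataRepositoryAssociation` with automatic import and export policies. The file system ID, file system path, and bucket path are placeholders, and the helper names are illustrative only; boto3 and AWS credentials are assumed when `create_association` is called.

```python
from typing import Any, Dict, Iterable

VALID_EVENTS = {"NEW", "CHANGED", "DELETED"}


def build_association_request(
    file_system_id: str,
    file_system_path: str,
    s3_path: str,
    import_events: Iterable[str],
    export_events: Iterable[str],
) -> Dict[str, Any]:
    """Build parameters linking a file system path to an S3 bucket or prefix."""
    import_list, export_list = list(import_events), list(export_events)
    for event in import_list + export_list:
        if event not in VALID_EVENTS:
            raise ValueError(f"unsupported event type: {event}")
    return {
        "FileSystemId": file_system_id,
        "FileSystemPath": file_system_path,
        "DataRepositoryPath": s3_path,
        "S3": {
            "AutoImportPolicy": {"Events": import_list},
            "AutoExportPolicy": {"Events": export_list},
        },
    }


def create_association(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Requires boto3 and AWS credentials (assumed, not shown)."""
    import boto3  # imported lazily so the builder works without it

    return boto3.client("fsx").create_data_repository_association(**params)
```

For example, `build_association_request("fs-0123456789abcdef0", "/data", "s3://example-bucket/data", ["NEW", "CHANGED"], ["NEW", "CHANGED", "DELETED"])` would import additions and changes while exporting all three event types.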

Alternatively, instead of using automatic import and automatic export, you also have the option to import and export batches of new and changed data between the file system and S3 for fine-grained control over data synchronization. You can run an Import or Export task using the AWS CLI, API, or console.

You can modify files in either S3 or the file system and each will persist updates in the order they receive them. If you modify the same file in both the S3 bucket and the file system, you should coordinate updates at the application level to prevent conflicts. FSx for Lustre will not prevent conflicting writes in multiple locations.

Learn more about using data repositories.

You can release inactive data from S3-linked file systems by using a data repository task. When you create a data repository task to release file data, you can specify the paths of directories or files you would like to release and optionally specify a minimum amount of time since last access for a file to be eligible for release.
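A release task of this kind could be requested through `CreateDataRepositoryTask`, roughly as sketched below. The file system ID and paths are placeholders, the helper names are illustrative only, and the completion report is disabled purely to keep the example short; check the Amazon FSx API reference for which fields your task requires.

```python
from typing import Any, Dict, List, Optional


def build_release_task(
    file_system_id: str,
    paths: List[str],
    min_days_since_access: Optional[int] = None,
) -> Dict[str, Any]:
    """Build parameters for a task that releases exported file data."""
    params: Dict[str, Any] = {
        "FileSystemId": file_system_id,
        "Type": "RELEASE_DATA_FROM_FILESYSTEM",
        "Paths": list(paths),
        "Report": {"Enabled": False},
    }
    if min_days_since_access is not None:
        if min_days_since_access < 0:
            raise ValueError("min_days_since_access must not be negative")
        # Only files not accessed for at least this long are eligible.
        params["ReleaseConfiguration"] = {
            "DurationSinceLastAccess": {"Unit": "DAYS", "Value": min_days_since_access}
        }
    return params


def start_task(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request. Requires boto3 and AWS credentials (assumed, not shown)."""
    import boto3  # imported lazily so the builder works without it

    return boto3.client("fsx").create_data_repository_task(**params)
```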

Learn more about releasing file system data.

No. Releasing file system data will not ensure your file system will never become full. Releasing file system data optimizes your available storage by enabling you to remove inactive data that has been exported to a linked Amazon S3 bucket.

Scale and performance

Amazon FSx for Lustre file systems scale to TB/s of throughput and millions of IOPS. FSx for Lustre also supports concurrent access to the same file or directory from thousands of compute instances. FSx for Lustre provides consistent, sub-millisecond latencies for file operations.

See Amazon FSx Performance documentation for more details.

FSx for Lustre file systems automatically provision throughput for each TiB of storage provisioned. SSD-based file systems can be provisioned with 125, 250, 500, or 1,000 MB/s of throughput per TiB of storage provisioned. HDD-based file systems can be provisioned with 12 or 40 MB/s of throughput per TiB of storage provisioned.
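Because throughput is provisioned per TiB, the aggregate figure is a simple product. The sketch below uses the tiers listed above; the function name and the example sizes are illustrative only.

```python
SSD_TIERS = (125, 250, 500, 1000)  # MB/s per TiB, from the text above
HDD_TIERS = (12, 40)               # MB/s per TiB, from the text above


def provisioned_throughput_mbps(
    storage_tib: float, per_tib_mbps: int, storage_type: str = "SSD"
) -> float:
    """Return aggregate provisioned throughput in MB/s."""
    if storage_type not in ("SSD", "HDD"):
        raise ValueError("storage_type must be 'SSD' or 'HDD'")
    tiers = SSD_TIERS if storage_type == "SSD" else HDD_TIERS
    if per_tib_mbps not in tiers:
        raise ValueError(f"{per_tib_mbps} MB/s per TiB is not a listed {storage_type} tier")
    return storage_tib * per_tib_mbps


if __name__ == "__main__":
    # A hypothetical 12 TiB SSD file system at 250 MB/s per TiB.
    print(provisioned_throughput_mbps(12, 250))
```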

You can increase the storage capacity of your file system by clicking “Update” in the Amazon FSx Console, or by calling “UpdateFileSystem” in the AWS CLI/API and specifying the desired storage capacity.

You can choose to provision metadata IOPS using “Automatic” mode or “User-Provisioned” mode. In automatic mode, Amazon FSx automatically provisions metadata IOPS based on the storage capacity of your file system. In user-provisioned mode, you can specify the number of metadata IOPS to provision.

Amazon FSx for Lustre stores data and metadata across multiple network file servers, each with its own storage. When you request an update to your file system’s storage capacity, Amazon FSx automatically adds new network file servers and scales your metadata server. While scaling storage capacity, the file system may be unavailable for a few minutes. Client requests sent while the file system is unavailable will transparently retry and eventually succeed after scaling is complete.

After scaling, FSx transparently optimizes your file system by redistributing your data across the old and newly added file servers. This optimization process runs in the background, and can take from a few hours to a few days depending on the amount of data stored on your file system. The background optimization process has minimal impact on your workload performance. You can track the progress of the optimization process at any time using the Amazon FSx Console or AWS CLI/API. See the Managing Storage Capacity documentation for more information.

You should change your file system’s throughput tier if you want to adjust the read/write performance of your file system but do not plan to add data to your file system. You should change your file system’s storage capacity if you want to add data to your file system, or if your file system is at the highest supported throughput tier (e.g. 1000 MB/s) and you want to improve read/write performance.

Amazon FSx for Lustre stores metadata across one or more metadata servers. When you request an update to your file system’s metadata IOPS, Amazon FSx automatically adds a new metadata server for each 12,000 metadata IOPS provisioned. While increasing metadata IOPS, the file system may be unavailable for a few minutes. Client requests sent while the file system is unavailable will transparently retry and eventually succeed after scaling is complete.
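Read literally, one metadata server per 12,000 provisioned metadata IOPS suggests the following back-of-the-envelope estimate. This is an interpretation of the statement above, assuming at least one server for any positive IOPS value, and not a documented sizing formula.

```python
IOPS_PER_METADATA_SERVER = 12_000  # figure taken from the text above


def estimated_metadata_servers(provisioned_iops: int) -> int:
    """Estimate the metadata server count for a provisioned IOPS level."""
    if provisioned_iops <= 0:
        raise ValueError("provisioned_iops must be positive")
    # Ceiling division: any partial block of 12,000 IOPS still needs a server.
    return -(-provisioned_iops // IOPS_PER_METADATA_SERVER)


if __name__ == "__main__":
    for iops in (6_000, 12_000, 24_000):
        print(iops, estimated_metadata_servers(iops))
```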

Amazon FSx changes the throughput tier of your file system by switching out the file servers powering your file system to meet the requested throughput configuration. While scaling throughput on file systems smaller than 100 TiB, the file system will be unavailable for a few minutes. For file systems larger than 100 TiB, the file system may be unavailable for up to an hour. Client requests sent while the file system is unavailable will transparently retry and eventually succeed after scaling is complete.

In some cases, after scaling, FSx optimizes your file system, at which point your file system’s network I/O, CPU, and memory resources correspond with your new throughput level. This optimization process can take from a few hours to a few days. During this time, your new disk I/O performance level is available for write operations, and your read operations have disk I/O performance between the previous level and the new level. You can track the optimization process at any time using the Amazon FSx Console or AWS CLI/API. See the Managing storage and throughput capacity documentation for more information.

You can increase your file system’s storage capacity every six hours, and in the same increments that you can provision new file systems. Note that your previous scaling request, including optimization, must be complete when you issue a new scaling request.

An FSx for Lustre file system can be concurrently accessed by thousands of compute instances.

Scratch and persistent SSD-based file systems can be created in sizes of 1.2 TiB or in increments of 2.4 TiB. Persistent HDD-based file systems with 12 MB/s and 40 MB/s of throughput per TiB can be created in increments of 6.0 TiB and 1.8 TiB, respectively.
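The sizing rules above lend themselves to a quick validity check. The sketch below assumes capacities are expressed in GiB the way the Amazon FSx API does, with 1.2 TiB written as 1200 and 2.4 TiB as 2400; the function name is illustrative only.

```python
from typing import Optional

# Increment in GiB for each HDD throughput tier (MB/s per TiB), per the text above.
HDD_INCREMENTS_GIB = {12: 6000, 40: 1800}


def is_valid_capacity_gib(
    capacity_gib: int,
    storage_type: str = "SSD",
    hdd_mbps_per_tib: Optional[int] = None,
) -> bool:
    """Check a requested storage capacity against the documented increments."""
    if capacity_gib <= 0:
        return False
    if storage_type == "SSD":
        # 1.2 TiB, or any multiple of 2.4 TiB.
        return capacity_gib == 1200 or capacity_gib % 2400 == 0
    if storage_type == "HDD":
        step = HDD_INCREMENTS_GIB.get(hdd_mbps_per_tib)
        if step is None:
            raise ValueError("HDD file systems need a throughput tier of 12 or 40")
        return capacity_gib % step == 0
    raise ValueError("storage_type must be 'SSD' or 'HDD'")


if __name__ == "__main__":
    for size in (1200, 2400, 3600, 4800):
        print(size, is_valid_capacity_gib(size))
```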

There is a 100-file system limit per account, which can be increased upon request.

Security and compliance

Yes. Amazon FSx for Lustre always encrypts your file system data and your backups at-rest using keys you manage through AWS Key Management Service (KMS). Amazon FSx encrypts data-in-transit when accessed from supported EC2 instances. See the Amazon FSx documentation for details on regions where in-transit encryption is supported.

Every FSx for Lustre resource is owned by an AWS account, and permissions to create or access a resource are governed by permissions policies. You specify the Amazon Virtual Private Cloud (VPC) in which your file system is made accessible, and you control which resources within the VPC have access to your file system using VPC Security Groups. You control who can administer your file system and backup resources (create, delete, etc.) using AWS IAM.

Yes, with Amazon FSx, you can create and use file systems in shared Amazon Virtual Private Clouds (VPCs) from both owner accounts and participant accounts with which the VPC has been shared. VPC sharing enables you to reduce the number of VPCs that you need to create and manage, while you still benefit from using separate accounts for billing and access control.

AWS has the longest-running compliance program in the cloud and is committed to helping customers navigate their requirements. Amazon FSx has been assessed to meet global and industry security standards. It complies with PCI DSS, ISO 9001, 27001, 27017, and 27018, and SOC 1, 2, and 3, in addition to being HIPAA eligible. That makes it easier for you to verify our security and meet your own obligations. For more information and resources, visit our compliance pages. You can also go to the Services in Scope by Compliance Program page to see a full list of services and certifications.

Availability and durability

Use scratch file systems when you need cost-optimized storage for short-term, processing-heavy workloads.

Use persistent file systems for workloads that run for extended periods or indefinitely, and may be sensitive to disruptions in availability.

Yes. The Amazon FSx SLA provides for a service credit if a customer's monthly uptime percentage is below our service commitment in any billing cycle. 

Amazon FSx for Lustre provides a parallel file system. In parallel file systems, data is stored across multiple network file servers to maximize performance and reduce bottlenecks, and each server has multiple disks. Larger file systems have more file servers and disks than smaller file systems.

On a persistent file system, if a file server becomes unavailable, it is replaced automatically within minutes. In the meantime, client requests for data on that server transparently retry and eventually succeed after the file server is replaced. With persistent file systems, data is replicated on disks, and any failed disks are automatically and transparently replaced.

On a scratch file system, file servers are not replaced if they fail and data is not replicated. If a file server or a storage disk becomes unavailable, files stored on other servers are still accessible. If clients try to access files that are on the unavailable server, they will get an I/O error. The following table provides the availability/durability for which scratch file systems of example sizes are designed. As larger file systems have more file servers and more disks, the probabilities of failure are increased.

Table: Availability/durability of scratch file systems of various example sizes

Scratch file system size (TiB) | Number of file servers | Availability/durability during one day | Availability/durability during one week
1.2 | 2 | 99.9% | 99.4%
2.4 | 2 | 99.9% | 99.4%
4.8 | 3 | 99.8% | 99.2%
9.6 | 5 | 99.8% | 98.6%
50.4 | 22 | 99.1% | 93.9%

Please refer to the Amazon FSx for Lustre documentation for more information.

Data protection

Amazon FSx takes daily automatic backups of your file systems, and allows you to take additional backups at any point. Amazon FSx backups are incremental, which means that only the changes after your most recent backup are saved, thus saving on backup storage costs by not duplicating data.

Alternatively, you can use data repository associations to keep the data in your FSx for Lustre file system synchronized with S3 buckets or prefixes. FSx will not take automatic backups of the file system if it is linked to S3.

Backups are highly durable and file-system-consistent. To ensure high durability, Amazon FSx stores backups with 99.999999999% (11 9's) of durability on Amazon S3. Backups also present a consistent view of your file system, meaning that if metadata exists for a file in the backup, then the file’s associated data is also included in the backup.

The daily backup window is a 30-minute window that you specify when creating a file system. Amazon FSx takes the daily automatic backup of your file system during this window. At some point during the daily backup window, storage I/O will be briefly suspended while the backup process initializes (typically a few seconds).

The daily backup retention period specified for your file system (7 days by default) determines the number of days your daily automatic backups are kept.
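As an illustrative sketch, the backup window start time and retention period can be changed with `UpdateFileSystem`. The code below builds those parameters; the file system ID is a placeholder, the helper name is illustrative only, and the `HH:MM` UTC format and the 0 to 90 day range are assumptions to confirm against the Amazon FSx API reference.

```python
import re
from typing import Any, Dict


def build_backup_settings(
    file_system_id: str, start_time: str = "05:00", retention_days: int = 7
) -> Dict[str, Any]:
    """Build UpdateFileSystem parameters for the daily automatic backup settings."""
    # Start time is assumed to be a 24-hour HH:MM string in UTC.
    if not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", start_time):
        raise ValueError("start_time must look like HH:MM")
    # The retention range is an assumption; 0 is taken to disable automatic backups.
    if not 0 <= retention_days <= 90:
        raise ValueError("retention_days must be between 0 and 90")
    return {
        "FileSystemId": file_system_id,
        "LustreConfiguration": {
            "DailyAutomaticBackupStartTime": start_time,
            "AutomaticBackupRetentionDays": retention_days,
        },
    }


if __name__ == "__main__":
    # "fs-0123456789abcdef0" is a hypothetical file system ID.
    print(build_backup_settings("fs-0123456789abcdef0", "02:30", 14))
```

The resulting dictionary can be passed to boto3 as `boto3.client("fsx").update_file_system(**params)` once credentials are configured.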

When you delete your file system, all automatic daily backups associated with the file system are deleted. Any user-initiated backups you created will remain.

You can take a backup of any FSx for Lustre file system that has persistent storage and is a standalone file system (i.e., not linked to an Amazon S3 bucket).

Backups aren’t supported on scratch file systems because these file systems are designed for temporary storage and shorter-term processing of data.

Backups aren’t supported on file systems linked to an Amazon S3 bucket because in this case the S3 bucket serves as the primary storage location for your full dataset—the file system does not necessarily contain the full dataset at any given time. With these file systems, FSx for Lustre can automatically export new, changed, or deleted files from the file system to your S3 bucket.

You first enable Amazon FSx as a protected service in AWS Backup. You can then configure backups of your Amazon FSx resources via the AWS Backup console, API, or CLI. You can create both scheduled and on-demand backups of Amazon FSx resources via AWS Backup and restore these backups as new Amazon FSx file systems. Amazon FSx file systems can be added to backup plans in the same way as other AWS resources, either by specifying the ARN or by tagging the Amazon FSx file system for protection in the backup plan. Learn more in the AWS Backup documentation.

You can configure your backup plans on AWS Backup to periodically create and copy backups of your Amazon FSx file systems to other AWS Regions, other AWS accounts, or both, with your desired frequency and retention policy. For cross-account backup copies, you use your AWS Organizations management account to designate source and destination accounts.

Pricing and billing

You pay only for the resources you use. See the Amazon FSx for Lustre pricing page for details.

Storage capacity scaling requests are processed by adding new storage capacity to your file system. You will be billed for new storage capacity once the new file servers have been added to your file system, and the file system status changes from UPDATING to AVAILABLE.

Throughput scaling requests are processed by replacing the file servers powering your file system. You will be billed for your new throughput tier once the file servers have been replaced, and the file system status changes from UPDATING to AVAILABLE. 

Except as otherwise noted, our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax. For customers with a Japanese billing address, use of AWS services is subject to Japanese Consumption Tax. Learn more.