
Guidance for Automating Data Repository Task Schedules with Amazon FSx for Lustre

Overview

This Guidance demonstrates how to automate data synchronization between Amazon FSx for Lustre and Amazon Simple Storage Service (Amazon S3) data repositories without auto-import or auto-export functionality. It offers a streamlined approach for organizations managing large-scale data operations by scheduling tasks based on business requirements. The Guidance leverages serverless architecture to deliver cost-effective, scheduled data exports while minimizing operational overhead. Using native AWS services, it enables automated resource management, comprehensive monitoring, and real-time notifications to ensure data consistency and timely issue resolution across your global operations.
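As a rough illustration of the scheduled export described above, the following Python sketch shows how a Lambda function might request an export data repository task through the boto3 FSx client's `create_data_repository_task` call. The environment variable names (`FILE_SYSTEM_ID`, `EXPORT_PATHS`), the helper `build_export_request`, and the optional `client` parameter are assumptions made for this example; the sample code on GitHub may be structured differently.

```python
import os
import uuid


def build_export_request(file_system_id, paths, token=None):
    """Build keyword arguments for the FSx create_data_repository_task call."""
    if not file_system_id:
        raise ValueError("file_system_id is required")
    request = {
        "FileSystemId": file_system_id,
        "Type": "EXPORT_TO_REPOSITORY",
        # The API requires a Report setting; completion reports are disabled here.
        "Report": {"Enabled": False},
        # Idempotency token so a retried invocation does not create a second task.
        "ClientRequestToken": token or str(uuid.uuid4()),
    }
    if paths:
        # Omitting Paths leaves the service default (the whole file system).
        request["Paths"] = list(paths)
    return request


def lambda_handler(event, context, client=None):
    """Hypothetical handler, invoked on a schedule, that starts one export task."""
    if client is None:
        import boto3  # imported lazily so the helper above is testable offline

        client = boto3.client("fsx")
    # Hypothetical configuration: comma-separated paths relative to the mount.
    raw_paths = os.environ.get("EXPORT_PATHS", "")
    paths = [p for p in raw_paths.split(",") if p]
    request = build_export_request(os.environ["FILE_SYSTEM_ID"], paths)
    response = client.create_data_repository_task(**request)
    task = response["DataRepositoryTask"]
    return {"taskId": task["TaskId"], "lifecycle": task["Lifecycle"]}
```

Keeping the request construction in a separate function is a design choice for this sketch: it lets you check the parameters without calling AWS, and the handler accepts a stand-in client for the same reason.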

How it works

The architecture diagram for this Guidance illustrates how to use the solution effectively. It shows the key components and their interactions, giving a step-by-step overview of the architecture's structure and functionality.

Deploy with confidence

Ready to deploy? Review the sample code on GitHub for detailed deployment instructions, then deploy it as-is or customize it to fit your needs.


Well-Architected Pillars

The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.

Operational Excellence

Through serverless computing, Lambda removes infrastructure management overhead and provides automatic scaling and fault tolerance. X-Ray offers complete request tracing, helping you identify bottlenecks and troubleshoot issues. CloudWatch provides monitoring, alerting, and logging so that you can observe the health and performance of the solution.
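To make the monitoring and alerting idea concrete, here is a minimal Python sketch that inspects a data repository task description and decides whether it warrants an alert. The field names (`Lifecycle`, `Status`, `FailedCount`, `FailureDetails`) follow the response shape of the boto3 FSx `describe_data_repository_tasks` call; the function names and the alerting rule itself are assumptions for illustration, not part of this Guidance.

```python
# Lifecycle values treated as alert-worthy in this sketch (an assumed policy).
ALERT_LIFECYCLES = {"FAILED", "CANCELED"}


def summarize_task(task):
    """Return (needs_alert, message) for one data repository task description."""
    lifecycle = task.get("Lifecycle", "UNKNOWN")
    status = task.get("Status", {})
    succeeded = status.get("SucceededCount", 0)
    failed = status.get("FailedCount", 0)
    reason = task.get("FailureDetails", {}).get("Message", "")

    # Alert on a failed or canceled task, or when any file failed to process.
    needs_alert = lifecycle in ALERT_LIFECYCLES or failed > 0
    task_id = task.get("TaskId", "?")
    message = f"Task {task_id} is {lifecycle}: {succeeded} succeeded, {failed} failed"
    if reason:
        message += f" ({reason})"
    return needs_alert, message


def check_task(client, task_id):
    """Fetch one task with an FSx client and summarize it."""
    response = client.describe_data_repository_tasks(TaskIds=[task_id])
    return summarize_task(response["DataRepositoryTasks"][0])
```

A caller could log the message so that it lands in CloudWatch Logs and publish it to a notification channel only when `needs_alert` is true.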

Read the Operational Excellence whitepaper 

Security

IAM enables you to create and manage fine-grained access controls that protect AWS resources. CloudWatch provides real-time monitoring and alerts for potential security issues, and X-Ray lets you trace application behavior to identify and resolve security concerns. Combined into a comprehensive framework, these services enforce the principle of least privilege, enable continuous monitoring, maintain audit trails, and secure communication between AWS services.
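As one way to picture least privilege for this workload, the sketch below builds an IAM policy document that lets a function create data repository tasks on a single file system and read task status. The function name, statement IDs, the `aws` partition, and the resource scoping are assumptions for illustration; check the actions and resource types against the Service Authorization Reference before relying on them.

```python
import json


def build_task_policy(region, account_id, file_system_id):
    """Build an illustrative least-privilege policy for scheduling export tasks."""
    fs_arn = f"arn:aws:fsx:{region}:{account_id}:file-system/{file_system_id}"
    # Task IDs are not known in advance, so tasks are matched with a wildcard.
    task_arn = f"arn:aws:fsx:{region}:{account_id}:task/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CreateExportTasks",
                "Effect": "Allow",
                "Action": ["fsx:CreateDataRepositoryTask"],
                "Resource": [fs_arn, task_arn],
            },
            {
                "Sid": "ReadTaskStatus",
                "Effect": "Allow",
                "Action": ["fsx:DescribeDataRepositoryTasks"],
                # Assumed scoping; narrow this if your account setup allows it.
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, not values from this Guidance.
    policy = build_task_policy("us-east-1", "111122223333", "fs-0123456789abcdef0")
    print(json.dumps(policy, indent=2))
```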

Read the Security whitepaper 

Reliability

On persistent file systems, FSx for Lustre replicates data within the file system's Availability Zone, so data persists even if a file server fails. Amazon S3 stores data redundantly across multiple Availability Zones within an AWS Region and is designed for 99.999999999% (11 nines) of data durability. Lambda runs application code serverlessly, removing the need for you to manage server infrastructure. Together, these services maintain data consistency and reliable operations while reducing the risk of data loss and system failures.

Read the Reliability whitepaper 

Performance Efficiency

FSx for Lustre provides high-throughput, low-latency file system access. Lambda offers scalable and rapid task execution without the need for infrastructure management. Amazon S3 delivers scalable and consistent object storage performance for data repository tasks (DRTs), enabling efficient data movement and minimizing operational overhead. Working together, these services create a performance-optimized architecture that enables automated bulk operations between storage systems through scheduled tasks, all while maintaining efficiency at scale.

Read the Performance Efficiency whitepaper 

Cost Optimization

CloudWatch removes the need for you to provision and pay for custom monitoring tools. Lambda is a serverless service that scales to match demand, removing the cost of maintaining always-on servers for task execution. Additionally, FSx for Lustre scales file system resources based on actual workload requirements rather than peak capacity, so you only pay for the storage you use.

Read the Cost Optimization whitepaper 

Sustainability

Lambda dynamically scales with demand without the need for you to provision infrastructure, enabling you to avoid overprovisioning. With no infrastructure for you to manage, Amazon S3 helps minimize idle storage resources. Additionally, by invoking events only when needed, CloudWatch reduces the energy consumed by automated tasks. Working together, the services in this Guidance optimize resource use through a serverless architecture and efficient storage management, which leads to more sustainable operations.

Read the Sustainability whitepaper 

Disclaimer

The sample code; software libraries; command line tools; proofs of concept; templates; or other related technology (including any of the foregoing that are provided by our personnel) is provided to you as AWS Content under the AWS Customer Agreement, or the relevant written agreement between you and AWS (whichever applies). You should not use this AWS Content in your production accounts, or on production or other critical data. You are responsible for testing, securing, and optimizing the AWS Content, such as sample code, as appropriate for production grade use based on your specific quality control practices and standards. Deploying AWS Content may incur AWS charges for creating or using AWS chargeable resources, such as running Amazon EC2 instances or using Amazon S3 storage.