Create a cross-platform distributed file system with Amazon FSx for NetApp ONTAP
Given the need to control costs in the face of exponential data growth, doing more with existing on-premises resources while minimizing their growth has become ever more important. Most organizations would like to enjoy the benefits of the cloud while leveraging their existing on-premises file assets to create a highly resilient hybrid enterprise file share. This brings the challenging requirement of making different storage systems at the edge, in the data center, and in AWS work together seamlessly, even in the face of disaster events.
In this post, I explore how Peer Software’s Global File Service (“PeerGFS”) allows customers to access files from the edge, data centers, and AWS through the use of cross-platform file replication, synchronization, and caching technologies. I also explore how to prevent version conflicts across active-active storage systems through integrated distributed file locking. I use Amazon FSx for NetApp ONTAP as the repository of record in AWS, and Windows SMB, NetApp, Nutanix, or Dell storage for the on-premises edge and data center storage. The use cases covered are:
- File caching from FSx for ONTAP to on-premises edge Windows file storage
- Distributed file system spanning on-premises edge and data center storage and FSx for ONTAP
- Continuous data protection and high availability from on-premises storage to FSx for ONTAP
- Migration from on-premises storage to FSx for ONTAP
This solution allows users to access files quickly in edge or data center locations while leveraging the full benefits of cloud-native file storage, including advanced data protection, high availability, and disaster recovery options.
Enabling hybrid cloud SMB file caching from FSx for ONTAP to Windows file servers
FSx for ONTAP is a storage service that lets you launch and run fully managed NetApp ONTAP file systems in the AWS Cloud at petabyte scale. It provides the familiar features, performance, capabilities, and APIs of NetApp file systems, along with the agility, scalability, and simplicity of a fully managed AWS service.
PeerGFS optimizes on-premises access to FSx for ONTAP SMB file shares by creating a local cache on a standard Windows file server, so users can access FSx for ONTAP files with lower latency. This enables faster performance and reduces data transfer traffic. File system operations, such as reading and writing files, are all performed against the local cache, while PeerGFS synchronizes changed data to FSx for ONTAP via asynchronous, near real-time replication. Simultaneous editing of the same file in multiple locations is prevented by integrated distributed file locking. A single global namespace is provided by incorporating and controlling Microsoft DFS Namespaces. This also enables automatic failover and failback across sites in the event of a site outage.
With these capabilities, you can consolidate all your on-premises file share data in FSx for ONTAP and benefit from the protected, resilient, fully managed FSx for ONTAP file system.
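To make the locking behavior described above concrete, here is a minimal Python sketch of a lock table that refuses a second site while a file is open elsewhere. It is a toy in-memory model, not PeerGFS code: the `LockManager` class, the site names, and the file path are all hypothetical, and a real distributed lock also has to handle network failures and timeouts.

```python
from dataclasses import dataclass, field


@dataclass
class LockManager:
    """Toy in-memory lock table mapping a file path to the site holding its lock."""

    holders: dict = field(default_factory=dict)

    def acquire(self, path: str, site: str) -> bool:
        # Grant the lock if the file is free or already held by the same site.
        holder = self.holders.setdefault(path, site)
        return holder == site

    def release(self, path: str, site: str) -> None:
        # Only the current holder may release the lock.
        if self.holders.get(path) == site:
            del self.holders[path]


# Hypothetical sites and path, for illustration only.
locks = LockManager()
first = locks.acquire("/share/report.docx", "edge-london")     # lock granted
second = locks.acquire("/share/report.docx", "edge-new-york")  # refused while held
locks.release("/share/report.docx", "edge-london")
third = locks.acquire("/share/report.docx", "edge-new-york")   # granted after release
print(first, second, third)
```

The second site is turned away until the first releases the file, which is the property that prevents two locations from editing the same file at once.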
Figure 1: A typical configuration of PeerGFS file caching from FSx for ONTAP to two on-premises Windows file servers
The deployment of this configuration can be accomplished in the following six steps:
- Deploy Amazon FSx for NetApp ONTAP.
- Deploy PeerGFS through the AWS Marketplace.
- Deploy Peer Agent for FSx for ONTAP.
- Deploy one or more edge servers (in AWS and/or on premises).
- Install Peer Agent.
- Create a file collaboration relationship with edge caching.
Step 1: Deploy FSx for ONTAP
PeerGFS requires the ability to authenticate against a Microsoft Active Directory (AD) environment to communicate with FSx for ONTAP. AWS Directory Service can host a managed Microsoft AD directory. Deployment details can be found here. Alternatively, you can stretch an on-premises Active Directory environment into AWS or host Active Directory yourself on a Windows VM in AWS.
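If you prefer to script this step, the following Python sketch shows how the request parameters for the boto3 FSx client might be assembled. It is a sketch under stated assumptions rather than a tested deployment: the subnet IDs, domain name, service account, DNS address, and sizing values are placeholders, and the storage virtual machine (SVM) is assumed to join the domain with service account credentials. Only the two builder functions run when the block is executed, and `deploy_file_system` is the only part that needs boto3 and AWS credentials.

```python
def build_file_system_request(subnet_ids, preferred_subnet_id,
                              storage_gib=1024, throughput_mbps=128):
    """Build keyword arguments for the boto3 FSx client's create_file_system call."""
    return {
        "FileSystemType": "ONTAP",
        "StorageCapacity": storage_gib,
        "SubnetIds": list(subnet_ids),
        "OntapConfiguration": {
            "DeploymentType": "MULTI_AZ_1",  # assumption: a Multi-AZ deployment
            "ThroughputCapacity": throughput_mbps,
            "PreferredSubnetId": preferred_subnet_id,
        },
    }


def build_svm_request(file_system_id, svm_name, netbios_name,
                      domain, user, password, dns_ips):
    """Build keyword arguments for create_storage_virtual_machine with an AD join."""
    return {
        "FileSystemId": file_system_id,
        "Name": svm_name,
        "ActiveDirectoryConfiguration": {
            "NetBiosName": netbios_name,
            "SelfManagedActiveDirectoryConfiguration": {
                "DomainName": domain,
                "UserName": user,
                "Password": password,
                "DnsIps": list(dns_ips),
            },
        },
    }


def deploy_file_system(region, request):
    """Create the file system and return its ID (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builders above run without AWS access

    fsx = boto3.client("fsx", region_name=region)
    response = fsx.create_file_system(**request)
    # The SVM can only be created once the file system has become available.
    return response["FileSystem"]["FileSystemId"]


# Placeholder values for illustration; replace them with your own.
fs_request = build_file_system_request(
    ["subnet-aaaa1111", "subnet-bbbb2222"], "subnet-aaaa1111")
svm_request = build_svm_request(
    "fs-EXAMPLE", "svm1", "SVM1", "corp.example.com",
    "svc-fsx", "example-password", ["10.0.0.10"])
print(fs_request["FileSystemType"], svm_request["Name"])
```

In practice the password would come from a secrets store rather than source code.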
Step 2: Deploy PeerGFS through the AWS Marketplace
The primary management component of PeerGFS is the Peer Management Center (PMC). The PMC is available as an Amazon Machine Image (AMI) for Amazon Elastic Compute Cloud (Amazon EC2). It can be deployed in your AWS account using this marketplace listing.
To deploy the PMC:
- Log in to your AWS account and navigate here.
- Select Continue to Subscribe.
- Review the Peer Software EULA and select Accept.
- Once the AWS Marketplace has enabled PeerGFS in your account, select Continue to Configuration to set up the PMC.
- For additional detailed configuration steps, follow the instructions in the full documentation.
Step 3: Deploy Peer Agent for FSx for ONTAP
Once FSx for ONTAP and the PMC are both deployed and have joined the domain, the Peer Agent can be deployed within AWS. The Peer Agent requires a Windows-based virtual machine (VM) in AWS to facilitate SMB replication between FSx for ONTAP and on-premises storage. This is accomplished through integration with NetApp’s FPolicy API for real-time file event monitoring.
To deploy a Windows VM in AWS with the Peer Agent, follow the instructions in the full documentation.
Step 4: Deploy one or more edge servers (in AWS and/or on premises)
For each edge location, deploy a system running Windows Server 2016 or newer to host the Peer Agent software. Each of these servers represents an edge in the file services fabric.
If you would like additional edge file servers in AWS, follow steps like those in Step 3.
Step 5: Install Peer Agent
Download the Peer Agent software. The link to the Agent installer can be found in the email containing your license file for PeerGFS.
Run the Peer Agent installer. Review and accept the license agreement and keep the default destination directory.
Step 6: Create a file collaboration relationship with edge caching
Support for edge caching is provided by PeerGFS’ Dynamic Storage Utilization (DSU) feature. For details on how to create a collaboration relationship, see section 6 of this Peer Knowledge Base article or this detailed article on DSU. When setting up, you can define the policies as shown here:
The Peer Agent tied to FSx for ONTAP should be defined as a master participant during the configuration process. This is because it hosts the complete data set (repository of record) for the environment.
Each Agent server at an edge location should be defined as an edge participant during the configuration process. These edge participants only receive a subset of the complete data set as a local file cache.
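The relationship between the master participant and an edge participant can be pictured with a small Python model. This is an illustrative sketch only: the `EdgeCache` class is hypothetical, and evicting the least recently used file is an assumption made for the example, not a description of how DSU chooses what to keep locally.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge participant that caches a bounded subset of the master's files."""

    def __init__(self, master: dict, capacity: int):
        self.master = master          # complete data set (repository of record)
        self.capacity = capacity      # maximum number of files kept at the edge
        self.cached = OrderedDict()   # ordered from least to most recently used

    def read(self, path: str) -> bytes:
        if path in self.cached:
            self.cached.move_to_end(path)    # mark as most recently used
            return self.cached[path]
        content = self.master[path]          # fetch from the master participant
        self.cached[path] = content
        if len(self.cached) > self.capacity:
            self.cached.popitem(last=False)  # evict the least recently used file
        return content


# Hypothetical file names and contents.
master = {"a.txt": b"A", "b.txt": b"B", "c.txt": b"C"}
edge = EdgeCache(master, capacity=2)
for name in ["a.txt", "b.txt", "a.txt", "c.txt"]:
    edge.read(name)
print(list(edge.cached))
```

After these reads the edge holds only `a.txt` and `c.txt`, while the master still holds all three files.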
The combination of FSx for ONTAP and PeerGFS makes file caching and file collaboration across hybrid cloud and cross-platform environments simple, highly available, reliable, and transparent to the user.
To avoid incurring unwanted AWS costs after performing these steps, delete any AWS resources you created, such as the PMC and Peer Agent instances, FSx for ONTAP resources, and Active Directory resources.
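A cleanup script might order the deletions as in the Python sketch below. It only builds a plan and calls nothing in AWS. The resource IDs are placeholders, and the ordering (instances, then volumes, then storage virtual machines, then the file system) is an assumption that dependent resources must be removed before the file system when working through the API. Active Directory resources are left out and would need their own step.

```python
def build_cleanup_plan(instance_ids, volume_ids, svm_ids, file_system_id):
    """Return (service, operation, kwargs) tuples in an assumed dependency order."""
    plan = []
    if instance_ids:
        # PMC, Peer Agent, and edge server instances.
        plan.append(("ec2", "terminate_instances",
                     {"InstanceIds": list(instance_ids)}))
    for volume_id in volume_ids:
        # A final backup may be retained, and billed, unless explicitly skipped.
        plan.append(("fsx", "delete_volume", {"VolumeId": volume_id}))
    for svm_id in svm_ids:
        plan.append(("fsx", "delete_storage_virtual_machine",
                     {"StorageVirtualMachineId": svm_id}))
    plan.append(("fsx", "delete_file_system", {"FileSystemId": file_system_id}))
    return plan


# Placeholder IDs for illustration only.
plan = build_cleanup_plan(
    ["i-EXAMPLE1", "i-EXAMPLE2"], ["fsvol-EXAMPLE"], ["svm-EXAMPLE"], "fs-EXAMPLE")
for service, operation, kwargs in plan:
    print(service, operation, kwargs)
```

Each tuple names a boto3 client and one of its operations. Because these deletions are asynchronous, a real script would wait for each one to finish before issuing the next.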
In this post, I showed how to create a cloud-based, globally distributed file system that can leverage almost any on-premises file system platform to cache data locally, making it possible to access files from your data center, edge locations, and AWS. I also covered preventing version conflicts across disparate active-active storage systems through integrated distributed file locking.
With this solution, you can use almost any SMB-capable NAS device to enable file caching, distributed file services, continuous data protection, and high availability, all backed by FSx for ONTAP. In addition, the same configuration can be used to migrate files from almost any on-premises SMB source to FSx for ONTAP.
Amazon FSx for NetApp ONTAP is available in most AWS Regions, and PeerGFS is available in the AWS Marketplace. If you have any comments or questions, don’t hesitate to leave them in the comments section.