How can I use DataSync to transfer data between two Amazon EFS file systems over a private network?

Last updated: 2021-10-22

I want to use AWS DataSync to transfer data between two Amazon Elastic File System (Amazon EFS) file systems over a private network.

Short description

You can use DataSync to transfer data between two Amazon EFS file systems. The data transfer between file systems falls into one of the following categories:

  • File systems in different accounts in the same Region.
  • File systems in different accounts in different Regions.
  • File systems in the same account and in different Regions.
  • File systems in the same account and in the same Region.

To transfer data using DataSync over a customer managed private network, deploy a DataSync agent that meets the agent requirements.

To activate a DataSync transfer between Amazon EFS file systems for all the preceding use cases, do the following:

  1. Create a cross-account and cross-Region virtual private cloud (VPC) peering connection.
    Note: If your use case involves data transfer between file systems in the same account in the same Region, then skip this step.
  2. Configure the security group rules in the source and destination Amazon EFS file systems.
  3. Create a VPC endpoint for DataSync in the account and Region where the destination Amazon EFS file system exists.
  4. Deploy the DataSync agent in the account and Region where the source file system exists.
  5. Activate the DataSync agent in the account and Region where the destination file system exists.
  6. Create the source location with the type NFS in the account and Region where the source file system exists.
  7. Create the destination location with the type EFS in the account and Region where the destination file system exists.
  8. Create the DataSync task, and then run the task.

Note: If you have an AWS Transit Gateway instead of VPC peering, then you don't need to create the VPC peering connection.
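The two notes about skipping the VPC peering step can be expressed as a small check. The following is a minimal sketch, not part of DataSync: the function name needs_vpc_peering and its parameters are hypothetical, and it assumes that an existing transit gateway already connects the two VPCs.

```python
def needs_vpc_peering(src_account: str, src_region: str,
                      dst_account: str, dst_region: str,
                      has_transit_gateway: bool = False) -> bool:
    """Return True when step 1 (create a VPC peering connection) applies."""
    # Assumption: the transit gateway already routes between the two VPCs.
    if has_transit_gateway:
        return False
    # Per the note on step 1, same account plus same Region skips peering.
    same_account_and_region = (
        src_account == dst_account and src_region == dst_region
    )
    return not same_account_and_region
```

For the example environment in this article (111111111111 in us-east-1 to 222222222222 in us-east-2), the check returns True.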

Alternatively, for data transfer between file systems in the same account, you can use the in-cloud transfer feature, also known as agent-less transfer. This feature supports a fully automated transfer between AWS services. For a high-level architecture of this feature, see Data transfer between AWS storage services. With this approach, you don't need to deploy a DataSync agent because DataSync manages the agent automatically.

To activate a DataSync transfer between two Amazon EFS file systems in the same account either in the same Region or in different Regions:

  • Create the location for the source and destination file systems by selecting the appropriate Regions.
  • Select the appropriate subnet and security group when you create the Amazon EFS locations.
  • Create the DataSync task, and then run the task.

For more information, see Creating a location for Amazon EFS.

Note:

  • Agent-less transfer doesn't support cross-account scenarios currently.
  • With agent-less transfer, data is transferred using the AWS managed network. However, unlike a customer-managed network, the network might not be completely private.
  • The number of files per task for data transfer between AWS storage services is limited to 25 million.
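The first and third limitations above lend themselves to a quick pre-check before you choose agent-less transfer. This is a sketch under stated assumptions: the function name agentless_blockers is hypothetical, and the 25 million figure is the limit quoted in this article, which might change over time.

```python
# Limit quoted in the note above; treat it as an assumption that may change.
MAX_FILES_PER_TASK = 25_000_000


def agentless_blockers(src_account: str, dst_account: str,
                       file_count: int) -> list:
    """List the reasons why agent-less transfer would not fit a transfer."""
    blockers = []
    if src_account != dst_account:
        blockers.append("cross-account transfers are not supported")
    if file_count > MAX_FILES_PER_TASK:
        blockers.append("more than 25 million files in one task")
    return blockers
```

An empty list means that neither limitation applies. The privacy consideration in the second bullet is a policy decision and isn't covered by this check.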

Resolution

The following configuration steps apply to the use case that uses the VPC endpoint and are based on this example environment:

  • The source AWS account is 111111111111.
  • The source AWS Region is US East (N. Virginia) (us-east-1).
  • The source VPC CIDR is 10.10.0.0/16 (one public subnet).
  • The DataSync agent virtual machine (VM) IP address is 10.10.3.124. The DataSync VM is deployed in the account and Region where the source Amazon EFS file system resides.
  • The destination AWS account is 222222222222.
  • The destination Region is US East (Ohio) (us-east-2).
  • The destination VPC CIDR is 10.20.0.0/16.

Important: You must configure your security group rules based on your environment's source and destination VPC CIDRs.
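Before you adapt the example values to your own environment, it can help to sanity-check them. The following sketch uses only the Python standard library. The function name validate_environment is hypothetical. The overlap check reflects the VPC peering requirement that the two VPC CIDRs don't overlap.

```python
import ipaddress


def validate_environment(source_cidr: str, destination_cidr: str,
                         agent_address: str) -> list:
    """Return a list of problems found in the planned network layout."""
    src = ipaddress.ip_network(source_cidr)
    dst = ipaddress.ip_network(destination_cidr)
    agent = ipaddress.ip_address(agent_address)

    problems = []
    # VPC peering can't be created between VPCs with overlapping CIDRs.
    if src.overlaps(dst):
        problems.append("source and destination VPC CIDRs overlap")
    # The agent runs in the source VPC, so its IP must fall inside that CIDR.
    if agent not in src:
        problems.append("agent IP is outside the source VPC CIDR")
    return problems


# Example values from this article.
print(validate_environment("10.10.0.0/16", "10.20.0.0/16", "10.10.3.124"))
```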

Create a cross-account and cross-Region VPC peering connection

Create a VPC peering connection between the VPCs of the source and destination accounts for the Amazon EFS file systems.

Before you proceed to the next step, use the Amazon VPC console to verify the following:

  • View the peering connection. Confirm that the status is Active.
  • View the source VPC. Review the VPC's route table to confirm that there is an active route to a target that begins with pcx. This route is for the peering connection.
  • View the destination VPC. Review the VPC's route table to confirm that there is an active route to a target that begins with pcx.
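The route table checks above can also be scripted. The sketch below assumes that you already retrieved the route tables, for example with the boto3 EC2 client call describe_route_tables filtered by vpc-id, and it inspects the response dictionary. The function name has_active_peering_route is hypothetical.

```python
def has_active_peering_route(route_tables_response: dict) -> bool:
    """Return True if any route targets a peering connection and is active."""
    for table in route_tables_response.get("RouteTables", []):
        for route in table.get("Routes", []):
            target = route.get("VpcPeeringConnectionId", "")
            # Peering connection IDs begin with "pcx".
            if target.startswith("pcx-") and route.get("State") == "active":
                return True
    return False


# Hypothetical response fragment for the source VPC in this article.
sample = {
    "RouteTables": [{
        "Routes": [
            {"DestinationCidrBlock": "10.10.0.0/16",
             "GatewayId": "local", "State": "active"},
            {"DestinationCidrBlock": "10.20.0.0/16",
             "VpcPeeringConnectionId": "pcx-0123456789abcdef0",
             "State": "active"},
        ]
    }]
}
print(has_active_peering_route(sample))
```

Run the same check against the route tables of both the source VPC and the destination VPC.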

Configure the security group rules for the source and destination Amazon EFS file systems

Important: The following example security group rules are based on the example VPC CIDRs. You must configure your security group rules based on your environment's VPC CIDRs.

Configure the following:

  • One security group in the VPC and subnet of the account where the source Amazon EFS file system exists (example: Source_EFS_SG).
  • Two security groups in the VPC and subnet of the account where the destination Amazon EFS file system exists (example: DS_Destination_Location_SG and Destination_EFS_SG).
  • One security group in the VPC and subnet of the account where the destination Amazon EFS file system exists. This security group is associated with the DataSync VPC endpoint (example: DS_VPCE_SG).

Configure the inbound and outbound rules for these four security groups similar to the following:

Source_EFS_SG:

Inbound:

Type | Protocol | Port range | Source | Description
NFS | TCP | 2049 | 10.10.3.124/32 | NFS

Outbound:

Type | Protocol | Port range | Destination | Description
All traffic | All | All | 0.0.0.0/0 | Default

DS_Destination_Location_SG:

Inbound:

Type | Protocol | Port range | Source | Description
All traffic | All | All | DS_Destination_Location_SG (security group ID) | DS_Destination_Location_SG

Outbound:

Type | Protocol | Port range | Destination | Description
All traffic | All | All | 0.0.0.0/0 | Default

Destination_EFS_SG:

Inbound:

Type | Protocol | Port range | Source | Description
NFS | TCP | 2049 | DS_Destination_Location_SG (security group ID) | Destination_EFS_SG
HTTPS | TCP | 443 | 10.10.3.124/32 | Data_Transfer_From_Source

Outbound:

Type | Protocol | Port range | Destination | Description
All traffic | All | All | 0.0.0.0/0 | Default

DS_VPCE_SG:

Inbound:

Type | Protocol | Port range | Source | Description
HTTPS | TCP | 443 | 10.10.3.124/32 | Agent_Activation
Custom TCP | TCP | 1024-1064 | 10.10.3.124/32 | Control_Traffic
SSH | TCP | 22 | 10.10.3.124/32 | AWS_Support_Channel

Outbound:

Type | Protocol | Port range | Destination | Description
All traffic | All | All | 0.0.0.0/0 | Default
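If you prefer to script the inbound rules, they map onto the IpPermissions structure that the boto3 EC2 client accepts in authorize_security_group_ingress. The following is a sketch, not a definitive implementation: the helper names and the security group ID sg-0123456789abcdef0 are hypothetical placeholders, and the rule values are the example values from the tables above. The outbound rules match the default allow-all egress rule, so they aren't repeated here.

```python
AGENT_CIDR = "10.10.3.124/32"            # example agent address from this article
LOCATION_SG_ID = "sg-0123456789abcdef0"  # hypothetical ID of DS_Destination_Location_SG


def tcp_from_cidr(from_port, to_port, cidr, description):
    """One TCP ingress rule that allows traffic from a CIDR range."""
    return {
        "IpProtocol": "tcp",
        "FromPort": from_port,
        "ToPort": to_port,
        "IpRanges": [{"CidrIp": cidr, "Description": description}],
    }


def from_group(protocol, group_id, description, port=None):
    """One ingress rule that allows traffic from another security group."""
    rule = {
        "IpProtocol": protocol,
        "UserIdGroupPairs": [{"GroupId": group_id, "Description": description}],
    }
    if port is not None:
        rule["FromPort"] = port
        rule["ToPort"] = port
    return rule


def build_ingress_rules(agent_cidr=AGENT_CIDR, location_sg_id=LOCATION_SG_ID):
    """Inbound rules for the four example security groups, keyed by name."""
    return {
        "Source_EFS_SG": [tcp_from_cidr(2049, 2049, agent_cidr, "NFS")],
        # "-1" means all protocols and ports (self-referencing rule).
        "DS_Destination_Location_SG": [
            from_group("-1", location_sg_id, "DS_Destination_Location_SG"),
        ],
        "Destination_EFS_SG": [
            from_group("tcp", location_sg_id, "Destination_EFS_SG", port=2049),
            tcp_from_cidr(443, 443, agent_cidr, "Data_Transfer_From_Source"),
        ],
        "DS_VPCE_SG": [
            tcp_from_cidr(443, 443, agent_cidr, "Agent_Activation"),
            tcp_from_cidr(1024, 1064, agent_cidr, "Control_Traffic"),
            tcp_from_cidr(22, 22, agent_cidr, "AWS_Support_Channel"),
        ],
    }
```

Each list can then be passed as the IpPermissions argument, together with the GroupId of the matching security group, using an EC2 client in the account and Region that owns that group.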

Create a VPC endpoint for DataSync in the Region of the destination Amazon EFS file system
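As an illustration of this step, the request for an interface VPC endpoint can be assembled as shown below. This is a sketch under assumptions: the function name and all resource IDs are hypothetical placeholders, and the security group is the one referred to as DS_VPCE_SG in this article. The resulting dictionary matches the keyword arguments of the boto3 EC2 client call create_vpc_endpoint.

```python
def build_datasync_endpoint_request(region: str, vpc_id: str,
                                    subnet_id: str,
                                    security_group_id: str) -> dict:
    """Keyword arguments for an interface VPC endpoint for DataSync."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # DataSync endpoint service name for the given Region.
        "ServiceName": f"com.amazonaws.{region}.datasync",
        "SubnetIds": [subnet_id],
        "SecurityGroupIds": [security_group_id],  # DS_VPCE_SG in this article
    }


# Hypothetical IDs in the destination account and Region (us-east-2).
request = build_datasync_endpoint_request(
    "us-east-2",
    "vpc-0123456789abcdef0",
    "subnet-0123456789abcdef0",
    "sg-0fedcba9876543210",
)
print(request["ServiceName"])
```

With an EC2 client created for the destination account and Region, the call would be ec2.create_vpc_endpoint(**request).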

Create and activate the DataSync agent

Deploy the DataSync agent in the account and Region of the source Amazon EFS file system. Use the AMI ID that's provided in Deploy your agent as an Amazon EC2 instance to access in-cloud file systems. Don't activate the agent yet.

Note: The following steps create the agent using the DataSync console. You can also create a DataSync agent using the AWS Command Line Interface (AWS CLI).

  1. Open the DataSync console in the account and Region of the destination Amazon EFS file system.
  2. In the navigation pane, choose Agents.
  3. Choose Create agent.
  4. For Service endpoint, select VPC endpoints using AWS PrivateLink.
  5. For VPC Endpoint, select the VPC endpoint that you created in the destination Region.
  6. For Subnet, select the subnet that your VPC endpoint is in.
  7. For Security Group, select the security group of the VPC endpoint.
  8. Select Automatically get the activation key from your agent.
  9. For Agent address, enter the IP address of the DataSync agent Amazon EC2 instance.
    You can activate the DataSync agent using either its public IP address or private IP address. If you have only the private IP address, then you must activate the agent from a machine that's in the same subnet as the agent.
  10. Choose Get key.
  11. Activate the agent in the same Region as that of the destination Amazon EFS file system.
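The console steps above correspond to the DataSync CreateAgent API, which takes the activation key, the VPC endpoint ID, and the subnet and security group as ARNs. The following sketch builds those arguments for the boto3 DataSync client call create_agent. The helper names, the agent name, the activation key value, and all resource IDs are hypothetical placeholders.

```python
def ec2_arn(region: str, account_id: str,
            resource_type: str, resource_id: str) -> str:
    """Build an ARN for an EC2 resource such as a subnet or security group."""
    return f"arn:aws:ec2:{region}:{account_id}:{resource_type}/{resource_id}"


def build_create_agent_request(activation_key: str, region: str,
                               account_id: str, vpc_endpoint_id: str,
                               subnet_id: str, security_group_id: str,
                               agent_name: str = "efs-transfer-agent") -> dict:
    """Keyword arguments for activating an agent through a VPC endpoint."""
    return {
        "ActivationKey": activation_key,
        "AgentName": agent_name,
        "VpcEndpointId": vpc_endpoint_id,
        "SubnetArns": [ec2_arn(region, account_id, "subnet", subnet_id)],
        "SecurityGroupArns": [
            ec2_arn(region, account_id, "security-group", security_group_id)
        ],
    }


# Hypothetical values for the destination account and Region in this article.
request = build_create_agent_request(
    "EXAMPLE-ACTIVATION-KEY",
    "us-east-2",
    "222222222222",
    "vpce-0123456789abcdef0",
    "subnet-0123456789abcdef0",
    "sg-0fedcba9876543210",
)
print(request["SubnetArns"][0])
```

With a DataSync client created for the destination account and Region, the call would be datasync.create_agent(**request), and the response contains the AgentArn.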

Create the locations for the source and destination Amazon EFS file systems

Create the source location:

  1. Open the DataSync console in the account and Region of the source Amazon EFS file system.
  2. From the navigation pane, choose Locations.
  3. Choose Create location.
  4. For Location type, select Network File System (NFS).
  5. For Agents, select the DataSync agent that you activated.
  6. For NFS Server, enter the source file system's mount target IP address.
  7. Choose Create location.
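The same source location can be described for the boto3 DataSync client call create_location_nfs. This is a minimal sketch: the function name, the mount target IP address, and the agent ARN are hypothetical placeholders, and the default subdirectory "/" is an assumption that exports the root of the file system.

```python
import ipaddress


def build_nfs_location_request(mount_target_ip: str, agent_arn: str,
                               subdirectory: str = "/") -> dict:
    """Keyword arguments for an NFS location backed by a DataSync agent."""
    # Raises ValueError early if the mount target address is malformed.
    ipaddress.ip_address(mount_target_ip)
    if not subdirectory.startswith("/"):
        raise ValueError("subdirectory must be an absolute path")
    return {
        "ServerHostname": mount_target_ip,
        "Subdirectory": subdirectory,
        "OnPremConfig": {"AgentArns": [agent_arn]},
    }


# Hypothetical mount target IP in the source VPC and hypothetical agent ARN.
request = build_nfs_location_request(
    "10.10.3.50",
    "arn:aws:datasync:us-east-2:222222222222:agent/agent-0123456789abcdef0",
)
print(request["ServerHostname"])
```

Passing the dictionary as datasync.create_location_nfs(**request) returns a LocationArn that the task uses as its source.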

Create the destination location:

  1. Open the DataSync console in the account and Region of the destination Amazon EFS file system.
  2. In the navigation pane, choose Locations.
  3. Choose Create location.
  4. For Location type, select Amazon EFS file system.
  5. For EFS File system, select the destination file system.
  6. For Mount path, enter the mount path of the destination file system.
    Note: Be sure that the path, including any subfolder, exists. DataSync doesn't create the folder structure in the source or destination if the structure doesn't already exist. If the path is missing, the task fails with the error "No such file or directory".
  7. For Subnet, select the subnet where the destination file system exists.
  8. For Security Group, select the security group you previously created for the destination file system (for example DS_Destination_Location_SG).
  9. Choose Create location.

Note: Be sure that the destination file system's mount target has a security group with rules similar to Destination_EFS_SG.
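The destination location steps correspond to the boto3 DataSync client call create_location_efs, which expects the file system, subnet, and security group as ARNs. The sketch below assembles those arguments. The function name and all resource IDs are hypothetical placeholders, and the security group stands for DS_Destination_Location_SG from this article.

```python
def build_efs_location_request(region: str, account_id: str,
                               file_system_id: str, subnet_id: str,
                               security_group_id: str,
                               subdirectory: str = "/") -> dict:
    """Keyword arguments for an Amazon EFS location."""
    ec2_prefix = f"arn:aws:ec2:{region}:{account_id}"
    return {
        "EfsFilesystemArn": (
            f"arn:aws:elasticfilesystem:{region}:{account_id}"
            f":file-system/{file_system_id}"
        ),
        # The mount path must already exist on the file system.
        "Subdirectory": subdirectory,
        "Ec2Config": {
            "SubnetArn": f"{ec2_prefix}:subnet/{subnet_id}",
            "SecurityGroupArns": [
                f"{ec2_prefix}:security-group/{security_group_id}"
            ],
        },
    }


# Hypothetical IDs in the destination account and Region.
request = build_efs_location_request(
    "us-east-2",
    "222222222222",
    "fs-0123456789abcdef0",
    "subnet-0123456789abcdef0",
    "sg-0123456789abcdef0",
)
print(request["EfsFilesystemArn"])
```

Passing the dictionary as datasync.create_location_efs(**request) returns a LocationArn that the task uses as its destination.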

Create the DataSync task, and then run the task

Configure the task settings. After the task status shows as Available, you can run the task. The task then runs through multiple phases. For more information on each phase, see Understanding task execution statuses.
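Running the task and following its phases can be sketched with the boto3 DataSync client calls start_task_execution and describe_task_execution. The function name run_task_and_wait, the polling interval, and the task ARN are hypothetical. The sketch assumes that the task already exists (for example, created with create_task from the two location ARNs) and treats SUCCESS and ERROR as the final statuses.

```python
import time

# Assumption: an execution ends in one of these two statuses.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def run_task_and_wait(datasync_client, task_arn: str,
                      poll_seconds: int = 30, sleep=time.sleep) -> str:
    """Start a task execution and poll until it reaches a final status."""
    execution_arn = datasync_client.start_task_execution(
        TaskArn=task_arn
    )["TaskExecutionArn"]
    while True:
        status = datasync_client.describe_task_execution(
            TaskExecutionArn=execution_arn
        )["Status"]
        # Intermediate phases include LAUNCHING, PREPARING,
        # TRANSFERRING, and VERIFYING.
        print(f"Task execution status: {status}")
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
```

A typical call passes a client created with boto3.client("datasync") for the Region where the task was created. The sleep parameter exists so that the polling delay can be replaced in tests.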

