AWS Storage Blog

Migrating network file shares to Amazon WorkDocs using AWS DataSync

Today, many AWS customers use Amazon WorkDocs to retire expensive network file shares and move content to the cloud. With WorkDocs’s pay-as-you-go pricing, customers only pay for the active user accounts on their WorkDocs site. WorkDocs not only provides secure cloud storage, but also allows users to easily share content with other internal and external users. Additionally, Amazon WorkDocs Drive enables users to launch content directly from Windows File Explorer, Mac Finder, or Amazon WorkSpaces without consuming local disk space. This minimizes user friction and shortens the learning curve for adopting cloud-based file shares.

Customers moving from on-premises Network File System (NFS) or Server Message Block (SMB) file shares to WorkDocs commonly use AWS DataSync and Amazon S3 to enable this migration. DataSync is an online data transfer service that is designed to quickly and efficiently move large amounts of data from NFS or SMB servers to AWS Cloud storage services. Once the data is in Amazon S3, the Amazon WorkDocs Migration Service migrates the content from the S3 bucket to a designated WorkDocs site for users.

This blog post walks you through the recommended steps for migrating from on-premises NFS or SMB file shares to WorkDocs. The process consists of the following steps:

Step 1: Transferring data from on-premises to Amazon S3

  1. Deploy and activate a DataSync agent
  2. Configure an NFS or SMB source location
  3. Configure a destination location for Amazon S3
  4. Create and configure task settings
  5. Start the transfer

Step 2: Transferring data from Amazon S3 to Amazon WorkDocs

  1. Preparing for migration
  2. Scheduling a migration
  3. Tracking a migration

Step 1: Transferring data from on-premises to Amazon S3

Use the DataSync Management Console to choose the AWS Region where you want to run DataSync. The AWS Region should be the one where you plan to locate your Amazon S3 bucket.

Step 1.1: Deploy and activate a DataSync agent

To access your on-premises storage, first deploy and activate a DataSync agent. The activation process associates your agent with your AWS account.

After you have deployed an agent, choose a service endpoint. If you use public endpoints, all communication from your DataSync agent to AWS occurs over the public internet. If you use a VPC endpoint, all communication from DataSync to other AWS services occurs through the VPC endpoint in your VPC in AWS. Note that all data transferred between the source and destination is encrypted via Transport Layer Security (TLS).

To activate your agent, first get the IP address of your agent and use it to get an activation key. The activation key securely associates the agent with your AWS account. The activation process requires the agent’s port 80 to be accessible from your browser.
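If you prefer to script this step, the following is a minimal sketch of registering an agent once you have its activation key. It assumes the boto3 SDK is installed and credentials are configured; the helper names `build_create_agent_request` and `register_agent`, and the agent name you pass in, are illustrative and not part of the original walkthrough.

```python
def build_create_agent_request(activation_key: str, agent_name: str) -> dict:
    """Return keyword arguments for the DataSync CreateAgent API call."""
    if not activation_key:
        raise ValueError("activation_key is required")
    return {"ActivationKey": activation_key, "AgentName": agent_name}


def register_agent(activation_key: str, agent_name: str, region: str) -> str:
    """Associate an activated agent with your AWS account and return its ARN."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("datasync", region_name=region)
    response = client.create_agent(
        **build_create_agent_request(activation_key, agent_name)
    )
    return response["AgentArn"]
```

Use the same Region here that you chose for your Amazon S3 bucket.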

Step 1.2: Configure an NFS or SMB source location

A task consists of a pair of locations that data is transferred between. The source location defines the storage system or service that you want to copy data from. By navigating to the Locations page on the AWS DataSync console, you can create an NFS or SMB source location.
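As an alternative to the console, a source location can be sketched with boto3 as follows. The helper names, hostnames, and paths are hypothetical placeholders; only the `create_location_nfs` and `create_location_smb` calls and their parameters come from the DataSync API.

```python
from typing import List, Optional


def build_nfs_location_request(
    server_hostname: str, subdirectory: str, agent_arns: List[str]
) -> dict:
    """Return keyword arguments for the DataSync CreateLocationNfs call."""
    return {
        "ServerHostname": server_hostname,
        "Subdirectory": subdirectory,  # exported path on the NFS server
        "OnPremConfig": {"AgentArns": list(agent_arns)},
    }


def build_smb_location_request(
    server_hostname: str,
    subdirectory: str,
    user: str,
    password: str,
    agent_arns: List[str],
    domain: Optional[str] = None,
) -> dict:
    """Return keyword arguments for the DataSync CreateLocationSmb call."""
    request = {
        "ServerHostname": server_hostname,
        "Subdirectory": subdirectory,  # share path on the SMB server
        "User": user,
        "Password": password,
        "AgentArns": list(agent_arns),
    }
    if domain:
        request["Domain"] = domain
    return request


def create_source_location(protocol: str, request: dict, region: str) -> str:
    """Create the source location and return its ARN."""
    import boto3  # imported lazily so the builders above have no dependencies

    client = boto3.client("datasync", region_name=region)
    if protocol == "nfs":
        response = client.create_location_nfs(**request)
    elif protocol == "smb":
        response = client.create_location_smb(**request)
    else:
        raise ValueError(f"unsupported protocol: {protocol}")
    return response["LocationArn"]
```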

Step 1.3: Configure a destination location for Amazon S3

The destination location defines the storage system or service that you want to write data to, in this case Amazon S3. By navigating to the Locations page on the DataSync console, you can create an Amazon S3 destination location.
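The same step can be sketched in code. This example assumes you have already created an IAM role that lets DataSync write to the bucket; the helper names and the prefix argument are illustrative, and the bucket ARN is built for the standard `aws` partition.

```python
def build_s3_location_request(
    bucket_name: str, access_role_arn: str, prefix: str = ""
) -> dict:
    """Return keyword arguments for the DataSync CreateLocationS3 call."""
    request = {
        # Assumes the standard "aws" partition; adjust for other partitions.
        "S3BucketArn": f"arn:aws:s3:::{bucket_name}",
        "S3Config": {"BucketAccessRoleArn": access_role_arn},
    }
    if prefix:
        request["Subdirectory"] = prefix  # optional folder inside the bucket
    return request


def create_s3_location(request: dict, region: str) -> str:
    """Create the Amazon S3 destination location and return its ARN."""
    import boto3  # imported lazily so the builder above has no dependencies

    client = boto3.client("datasync", region_name=region)
    return client.create_location_s3(**request)["LocationArn"]
```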

Step 1.4: Create and configure task settings

A task is a set of two locations (source and destination) and a set of options that you use to control the behavior of the transfer. Now that you have an agent and have configured source and destination locations, create a task by specifying those locations, and then configure its settings. Depending on whether you’re migrating your entire environment at once or planning to regularly transfer changed data until the migration cutoff, you might want to schedule regular executions of your task. If you schedule regular executions, set each execution to verify only transferred files. To make sure DataSync can send logs to Amazon CloudWatch, which enables you to monitor task execution, specify a Log Group and attach a resource policy with the required permissions.
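The task settings described here can be sketched with boto3 as shown below. The helper names and the policy name are hypothetical, and the schedule expression is whatever cron or rate expression fits your cutover plan. The resource policy grants the DataSync service principal permission to create log streams and put log events; treat the resource scope as an assumption to tighten for your account.

```python
import json
from typing import Optional


def build_task_request(
    source_arn: str,
    destination_arn: str,
    log_group_arn: str,
    name: str,
    schedule_expression: Optional[str] = None,
) -> dict:
    """Return keyword arguments for the DataSync CreateTask call."""
    request = {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        "CloudWatchLogGroupArn": log_group_arn,
        "Name": name,
        "Options": {
            "VerifyMode": "ONLY_FILES_TRANSFERRED",  # verify only transferred files
            "LogLevel": "TRANSFER",  # log individual files as they transfer
        },
    }
    if schedule_expression:
        request["Schedule"] = {"ScheduleExpression": schedule_expression}
    return request


def build_log_resource_policy(region: str, account_id: str) -> str:
    """Return a CloudWatch Logs resource policy document for DataSync."""
    return json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "DataSyncLogsToCloudWatchLogs",
                    "Effect": "Allow",
                    "Principal": {"Service": "datasync.amazonaws.com"},
                    "Action": ["logs:PutLogEvents", "logs:CreateLogStream"],
                    # Assumption: all log groups in the account and Region.
                    "Resource": f"arn:aws:logs:{region}:{account_id}:log-group:*:*",
                }
            ],
        }
    )


def create_task(request: dict, region: str, account_id: str) -> str:
    """Attach the log resource policy, create the task, and return its ARN."""
    import boto3  # imported lazily so the builders above have no dependencies

    boto3.client("logs", region_name=region).put_resource_policy(
        policyName="datasync-logging",  # hypothetical policy name
        policyDocument=build_log_resource_policy(region, account_id),
    )
    client = boto3.client("datasync", region_name=region)
    return client.create_task(**request)["TaskArn"]
```

Omit the schedule expression for a one-time migration and start the task manually instead.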

Step 1.5: Start the transfer

Start the task to initiate the data transfer. You can monitor the task execution in the DataSync console, which displays CloudWatch graphs that track discovered and transferred files. Logs about the task execution are emitted to CloudWatch Logs. In addition, you can use CloudWatch Events to get notified upon task completion.
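If you want to start and watch the transfer from a script rather than the console, a minimal polling sketch follows. The function names, polling interval, and polling budget are assumptions chosen for illustration; the sketch treats `SUCCESS` and `ERROR` as the statuses that end an execution.

```python
import time
from typing import Callable

# Statuses after which a DataSync task execution no longer changes.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def wait_for_execution(
    describe: Callable[[], dict],
    poll_seconds: float = 30.0,
    max_polls: int = 2880,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll describe() until the execution reaches a terminal status."""
    for _ in range(max_polls):
        status = describe()["Status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("task execution did not finish within the polling budget")


def run_task(task_arn: str, region: str) -> str:
    """Start a task execution and block until it succeeds or fails."""
    import boto3  # imported lazily so the poller above has no dependencies

    client = boto3.client("datasync", region_name=region)
    execution_arn = client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
    return wait_for_execution(
        lambda: client.describe_task_execution(TaskExecutionArn=execution_arn)
    )
```

For long transfers, an event-based notification is usually a better fit than a long-running poller.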

Step 2: Transferring data from Amazon S3 to Amazon WorkDocs

With data in Amazon S3, WorkDocs administrators can use the WorkDocs Migration Service to perform a large-scale migration of multiple files and folders to their Amazon WorkDocs site. The WorkDocs Migration Service works with Amazon S3.

During the migration process, WorkDocs provides an AWS Identity and Access Management (IAM) policy for you. Use this policy to create a new IAM role that grants access to the WorkDocs Migration Service to do the following:

  • Read and list the Amazon S3 bucket that you designate
  • Read and write to the Amazon WorkDocs site that you designate

Before you begin, confirm that you have the following permissions:

  • Administrator permissions for your WorkDocs site
  • Permissions to create an IAM role

Complete the following tasks to migrate your files and folders to WorkDocs. Note that the directory structure, file names, and file content are preserved when migrating to Amazon WorkDocs, whereas file ownership and permissions are not preserved.

Step 2.1: Preparing for migration

On your Amazon WorkDocs site, under My Documents, create a folder that you want to migrate your files and folders to, as shown in the following two screenshots. Confirm that the files to be migrated are less than 5 TB each, as that is the size restriction on files to be migrated.

Creating a folder on WorkDocs
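One way to check the file size restriction before scheduling a migration is to scan the object sizes in the source bucket. The sketch below is illustrative: the helper names are hypothetical, and it interprets 5 TB as 5 × 1024^4 bytes, which is an assumption about how the limit is measured.

```python
from typing import Iterable, Iterator, List

# Assumption: the 5 TB limit is interpreted as 5 * 1024**4 bytes.
MAX_FILE_BYTES = 5 * 1024**4


def oversized_keys(objects: Iterable[dict], limit: int = MAX_FILE_BYTES) -> List[str]:
    """Return the keys of objects that are not smaller than the limit."""
    return [obj["Key"] for obj in objects if obj["Size"] >= limit]


def iter_bucket_objects(bucket: str, prefix: str = "") -> Iterator[dict]:
    """Yield object summaries (with Key and Size) from an Amazon S3 bucket."""
    import boto3  # imported lazily so the check above has no dependencies

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        yield from page.get("Contents", [])
```

Calling `oversized_keys(iter_bucket_objects("your-bucket"))` returns an empty list when every file is within the limit.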

Before proceeding with migration scheduling, ensure that Step 1 – Transferring data from on-premises to Amazon S3 – is completed successfully.

Step 2.2: Scheduling a migration

After you complete step 1, use the WorkDocs Migration Service to schedule the migration. When you schedule the migration, your WorkDocs user account Storage setting is automatically changed to Unlimited.

To schedule a migration, open the WorkDocs console, select Apps, and then select Migrations, as shown in the following screenshot:

Next, after you launch the Migrations app, subscribe to Amazon Simple Notification Service (Amazon SNS) email notifications.

In your email notification, select Confirm subscription. The following screenshot shows the subscription confirmation:

Screenshot of subscription confirmation of email notifications

From the WorkDocs Migration Service console, choose Create Migration:

For Source Type, select Amazon S3:

For Data Source & Validation, create the IAM role, provide its Role ARN, and select your Amazon S3 bucket:

For Destination WorkDocs Folder, select the destination folder in Amazon WorkDocs to migrate the files to:

In the Review tab, enter a Title for the migration:

Step 2.3: Tracking a migration

You can track your migration from the WorkDocs Migration Service landing page. To access the landing page from the WorkDocs site, choose Apps, and then Migrations. Choose your migration to view its details and track its progress. You can also choose Cancel Migration if you must cancel it, or choose Update to update the timeline for the migration. After a migration is complete, you can choose Download report to download a log of the successfully migrated files and to see any failures or errors.

The WorkDocs Migration Service reports six migration states, two of which are depicted in the following screenshot:

  • Scheduled – the migration is scheduled but has not started
  • Migrating – the migration is in progress
  • Success – the migration completed successfully
  • Partial Success – the migration partially completed; view the migration summary for more details
  • Failed – the migration failed; view the migration summary for more details
  • Canceled – the migration was canceled

Two different states of migration

Cleaning up

If you are not planning any additional migration activities, then as a best practice, we recommend deleting the migration policy and role that you created from the IAM console. When a scheduled migration starts, your WorkDocs user account Storage setting is automatically changed to Unlimited. After the migration, you can change your Storage settings by editing your user account from the admin control panel.
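The IAM cleanup can also be scripted. The following sketch assumes the migration role has a single customer managed policy attached, with no other versions or attachments; the function name is hypothetical, and you supply the role name and policy ARN that you created during scheduling.

```python
def delete_migration_role(iam, role_name: str, policy_arn: str) -> None:
    """Detach the migration policy, then delete the role and the policy.

    `iam` is an IAM client, for example boto3.client("iam").
    """
    # A role cannot be deleted while managed policies are still attached.
    iam.detach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    iam.delete_role(RoleName=role_name)
    # Assumes the policy is not attached to any other user, group, or role.
    iam.delete_policy(PolicyArn=policy_arn)
```

If the role has inline policies or additional managed policies, remove those first, because IAM rejects the role deletion otherwise.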

Summary

In this blog post, we explained how you can migrate your on-premises NFS or SMB file shares to Amazon WorkDocs using Amazon WorkDocs Migration Service and AWS DataSync. Amazon WorkDocs provides you a fully managed, secure content creation, storage, and collaboration service. With Amazon WorkDocs, you can easily create, edit, and share content, and because it’s stored centrally on AWS, you can access it from anywhere on any device. AWS DataSync offers you an online data transfer service designed to simplify, automate, and accelerate copying large amounts of data to and from AWS storage services. Look at the features pages for Amazon WorkDocs and AWS DataSync to learn more.

Legacy network file shares and on-premises enterprise content management (ECM) solutions are expensive, complex, and monolithic. As you look for ways to adopt modern and efficient cloud alternatives, consider Amazon WorkDocs and AWS DataSync to migrate existing content from legacy network file shares. Your users can continue to access all their individual and team’s shared content through native desktop file systems or the web and mobile interfaces.

AWS is working on many new features, and as always, we’d love to hear your feedback and ideas. Contact us through the Amazon WorkDocs forum, or leave your feedback and questions in this blog post’s comments section!

Narendra Dixit

Narendra Dixit is a Senior Product Manager at AWS. Previously, Narendra led a central product management team at Oracle, driving joint product strategy for SaaS and PaaS products. Earlier, Narendra spent years in enterprise applications consulting, solving customer problems across various industry verticals.

Harshit Shah

Harshit is a Senior Manager for AWS based in Seattle, Washington. In his current role, he leads engineering for the Amazon WorkDocs service team as they build customer-centric solutions focusing on business productivity and end-user computing. Harshit has over 20 years of experience building enterprise products. Outside of work, Harshit enjoys the outdoor activities that the Seattle area has to offer, including mountaineering and cycling.