AWS DataSync Discovery helps you simplify migration planning and accelerate data migration to AWS by giving you visibility into your on-premises storage performance and utilization, and by providing recommendations for migrating your data to AWS Storage services, such as Amazon FSx for NetApp ONTAP, Amazon FSx for Windows File Server, and Amazon Elastic File System (Amazon EFS). DataSync Discovery collects and analyzes data about your on-premises storage performance and capacity usage automatically, so you can quickly identify the data to migrate and use the generated recommendations to select AWS Storage services that align with your performance and capacity needs.

Data Movement

For online data transfers, AWS DataSync simplifies, automates, and accelerates copying large amounts of data between on-premises storage, edge locations, or other cloud providers, and AWS Storage services. DataSync can copy data to and from Network File System (NFS) shares, Server Message Block (SMB) shares, Hadoop Distributed File Systems (HDFS), self-managed object storage, Google Cloud Storage, Azure Files, Azure Blob Storage including Azure Data Lake Storage Gen2, Wasabi Cloud Storage, Oracle Cloud Storage, Cloudflare R2 Storage, DigitalOcean Spaces, Backblaze B2 Cloud Storage, AWS Snowcone, Amazon S3 compatible storage on Snow, Amazon Simple Storage Service (Amazon S3), Amazon Elastic File System (Amazon EFS) file systems, Amazon FSx for Windows File Server file systems, Amazon FSx for Lustre file systems, Amazon FSx for OpenZFS file systems, and Amazon FSx for NetApp ONTAP file systems.

AWS DataSync provides the following features for data movement.

Multicloud Data Movement

AWS DataSync helps you move data between AWS, on-premises file systems, and other cloud storage services. AWS has continued to extend its cloud services to help customers streamline, manage, and govern their hybrid and multicloud infrastructure and applications. For customers who operate in multicloud environments, AWS DataSync can move data to and from storage on various clouds. In addition to supporting Google Cloud Storage, Azure Files, and Azure Blob Storage, DataSync lets you move object data at scale between S3-compatible storage on other clouds and AWS Storage services such as Amazon S3. This includes support for object storage on Wasabi Cloud, Oracle Cloud, Cloudflare, DigitalOcean Spaces, and Backblaze.

Purpose-Built Network Protocol

AWS DataSync employs an AWS-designed transfer protocol—decoupled from the storage protocol—to accelerate data movement. The protocol performs optimizations on how, when, and what data is sent over the network. Network optimizations performed by DataSync include incremental transfers, in-line compression, and sparse file detection, as well as in-line data validation and encryption.
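The idea behind incremental transfers can be illustrated with a minimal sketch. It assumes a simple rule (a file is sent only if it is missing at the destination or its size or modification time differs) and is not a description of the DataSync protocol itself. `FileInfo` and `files_to_transfer` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical per-file record used only for this illustration."""
    size: int      # size in bytes
    mtime: float   # modification time, seconds since the epoch


def files_to_transfer(source: dict[str, FileInfo],
                      destination: dict[str, FileInfo]) -> list[str]:
    """Return the sorted paths that are new or changed at the source."""
    changed = []
    for path, info in source.items():
        # Missing at the destination, or size/mtime differ: needs copying.
        if destination.get(path) != info:
            changed.append(path)
    return sorted(changed)
```

With this rule, a second run over an unchanged source returns an empty list, so no file data has to cross the network.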

Connections between the local DataSync agent and the in-cloud service components are multi-threaded, maximizing performance over your Wide Area Network (WAN). A single DataSync task is capable of fully utilizing a 10 Gbps network link between your on-premises environment and AWS.
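For planning purposes, the 10 Gbps figure can be turned into a rough lower bound on transfer time. The sketch below is plain arithmetic under stated assumptions: decimal terabytes, a fully utilized link by default, and no allowance for compression, protocol overhead, or storage throughput. `transfer_time_hours` is a hypothetical helper.

```python
def transfer_time_hours(data_tb: float,
                        link_gbps: float = 10.0,
                        utilization: float = 1.0) -> float:
    """Estimate hours needed to move data_tb terabytes over a network link.

    Assumes 1 TB = 10**12 bytes and 1 Gbps = 10**9 bits per second.
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < utilization <= 1:
        raise ValueError("invalid input")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * utilization)
    return seconds / 3600
```

For example, `transfer_time_hours(100)` gives roughly 22.2 hours for 100 TB on a saturated 10 Gbps link; real transfers will usually take longer.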

Bandwidth Optimization and Control

Transferring hot or cold data should not impede your business. DataSync is equipped with granular controls to optimize bandwidth consumption. Run transfers at speeds of up to 10 Gbps during off hours, and set limits when network capacity is needed elsewhere.
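As a sketch of how a limit might be expressed programmatically: the DataSync API accepts a `BytesPerSecond` task option, where `-1` means no limit. The helper below, `bandwidth_options`, is a hypothetical convenience function that converts a limit in megabits per second into that option. It only builds the dictionary and does not call AWS.

```python
from typing import Optional


def bandwidth_options(limit_mbps: Optional[float] = None) -> dict:
    """Build a DataSync task Options fragment for a bandwidth limit.

    limit_mbps=None means "use all available bandwidth" (BytesPerSecond=-1).
    """
    if limit_mbps is None:
        return {"BytesPerSecond": -1}
    if limit_mbps <= 0:
        raise ValueError("limit_mbps must be positive")
    # Megabits per second -> bytes per second (1 Mbps = 10**6 bits/s).
    return {"BytesPerSecond": int(limit_mbps * 1_000_000 / 8)}
```

Assuming a boto3 DataSync client, the result could be passed as `Options` to `update_task`, or as `OverrideOptions` to `start_task_execution`, for example a 100 Mbps cap during business hours and no cap overnight.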

Data Transfer Scheduling

DataSync comes with a built-in scheduling mechanism, allowing you to periodically run data transfer tasks to detect and copy changes from your source storage system to the destination. You can schedule your tasks using the AWS DataSync Console or AWS Command Line Interface (CLI) without writing scripts to manage repeated transfers. Task scheduling automatically runs tasks on your configured schedule with hourly, daily, or weekly options provided directly in the AWS Console.
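Outside the console, a schedule is expressed as a rate or cron expression in the task's `Schedule` setting. The following sketch maps the hourly, daily, and weekly options to such expressions. `schedule_expression` is a hypothetical helper, and the specific times chosen are assumptions for illustration.

```python
def schedule_expression(frequency: str, hour_utc: int = 0,
                        weekday: str = "SUN") -> str:
    """Return an AWS-style schedule expression for a recurring task.

    Cron fields: minutes hours day-of-month month day-of-week year.
    """
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    if frequency == "hourly":
        return "rate(1 hour)"
    if frequency == "daily":
        return f"cron(0 {hour_utc} * * ? *)"
    if frequency == "weekly":
        return f"cron(0 {hour_utc} ? * {weekday} *)"
    raise ValueError(f"unsupported frequency: {frequency}")
```

Assuming a boto3 DataSync client, the string would be supplied as `Schedule={"ScheduleExpression": ...}` when creating or updating a task.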

Data Encryption and Validation

All your data is encrypted in transit between the DataSync agent and the DataSync service using Transport Layer Security (TLS). DataSync supports using default at-rest encryption for Amazon S3 buckets. DataSync also supports encryption of data at rest and in transit for Amazon EFS and Amazon FSx.

DataSync ensures that your data arrives intact. For each transfer, the service performs integrity checks both in transit and at rest. These checks ensure that the data written to your destination matches the data read from your source, validating consistency.
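The principle behind such an integrity check can be sketched as a checksum comparison. This is an illustration only: it uses SHA-256 from the Python standard library as a stand-in, since the checksum algorithm DataSync uses is not described here, and `checksum` and `matches` are hypothetical names.

```python
import hashlib
from typing import Iterable


def checksum(chunks: Iterable[bytes]) -> str:
    """Compute a SHA-256 hex digest over a stream of byte chunks."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def matches(source_chunks: Iterable[bytes],
            destination_chunks: Iterable[bytes]) -> bool:
    """Return True if both streams contain exactly the same bytes."""
    return checksum(source_chunks) == checksum(destination_chunks)
```

Because the digest depends only on the byte sequence, the comparison holds regardless of how the data was split into chunks on either side.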

File System Integration and Metadata Preservation

The DataSync agent connects to your existing storage systems using the industry-standard NFS and SMB protocols, to your Hadoop cluster as an HDFS client, to your self-managed object storage or Google Cloud Storage using the Amazon S3 application programming interface (API), or to Azure Blob Storage using the Blob API. The agent transfers data rapidly and writes it into your designated Amazon S3 bucket, Amazon EFS file system, Amazon FSx for Windows File Server file system, or other Amazon FSx file system.

File permissions and metadata are preserved when copying objects or data between Amazon S3, Amazon EFS, Amazon FSx for Windows File Server, Amazon FSx for Lustre, Amazon FSx for OpenZFS, or Amazon FSx for NetApp ONTAP.

When copying data to Amazon S3, DataSync automatically converts each file to a single S3 object in a 1:1 relationship, and preserves POSIX metadata from NFS shares or HDFS as Amazon S3 object metadata. When you copy objects containing file system metadata back to file formats, the original file metadata (that DataSync copied to S3) is restored.
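The 1:1 mapping can be pictured as one object key per file plus a small dictionary of user metadata carrying the POSIX attributes. The sketch below is illustrative only: the metadata key names and value formats are assumptions chosen for this example, and `object_key` and `posix_metadata` are hypothetical helpers rather than part of any AWS SDK.

```python
import stat


def object_key(prefix: str, relative_path: str) -> str:
    """Map one file path to exactly one S3 object key under a prefix."""
    parts = [p for p in (prefix.strip("/"), relative_path.lstrip("/")) if p]
    return "/".join(parts)


def posix_metadata(uid: int, gid: int, mode: int, mtime_ns: int) -> dict:
    """Encode POSIX attributes as string key/value pairs (assumed names)."""
    return {
        "file-owner": str(uid),
        "file-group": str(gid),
        # Keep only the permission bits, rendered as octal, e.g. "0644".
        "file-permissions": f"{stat.S_IMODE(mode):04o}",
        "file-mtime": f"{mtime_ns}ns",
    }
```

Restoring the file later reverses the process: the values are read back from the object metadata and applied to the recreated file.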

Integration with AWS Infrastructure and Management Services

DataSync works natively with AWS security, monitoring, and audit services to simplify data movement and to provide a consistent management experience for your IT, storage, and DevOps teams. In addition to integrations with Amazon S3, Amazon EFS, and Amazon FSx, DataSync supports Amazon Virtual Private Cloud (Amazon VPC) endpoints (powered by AWS PrivateLink) to move files directly into your Amazon VPC. As with other AWS services, you can use AWS Identity and Access Management (IAM) to securely manage DataSync access. Similarly, you can configure an IAM role to control the services accessing your Amazon S3 bucket.

Monitoring and Auditing

DataSync task reports provide JSON-formatted output files that include a summary and detailed reports for all files transferred, skipped, verified, and deleted, enabling you to easily verify and audit the data transfer operations for each task execution. Task reports are generated after the completion of your transfer tasks and they are stored in your Amazon S3 bucket. This allows you to easily use AWS services such as AWS Glue, Amazon Athena, and Amazon QuickSight to automatically catalog, analyze, and visualize task report output to check the progress of your data transfers across all task executions. Task reports simplify tracking and auditing, enabling you to easily understand common task execution trends or failure patterns, and gain critical insights into your data transfer processes.

With Amazon CloudWatch, you can monitor the status of any DataSync transfers currently in progress and check previous data transfer history. With CloudWatch Metrics, you can see the number of files and amount of data copied. Consult CloudWatch Logs for information about individual files transferred at a given time, as well as the results of DataSync integrity verification. This simplifies monitoring, reporting, and troubleshooting, enabling you to provide timely updates to stakeholders. In addition, CloudWatch Events are triggered as your transfer tasks complete, enabling automation of dependent workflows. For audit purposes, you can consult AWS CloudTrail, which logs all actions performed by DataSync.
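As a sketch of how the amount of data copied might be queried, the helper below builds the arguments for a CloudWatch `GetMetricStatistics` request against the `AWS/DataSync` namespace. `bytes_transferred_query` is a hypothetical function, and the task ID used with it is a placeholder. The function only assembles the request and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def bytes_transferred_query(task_id: str, end: datetime,
                            hours: int = 24) -> dict:
    """Build request arguments for hourly BytesTransferred sums for a task."""
    return {
        "Namespace": "AWS/DataSync",
        "MetricName": "BytesTransferred",
        "Dimensions": [{"Name": "TaskId", "Value": task_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,           # one data point per hour
        "Statistics": ["Sum"],
    }
```

Assuming boto3 is installed and credentials are configured, the result could be passed as `boto3.client("cloudwatch").get_metric_statistics(**query)`.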

Pay-As-You-Go Pricing

With AWS DataSync, you pay only for data copied by the service at a flat, per-gigabyte rate. No software licenses, contracts, maintenance fees, development cycles, or hardware are required. This provides a lower total cost of ownership (TCO) compared to manually building, operating, and optimizing your own high-performance scripted transfers, as well as lower total cost than buying and running commercial transfer tools.

Using AWS DataSync Discovery, you can run discovery jobs for up to 31 days and receive recommendations free of charge. DataSync Discovery keeps collected data and associated recommendations for 60 days following job completion.

Learn more about DataSync pricing

DataSync pricing is simple and based on how much data you transfer.
