AWS DataSync is a data transfer service that makes it easy for you to automate moving data between on-premises storage and Amazon S3 or Amazon Elastic File System (Amazon EFS). DataSync automatically handles many of the tasks related to data transfers that can slow down migrations or burden your IT operations, including running your own instances, handling encryption, managing scripts, network optimization, and data integrity validation. You can use DataSync to transfer data at speeds up to 10 times faster than open source tools. DataSync uses an on-premises software agent to connect to your existing storage or file systems using the Network File System (NFS) and Server Message Block (SMB) protocols, so you don’t have to write scripts or modify your applications to work with AWS APIs. You can use DataSync to copy data over AWS Direct Connect or internet links to AWS. The service enables one-time data migrations, recurring data processing workflows, and automated replication for data protection and recovery. Getting started with DataSync is easy: Deploy the DataSync agent on-premises, connect it to a file system or storage array, select Amazon EFS or Amazon S3 as your AWS storage, and start moving data. You pay only for the data you copy.
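The getting-started steps above can be sketched with the AWS SDK for Python. This is a minimal sketch under stated assumptions, not an official recipe: it assumes an agent is already deployed and activated, that `client` is an object created with `boto3.client("datasync")`, and that every ARN, hostname, path, and the task name are placeholders you supply yourself.

```python
def start_nfs_to_s3_transfer(client, agent_arn, nfs_host, nfs_path,
                             bucket_arn, bucket_role_arn,
                             task_name="example-task"):
    """Register an NFS source and an S3 destination, then start one transfer.

    `client` is assumed to be boto3.client("datasync"). All ARNs, the
    hostname, the path, and the task name are caller-supplied placeholders.
    """
    # Source: the on-premises NFS export, reached through the deployed agent.
    source = client.create_location_nfs(
        ServerHostname=nfs_host,
        Subdirectory=nfs_path,
        OnPremConfig={"AgentArns": [agent_arn]},
    )
    # Destination: an S3 bucket, accessed through an IAM role DataSync assumes.
    destination = client.create_location_s3(
        S3BucketArn=bucket_arn,
        S3Config={"BucketAccessRoleArn": bucket_role_arn},
    )
    # A task pairs one source location with one destination location.
    task = client.create_task(
        SourceLocationArn=source["LocationArn"],
        DestinationLocationArn=destination["LocationArn"],
        Name=task_name,
    )
    # Each run of a task is a separate task execution.
    execution = client.start_task_execution(TaskArn=task["TaskArn"])
    return execution["TaskExecutionArn"]
```

Passing the client in as a parameter keeps the function free of credentials and region settings, and makes it easy to exercise with a stub before pointing it at a real account.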
Simplify and automate transfers
AWS DataSync makes it easy for you to move data over the network between on-premises storage and AWS. DataSync automates both the management of data transfer processes and the infrastructure required for high-performance, secure data transfer. The service also includes automatic encryption and data integrity validation. All of this minimizes the in-house development and management otherwise needed for fast, reliable, and secure transfers.
Move data 10x faster
Transfer data rapidly over the network into AWS, up to 10 times faster than is common with open-source tooling. DataSync uses a purpose-built network protocol and a parallel, multi-threaded architecture to accelerate your transfers. This speeds up migrations, recurring data processing workflows for analytics and machine learning, and data protection processes.
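To illustrate why a parallel design helps, here is a generic sketch of copying many files with a pool of worker threads. It is not DataSync's implementation, and the purpose-built network protocol is not modeled at all. The function name `parallel_copy` and the default of eight workers are assumptions chosen for the example.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parallel_copy(source_dir, dest_dir, workers=8):
    """Copy every file under source_dir into dest_dir using a thread pool.

    Illustrative only: a single-threaded copy waits on each file in turn,
    while a pool keeps several copies in flight at once.
    """
    source_root = Path(source_dir)
    dest_root = Path(dest_dir)
    files = [p for p in source_root.rglob("*") if p.is_file()]

    def copy_one(path):
        # Recreate the file's relative path under the destination root.
        target = dest_root / path.relative_to(source_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copies contents and basic metadata
        return target

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sorted(pool.map(copy_one, files))
```

Threads suit this kind of work because each copy spends most of its time waiting on storage or network I/O rather than on the CPU.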
Reduce operational costs
You can move data cost-effectively with DataSync’s flat, per-gigabyte pricing. You’ll also save on script development and management costs, and avoid the need for costly commercial transfer tools.
How it works
Data migration
If you are closing data centers or retiring storage arrays, you can use DataSync to move active data sets rapidly over the network into Amazon S3 or Amazon EFS. DataSync performs both full initial copies and incremental transfers of changing data. It also includes encryption and integrity checking to help make sure your data arrives securely, intact, and ready to use. You can also pair the two services: use Snowball Edge to migrate static data to Amazon S3, and use DataSync to copy data that is still active and changing.
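The idea of an initial full copy followed by incremental transfers, with integrity checking, can be sketched in a few lines. This is an illustration of the general technique, not a description of how DataSync compares files internally. The use of SHA-256 and the manifest layout (relative path mapped to digest) are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): checksum(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def plan_incremental(source_manifest, dest_manifest):
    """Return the relative paths that are new or changed at the source."""
    return sorted(
        path
        for path, digest in source_manifest.items()
        if dest_manifest.get(path) != digest
    )
```

With an empty destination manifest the plan lists every file, which is the full initial copy. On later runs only new or modified files appear, and comparing manifests after a transfer doubles as an integrity check.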
Data processing for hybrid workloads
If you have on-premises systems generating or using data that needs to move into or out of AWS for processing, you can use DataSync to accelerate and schedule the transfers. It can help speed up critical hybrid cloud workflows in industries that need to move active files into AWS quickly, including video production in media and entertainment, seismic research in oil and gas, machine learning in life science, and big data analytics in finance.
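A recurring transfer of this kind might be set up as follows. This is a hedged sketch: it assumes `client` is an object created with `boto3.client("datasync")`, that the two location ARNs already exist, and that your DataSync version supports task schedules. The task name and the `rate(12 hours)` expression are placeholder values.

```python
def create_scheduled_task(client, source_location_arn, destination_location_arn,
                          schedule_expression="rate(12 hours)",
                          name="scheduled-example"):
    """Create a DataSync task that runs on a recurring schedule.

    `client` is assumed to be boto3.client("datasync"). The ARNs, task
    name, and schedule expression are placeholders for illustration.
    """
    response = client.create_task(
        SourceLocationArn=source_location_arn,
        DestinationLocationArn=destination_location_arn,
        Name=name,
        # Schedule expressions use the cron(...) or rate(...) syntax.
        Schedule={"ScheduleExpression": schedule_expression},
    )
    return response["TaskArn"]
```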
Data protection
If you have large Network Attached Storage (NAS) systems, you likely have a lot of files to protect, either with replication or backup to a second hardware stack. With DataSync, you can replicate files into Amazon S3 for online copies that you can archive to Amazon Glacier with an S3 Lifecycle policy. Or, you can send the data to Amazon EFS for a standby file system.
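The archive step could look something like the sketch below, which builds an S3 Lifecycle configuration that moves replicated objects to the Glacier storage class. The prefix, the 30-day delay, and the rule ID are assumptions picked for illustration, and `s3_client` is assumed to be an object created with `boto3.client("s3")`.

```python
def glacier_lifecycle(prefix, days_before_archive=30,
                      rule_id="archive-replicated-files"):
    """Build a lifecycle configuration that archives objects under a prefix.

    The prefix, delay, and rule ID are illustrative placeholders.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move objects to the Glacier storage class after the delay.
                "Transitions": [
                    {"Days": days_before_archive, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


def apply_lifecycle(s3_client, bucket, lifecycle):
    """Attach the configuration to a bucket (s3_client: boto3.client("s3"))."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=lifecycle,
    )
```

Note that applying a lifecycle configuration replaces any configuration already on the bucket, so existing rules need to be included in the new one.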
“At Celgene, our research teams are focused intently on the discovery and development of treatments for cancer and other severe conditions. AWS is an integral part of our innovation process, and for our IT teams that means using as many AWS services as we can, to eliminate the operational and cost burdens of running infrastructure and tooling that distract us from supporting drug discovery. Our labs generate petabytes of data – irreplaceable intellectual property – and we use AWS DataSync to get the data into Amazon S3 and Amazon EFS easily, quickly and cost-effectively. Without the data in AWS, there’s no way we could innovate as fast. AWS DataSync works with my existing storage systems, and efficiently uses as much bandwidth as we can give it to get our data safely into AWS.”
Lance Smith, Director of Research Computing - Celgene
Migrating hundreds of TB of data to Amazon S3 with AWS DataSync
by Satish Kumar & Sona Rajamani | 06 JUNE 2019
Excluding and including specific data in transfer tasks using AWS DataSync filters
by Olga Kogan | 22 MAY 2019