AWS Storage Blog

Simplify multicloud data movement wherever data is stored with AWS DataSync

At AWS, we believe that customers get the best experience, performance, and cost when they choose to run their IT operations in the cloud. However, for a variety of reasons, some customers end up running their IT operations in a multicloud environment. For example, a customer might have acquired a company that was already running on a different cloud provider, or might need to use specific AWS data processing services that require the data to reside in AWS. These customers can face additional complexity when operating their applications and cloud infrastructure. They may need to use solutions from multiple providers to provision, manage, and govern IT resources, to monitor the health of their applications, and to collect and analyze data stored in multiple locations. To address these challenges, AWS has been extending its cloud services to help customers streamline, manage, and govern their hybrid and multicloud infrastructure and applications. One such service is AWS DataSync.

AWS DataSync can now move data to and from an expanded set of cloud providers. With DataSync, you can move your object data at scale between S3-compatible storage on other clouds and AWS Storage services such as Amazon S3. In addition to support for Google Cloud Storage, Azure Files, and Azure Blob Storage, DataSync now supports copying data to and from DigitalOcean Spaces, Wasabi Cloud Storage, Backblaze B2 Cloud Storage, Cloudflare R2 Storage, and Oracle Cloud Storage.

In this post, I provide a general overview of configuring AWS DataSync to begin your multicloud data transfers. I will discuss how AWS DataSync connects to different clouds through specific endpoints and provide an overview of differences for each supported cloud. DataSync makes it fast and simple to migrate your data from other clouds to AWS, archive your data in AWS, or move data to and from other clouds as part of your business workflows. With data in AWS, you can leverage AWS’s unmatched experience, maturity, reliability, security, and performance, which you can depend upon for your most important applications.

How it works

AWS DataSync supports moving data between other public clouds and AWS Storage services.

DataSync transfers data to and from other clouds using a DataSync agent. The DataSync agent can be deployed as an Amazon EC2 instance to connect to Google Cloud Storage and Azure Blob Storage, as well as DigitalOcean Spaces, Wasabi Cloud Storage, Backblaze B2 Cloud Storage, Cloudflare R2 Storage, and Oracle Cloud Storage. Alternatively, deploying the agent in Google Cloud or Azure allows data to be compressed before it crosses the network, which can reduce egress costs.

Getting started

Start by deploying and activating your DataSync agent. The activation process associates the agent with your AWS account and region. For EC2-based agents, we recommend activating the agent using a private VPC endpoint.
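As a rough illustration of the activation step, the sketch below uses the AWS SDK for Python (boto3) to register an agent through a VPC endpoint. The activation key, agent name, and every ID or ARN passed in are placeholders that do not come from this post, and the helper names `build_agent_request` and `activate_agent` are hypothetical.

```python
def build_agent_request(activation_key, agent_name, vpc_endpoint_id=None,
                        subnet_arn=None, security_group_arn=None):
    """Build keyword arguments for the DataSync CreateAgent API.

    All values are supplied by the caller; nothing here is taken from the post.
    """
    params = {"ActivationKey": activation_key, "AgentName": agent_name}
    private = (vpc_endpoint_id, subnet_arn, security_group_arn)
    if any(private):
        # A VPC endpoint activation needs the endpoint, a subnet, and a security group.
        if not all(private):
            raise ValueError("VPC endpoint, subnet ARN, and security group ARN go together")
        params["VpcEndpointId"] = vpc_endpoint_id
        params["SubnetArns"] = [subnet_arn]
        params["SecurityGroupArns"] = [security_group_arn]
    return params


def activate_agent(region, **agent_kwargs):
    """Call CreateAgent and return the new agent's ARN (requires AWS credentials)."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    response = client.create_agent(**build_agent_request(**agent_kwargs))
    return response["AgentArn"]
```

Keeping the request builder separate from the API call makes it easy to review the parameters before anything is sent to AWS.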

Once the DataSync agent is deployed and activated, create a DataSync location for the other cloud’s storage type. When transferring from S3-compatible object storage in another cloud to AWS, you create a DataSync object storage location to use as the source for a DataSync task. Supported clouds provide a public endpoint, specific to a region or account, along with credential keys that enable DataSync to access the specified object storage. The following table lists the supported clouds’ S3-compatible object storage, along with their endpoints and the read permissions that DataSync uses to transfer data from the specified cloud to AWS. Review your cloud provider’s documentation for specific access permission details.

Table 1: Cloud endpoints and required read permissions

Using Wasabi Cloud Storage as an example location, you would provide the regional server endpoint for the Wasabi bucket and enter your access key credentials. Once the location is created, you can use your location to transfer data to AWS as part of a DataSync task.
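The same location can also be created programmatically. The following is a minimal sketch using boto3, assuming an S3-compatible endpoint reachable over HTTPS on port 443. The hostname, bucket name, keys, and agent ARN are placeholders supplied by the caller, and the helper names are hypothetical; check your provider's documentation for the actual regional endpoint.

```python
def build_object_storage_location(server_hostname, bucket_name, access_key,
                                  secret_key, agent_arn, subdirectory="/"):
    """Build keyword arguments for the DataSync CreateLocationObjectStorage API."""
    return {
        "ServerHostname": server_hostname,  # the provider's regional S3-compatible endpoint
        "ServerProtocol": "HTTPS",          # assumption: the endpoint is served over TLS
        "ServerPort": 443,
        "BucketName": bucket_name,
        "Subdirectory": subdirectory,
        "AccessKey": access_key,
        "SecretKey": secret_key,
        "AgentArns": [agent_arn],           # the agent activated earlier
    }


def create_object_storage_location(region, **location_kwargs):
    """Create the location and return its ARN (requires AWS credentials)."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    request = build_object_storage_location(**location_kwargs)
    return client.create_location_object_storage(**request)["LocationArn"]
```

The returned location ARN is what you would later pass as the source (or destination) when creating a DataSync task.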

Wasabi Object Storage DataSync location

Azure Blob Storage does not provide an S3-compatible endpoint. Instead, you configure transfers from Azure Blob Storage using the DataSync Azure Blob Storage location type.

Azure Blob Storage DataSync location
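For comparison with the object storage location, here is a hedged boto3 sketch of an Azure Blob Storage location. It assumes access is granted through a shared access signature (SAS) token and that the container URL follows the usual `https://<account>.blob.core.windows.net/<container>` form. The account, container, token, and agent ARN are placeholders, and the helper names are hypothetical.

```python
def build_azure_blob_location(storage_account, container, sas_token, agent_arn):
    """Build keyword arguments for the DataSync CreateLocationAzureBlob API."""
    return {
        "ContainerUrl": f"https://{storage_account}.blob.core.windows.net/{container}",
        "AuthenticationType": "SAS",
        # Convenience assumption: drop a leading "?" if the token was copied from a URL.
        "SasConfiguration": {"Token": sas_token.lstrip("?")},
        "AgentArns": [agent_arn],
    }


def create_azure_blob_location(region, **location_kwargs):
    """Create the location and return its ARN (requires AWS credentials)."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    request = build_azure_blob_location(**location_kwargs)
    return client.create_location_azure_blob(**request)["LocationArn"]
```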

Considerations when transferring data between clouds

While AWS DataSync simplifies your multicloud data transfers, you still need to consider the characteristics of other clouds. Azure Blob and Backblaze B2 support object tags, while the other supported cloud providers either do not support object tags or do not support querying tags through the S3 interface. DataSync provides a task-level option to copy object tags, which should be disabled when copying objects to or from clouds that do not support retrieving tags.
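The object tag setting can be sketched in boto3 as follows. The set of tag-capable providers mirrors the two named in this post, but the provider labels, the default task name, and the helper names are hypothetical, and the location ARNs are placeholders for locations you have already created.

```python
# Providers that this post says support object tags; the labels are hypothetical.
TAG_CAPABLE_PROVIDERS = {"azure-blob", "backblaze-b2"}


def build_task_request(source_location_arn, destination_location_arn,
                       other_cloud_provider, name="multicloud-transfer"):
    """Build keyword arguments for the DataSync CreateTask API."""
    copy_tags = other_cloud_provider in TAG_CAPABLE_PROVIDERS
    return {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        # "NONE" turns off tag copying for clouds that cannot return object tags.
        "Options": {"ObjectTags": "PRESERVE" if copy_tags else "NONE"},
    }


def create_and_start_task(region, **task_kwargs):
    """Create the task, start one execution, and return the execution ARN."""
    import boto3  # imported here so the request builder works without the SDK

    client = boto3.client("datasync", region_name=region)
    task_arn = client.create_task(**build_task_request(**task_kwargs))["TaskArn"]
    return client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
```

Defaulting to `"NONE"` for any provider not listed is a conservative choice: it avoids requesting tags from a cloud that cannot return them.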

You should also consider the storage class of the source objects. Other cloud providers offer a mixture of storage class options that affect egress and request charges when transferring data. DataSync issues requests to other clouds to compare and read objects, both to determine what has changed and to transfer data. Like Amazon S3, Azure Blob and Oracle Object Storage have archive storage classes, and objects in those classes must be restored before DataSync can read them.

Conclusion

In this post, I discussed scenarios where customers have to manage multicloud environments, such as a merger and acquisition where data is required to move between clouds, or customer requirements for data processing in specific cloud services. I discussed how AWS DataSync now supports transferring data between a wide range of object storage and file services provided by other clouds. I then reviewed a few considerations when planning data transfers between clouds such as obtaining the cloud provider storage endpoint, support for object tags, the importance of knowing what storage class the data is in, and request and egress charges.

Using AWS DataSync, you can simplify and automate your data movement workflows and minimize the difficulties of building a solution that can communicate with multiple cloud providers, making it easier than ever to move data to and from AWS.

Get started today with in-depth AWS documentation and blogs.

Darryl Diosomito

Darryl is a Senior Solution Architect at AWS. He is focused on helping customers migrate their data to AWS as part of their journey to the cloud. Darryl lives in the New England area and enjoys finding outdoor activities that take advantage of every season.