AWS Storage Blog

AWS re:Invent recap: Accelerate your migration to Amazon S3

Migrating data securely, efficiently, and quickly from your on-premises data centers to Amazon S3 is not without its challenges. If the dataset is relatively small, many open source or free tools can accomplish the task. But as the proliferation of data continues to challenge IT professionals in nearly every industry, moving large amounts of data to Amazon S3 has become increasingly demanding. Many open source or free solutions may not be suitable for large, complex migrations because of the burden they place on IT professionals trying to automate the migration of complicated and ever-changing data systems.

To help our customers reduce complexity and increase reliability in handling these large-scale migrations, we recommend using AWS DataSync and AWS Storage Gateway. Yesterday, I presented a re:Invent session focusing on how you can “Accelerate your migration to Amazon S3.” You can now watch that 30-minute session on demand. In this blog post, I recap my session at re:Invent 2020-2021 and provide some background on both services.

AWS DataSync and AWS Storage Gateway overview

AWS DataSync is an online data transfer service that simplifies, automates, and accelerates moving data between on-premises storage systems and AWS Storage services, as well as between AWS Storage services. You can use DataSync to migrate active datasets to AWS, archive data to free up on-premises storage capacity, and replicate data to AWS for business continuity. DataSync is also commonly used to transfer data to the cloud for timely data analysis and processing. Currently, AWS DataSync can copy data between Network File System (NFS) shares, Server Message Block (SMB) shares, self-managed object storage, AWS Snowcone, Amazon S3 buckets, Amazon Elastic File System (Amazon EFS) file systems, and Amazon FSx for Windows File Server file systems.

AWS Storage Gateway is a hybrid cloud storage service that gives on-premises applications access to virtually unlimited cloud storage using NFS, SMB, iSCSI, and iSCSI-VTL interfaces through file, tape, and volume gateways. You can use the service for backing up and archiving data to AWS, providing on-premises file shares backed by cloud storage, and providing on-premises applications low-latency access to in-cloud data.

Addressing the challenges

When we discuss the challenges our customers face, their primary need is to simplify their data migration projects while ensuring that the data is migrated quickly and securely. By using AWS DataSync, AWS Storage Gateway, or in some cases both services, our customers have been able to achieve their goals.

We see a few common use case patterns in our customers’ environments. The first use case is for customers looking to send their backups to the cloud. For this use case, AWS Storage Gateway is deployed as a Virtual Tape Library (VTL) and can often be a drop-in replacement for existing tape libraries. Tape Gateway enables you to replace physical tapes on premises with virtual tapes in AWS without changing existing backup workflows. Backups are sent automatically to Amazon S3, and virtual tapes can be archived to Amazon S3 Glacier or Amazon S3 Glacier Deep Archive.
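As a rough illustration of this first use case, the sketch below builds the parameters for the Storage Gateway CreateTapes API, which provisions virtual tapes on a Tape Gateway. The helper names, the gateway ARN, the barcode prefix, and the default tape size are assumptions made up for this example and were not part of my session. The pool ID selects whether ejected tapes are archived to Amazon S3 Glacier or Amazon S3 Glacier Deep Archive.

```python
import uuid

GIB = 1024 ** 3  # bytes in one gibibyte


def build_create_tapes_request(gateway_arn, num_tapes, tape_size_gib=100,
                               barcode_prefix="BKP", pool_id="GLACIER"):
    """Build keyword arguments for the Storage Gateway CreateTapes call.

    All default values here are illustrative assumptions.
    """
    if pool_id not in ("GLACIER", "DEEP_ARCHIVE"):
        raise ValueError("pool_id must be 'GLACIER' or 'DEEP_ARCHIVE'")
    return {
        "GatewayARN": gateway_arn,
        "TapeSizeInBytes": tape_size_gib * GIB,
        "ClientToken": str(uuid.uuid4()),  # idempotency token for the request
        "NumTapesToCreate": num_tapes,
        "TapeBarcodePrefix": barcode_prefix,
        "PoolId": pool_id,  # archive tier used when a tape is ejected
    }


def create_tapes(request):
    """Submit the request with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("storagegateway").create_tapes(**request)


# Placeholder ARN; replace with the ARN of your own Tape Gateway.
example_request = build_create_tapes_request(
    "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
    num_tapes=2,
    pool_id="DEEP_ARCHIVE",
)
```

Your backup application then sees the new tapes through the gateway's VTL interface, so the existing backup workflow itself does not need to change.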

The second use case centers on customers who need to migrate data to Amazon S3 on an ongoing basis. Our customers often need to get data into AWS to perform additional processing or analytics that takes advantage of the scaling capabilities within AWS. In some of these cases, the source data changes frequently, and capturing those changes with custom scripts proves burdensome. For this challenge, AWS DataSync provides a number of key features to ease the migration burden. By connecting directly to our customers’ unstructured file systems or even on-premises object storage, AWS DataSync offers the ability to schedule transfers, perform incremental transfers, and filter using both include and exclude filters. For data security, AWS DataSync also provides end-to-end data verification and encryption.
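To make these features a little more concrete, here is a minimal sketch of how the parameters for a DataSync CreateTask request might be assembled in Python. The helper names, location ARNs, filter patterns, and schedule are hypothetical values chosen for illustration. It assumes the source and destination locations already exist, and it shows one possible combination of settings rather than a recommended configuration.

```python
def build_datasync_task_request(source_arn, dest_arn, name,
                                include_patterns=(), exclude_patterns=(),
                                schedule_expression=None):
    """Build keyword arguments for the DataSync CreateTask call."""
    request = {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": dest_arn,
        "Name": name,
        "Options": {
            # Incremental transfer: copy only data that differs from the destination.
            "TransferMode": "CHANGED",
            # Verify the integrity of the files that were transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
        },
    }
    # DataSync filters take a single pipe-delimited pattern string.
    if include_patterns:
        request["Includes"] = [{"FilterType": "SIMPLE_PATTERN",
                                "Value": "|".join(include_patterns)}]
    if exclude_patterns:
        request["Excludes"] = [{"FilterType": "SIMPLE_PATTERN",
                                "Value": "|".join(exclude_patterns)}]
    if schedule_expression:
        request["Schedule"] = {"ScheduleExpression": schedule_expression}
    return request


def create_task(request):
    """Submit the request with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("datasync").create_task(**request)


# Placeholder ARNs and patterns; substitute your own values.
example_request = build_datasync_task_request(
    "arn:aws:datasync:us-east-1:111122223333:location/loc-SOURCEEXAMPLE",
    "arn:aws:datasync:us-east-1:111122223333:location/loc-DESTEXAMPLE",
    name="nightly-nfs-to-s3",
    include_patterns=["/projects", "/reports"],
    exclude_patterns=["*.tmp"],
    schedule_expression="cron(0 2 * * ? *)",  # assumed: every day at 02:00 UTC
)
```

With a schedule attached, each run picks up only what changed since the previous one, which is the work that custom scripts would otherwise have to track.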

A third common use case we see is when customers are looking to migrate datasets to the cloud while maintaining NFS or SMB access to their data from on-premises systems. For this use case, we take a hybrid approach where, first, we use AWS DataSync to migrate the data easily and efficiently to Amazon S3. Then we deploy AWS Storage Gateway in File Gateway mode on premises to provide access to the migrated data back to the applications or users via NFS or SMB protocols.
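The second half of this hybrid approach can be sketched in the same way. Assuming DataSync has already copied the data into a bucket, the example below builds the parameters for the Storage Gateway CreateNFSFileShare API so that a File Gateway can present that bucket over NFS. The helper names, gateway ARN, IAM role ARN, bucket name, and client network range are placeholders invented for this example.

```python
import uuid


def build_nfs_file_share_request(gateway_arn, role_arn, bucket_name, client_cidrs):
    """Build keyword arguments for the Storage Gateway CreateNFSFileShare call."""
    return {
        "ClientToken": str(uuid.uuid4()),  # idempotency token for the request
        "GatewayARN": gateway_arn,
        "Role": role_arn,  # IAM role the gateway assumes to access the bucket
        "LocationARN": f"arn:aws:s3:::{bucket_name}",
        "ClientList": list(client_cidrs),  # NFS clients allowed to mount the share
    }


def create_nfs_file_share(request):
    """Submit the request with boto3 (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3
    return boto3.client("storagegateway").create_nfs_file_share(**request)


# Placeholder values; substitute your own gateway, role, bucket, and network.
example_request = build_nfs_file_share_request(
    "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-EXAMPLE",
    "arn:aws:iam::111122223333:role/ExampleFileGatewayRole",
    "example-migrated-data-bucket",
    ["10.0.0.0/24"],
)
```

An SMB share follows the same pattern with a different API call and access settings.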

Conclusion

My re:Invent session on accelerating your migration to Amazon S3 provided some insights into how the AWS DataSync and AWS Storage Gateway services function. I also explored how you can use these services in your environment to help accelerate your migration to Amazon S3.

AWS DataSync and AWS Storage Gateway enable our customers to simplify their data migration projects, ensuring fast, efficient, and secure data transfer to and from Amazon S3. Once in Amazon S3, customers can take advantage of a more cost-effective storage solution for archiving their data, and they can process their datasets using the scalable infrastructure offered by AWS.

If you are looking to get some hands-on experience with AWS DataSync and AWS Storage Gateway, look at these links to get started:

Feel free to leave any comments or questions about my re:Invent session or this blog post in the comments section. Thank you for reading!