AWS Storage Blog
Category: AWS DataSync
Using AWS DataSync to move data from Hadoop to Amazon S3
You want to leverage cloud scalability, increase cost efficiency by paying only for utilized storage, decouple big data storage from processing, and increase capabilities for data analytics and machine learning using AWS. But how do you move your Hadoop cluster? To accelerate this transition, AWS DataSync recently launched support for moving data between Hadoop Distributed […]
Simplify data migrations using an AWS DataSync agent on Linux KVM Hypervisor
UPDATE (1/19/2023): Some readers who followed the steps in this blog post to deploy an AWS DataSync agent on the KVM platform ran into issues, either because the hypervisor host does not support virtualization or because virtualization is not enabled on the platform. Therefore, I have added the steps to verify whether the hypervisor host supports […]
How to securely share application log files with third parties
What do we do when our applications fail, and we must provide instance-level log data to external entities for troubleshooting purposes? It’s best to limit direct human interaction with our production resources, so we often see temporary access provided for a fixed period. For highly regulated industries, the approval process for production access can be […]
Considering four different replication options for data in Amazon S3
UPDATE (2/10/2022): Amazon S3 Batch Replication, which is not covered in this blog post, launched on 2/8/2022, allowing you to replicate existing S3 objects and synchronize your S3 buckets. See the S3 User Guide for additional details. UPDATE (5/1/2023): Updated the comparison table to reflect the latest capabilities of the mechanisms covered in the table. […]
Synchronizing your data to Amazon S3 using AWS DataSync
There are many factors to consider when migrating data from on premises to the cloud, including speed, efficiency, network bandwidth and cost. A common challenge many organizations face is choosing the right utility to copy large amounts of data from on premises to an Amazon S3 bucket. I often see cases in which customers start with a free […]
How to use AWS DataSync to migrate data between Amazon S3 buckets
UPDATE (6/14/2022): The “Copying objects across accounts” section has been updated to reflect the new Amazon S3 Object Ownership feature, an S3 bucket-level setting that you can use to disable access control lists (ACLs) and take ownership of every object in your bucket. You no longer need to configure your cross-account AWS DataSync task to […]
Seamlessly migrate large SQL databases using AWS Snowball and AWS DataSync
Many of our customers use native SQL Server backup and restore features to migrate on-premises SQL Server databases to AWS. When using the native SQL Server backup and restore functionality, you can simplify the database migration process by performing a full backup restore on the target SQL instance. In addition, with the help of differential […]
Event-driven data transfer to container-shared storage on AWS
Businesses using data lake solutions built on Amazon S3 often want their data science teams to have access to that same data for machine learning or analytics projects deployed on tools like RStudio Server and Shiny. To do so, they can easily deploy these tools in the cloud using Amazon ECS or Amazon EKS serverless containers with AWS Fargate, and can access […]
Optimize file storage migration to AWS using AWS DataSync and Amazon FSx
When migrating from on premises to the cloud, a wide spectrum of customers face starkly different starting points. Some customers may have one or two workloads stored on premises, while others may have several storage arrays across several data centers, and others have even more intricate or vast setups. If you are migrating to the […]
Storage options and designs for VMware Cloud on AWS
VMware Cloud on AWS is a jointly engineered solution by VMware and AWS that brings VMware’s Software-Defined Data Center (SDDC) technologies to the global AWS infrastructure. If you have workloads with varying storage requirements, it’s important to understand the storage options available and how they could work best for different scenarios. The service offers VMware […]