What is a Data Migration Framework?
Data migration is the process of moving data from one storage system or computing environment to another. Any data migration initiative aims to move data efficiently while considering factors like network resources, data security, time, and transfer methods. Cloud data migration focuses specifically on moving data to the cloud.
This process isn't merely about relocating data—it involves accurately mapping it between different storage environments. It can take several forms. For example, you may have to periodically upload data files in batches, stream data from sensors, or implement a one-time migration of an existing archive from on-prem data storage systems.
Goals
Each cloud data migration project requires a clear business case to determine the best outcomes. However, there are a few goals common to most data migrations:
- Increased efficiency, for example through higher uptime, remote-first infrastructure, or system consolidation.
- Reduced resource expenditure across hardware maintenance, server room operation, and 24/7 on-site systems administrators.
- A foundational data platform for analytics, artificial intelligence, and enterprise applications.
Other goals may include keeping systems available beyond their natural end of life, virtualizing all infrastructure, and integrating data with existing cloud systems.
Challenges
Successful cloud migration involves more than just transferring files. It requires that:
- Permissions, access controls, and other metadata remain intact.
- Users have uninterrupted access to critical data during uploads.
- Data consistency is maintained despite any network outages.
Transferring large data volumes is time-consuming and often requires significant manual intervention. Investing in specialized tools for migration may lead to sunk costs once the transition is complete.
Hence, cloud migration requires planning, scheduling, and the right tools to limit operational overheads and reduce costs. Otherwise, the data migration process could be delayed or even require restarting from scratch.
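One way to avoid restarting from scratch after a network outage is to record which items have already been transferred and skip them on the next run. The following is a minimal sketch of that idea, not the behavior of any particular AWS tool. The `send` callable, the `migrate_with_checkpoint` name, and the JSON checkpoint file are all hypothetical and chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def migrate_with_checkpoint(
    items: Iterable[str],
    send: Callable[[str], None],
    checkpoint: Path,
    max_attempts: int = 3,
) -> list[str]:
    """Send each item once, recording progress so a rerun resumes where it stopped.

    `send` is a hypothetical upload function that raises ConnectionError on a
    network failure. Returns the items that still failed after all attempts.
    """
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    failed = []
    for item in items:
        if item in done:
            continue  # already transferred in an earlier run
        for _attempt in range(max_attempts):
            try:
                send(item)
            except ConnectionError:
                continue  # transient failure: try this item again
            done.add(item)
            # Persist after every success so an interruption loses at most one item.
            checkpoint.write_text(json.dumps(sorted(done)))
            break
        else:
            failed.append(item)  # gave up for now; a later run can retry it
    return failed
```

Rerunning the function with the same checkpoint file skips everything that was already sent, so an outage costs only the items that were in flight.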
What are key data migration planning considerations?
Leadership and teams involved in data migration have to consider the following:
- Time taken to migrate data
- Any existing source and destination incompatibilities
- Security considerations during migration
- Cost of migration tools or processes
- Scheduling considerations
- Migration type—batch, streaming, all-at-once
- Impact on network resources
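Migration time and network impact can be estimated with simple arithmetic before any tooling is chosen. The sketch below assumes decimal units (1 TB = 8,000,000 megabits) and a hypothetical `utilization` parameter for the share of the link the migration is allowed to use. The 0.8 default is only an illustrative assumption.

```python
def estimate_transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Rough number of days needed to move `data_tb` terabytes over a network link.

    `link_mbps` is the nominal link speed in megabits per second and
    `utilization` is the assumed fraction of it available to the migration.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    megabits = data_tb * 8_000_000          # 1 TB = 10^12 bytes = 8 * 10^6 megabits
    seconds = megabits / (link_mbps * utilization)
    return seconds / 86_400                 # seconds per day
```

For example, 100 TB over a 1 Gbps link at 80% utilization works out to about 11.6 days, which is the kind of figure that informs the choice between network transfer and other methods.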
Steps in planning include:
Assess your data sources
Before moving data, you must assess your current data configurations. The current data, storage, and access method types guide your migration options.
For example, relational databases on an on-site MySQL server can be migrated to Amazon Relational Database Service (RDS) through a relatively straightforward process, because the source and target use the same database management system. However, on-premises legacy ERP systems might prove more difficult, especially if a digital transformation imperative involves a software change.
Identify and note down the details of all your data sources for cloud migration, like:
- Databases
- Application data
- Storage
- Data models
- Data in other cloud environments (cloud-to-cloud)
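An inventory like this is easier to plan against when it is kept in a structured form. Below is a minimal sketch, assuming a hypothetical `DataSource` record whose fields (`name`, `kind`, `size_gb`, `engine`) are illustrative choices rather than a prescribed schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str          # e.g. "orders-db"
    kind: str          # one of the categories listed above, e.g. "database"
    size_gb: float     # approximate current size
    engine: str = ""   # e.g. "MySQL"; empty when not applicable


def total_size_by_kind(sources: list[DataSource]) -> dict[str, float]:
    """Sum the approximate size of all inventoried sources per category."""
    totals: dict[str, float] = defaultdict(float)
    for source in sources:
        totals[source.kind] += source.size_gb
    return dict(totals)
```

Totals per category give a first indication of where most of the migration effort will be spent.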
Design your migration
This involves organizing and configuring migration tools that meet existing security standards. You must also determine the order of data migration operations and schedule them in advance. For example, you can choose from:
- Live replication for automatic, asynchronous object copying until data is synced between both systems.
- Snapshot migration for all-at-once delivery of a full system state, which is then updated with smaller transfers to catch up and align to the current state.
- Phased migration for the migration of smaller datasets one at a time.
Also, plan how to assess migration accuracy and quality at the end.
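One common way to assess accuracy is to compare checksums of the source and the migrated copy. The sketch below does this for two local directory trees using SHA-256. It assumes both sides are reachable as file paths, which will not hold for every migration, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_manifest(root: Path) -> dict[str, str]:
    """Map each file's path relative to `root` to its SHA-256 hex digest."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from, unexpected in, or different at the target."""
    return {
        "missing": sorted(set(source) - set(target)),
        "unexpected": sorted(set(target) - set(source)),
        "mismatched": sorted(k for k in source.keys() & target.keys() if source[k] != target[k]),
    }
```

An empty report for all three keys indicates that every source file arrived with identical content.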
Brief key stakeholders
Migration can be disruptive to business employees, clients, and partners. Ensure key stakeholders are aware of the data migration process, plans, timelines, and accessibility disruptions during the migration period. Training may also be necessary to ensure administrators know how to configure and users know how to access the data and cloud services post-migration.
Plan and schedule frequent updates throughout the migration process to keep stakeholders informed and confident.
Build and test the solution
Each data migration requires a different strategy. Some types of data migration require a fast, all-at-once transfer of a small amount of data, while others involve a vast amount that trickles in over time. How you build and test your migration will depend on the strategy and tools involved. Typically, you will keep using your old systems until you have fully tested the new systems and confirmed that the migration is complete and correct.
What are data migration strategies?
There are different strategies and methods for uploading data to the AWS cloud using AWS cloud data migration services.
Direct network connections
A direct network connection is a private cabled connection between your router and a cloud-based router. The cloud-based router is at the edge of the cloud provider’s private network, opening you up directly to their range of services.
AWS Direct Connect allows you to use an Ethernet fiber-optic cable for a Layer 3 network connection between your organization and AWS to securely move data from your networks to AWS services. AWS Direct Connect has locations worldwide, where you can set up equipment for data migration.
Steps to get started:
Step 1—Select your direct connect location
Choose an AWS Direct Connect location, determine the connections needed, and select a port size. Multiple ports can be used for increased bandwidth or redundancy.
Step 2—Choose your connection type
Decide between a dedicated or hosted connection. A dedicated connection offers exclusive access with multiple virtual interfaces, while a hosted connection shares the cross connect and provides a single virtual interface.
Step 3—Set up virtual interfaces
Configure one or more logical virtual interfaces (VIF) over your connection. Transit VIFs connect to AWS Transit Gateways, public VIFs access AWS public services via public IPs, and private VIFs connect to Amazon VPC using private IPs.
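The mapping between what you want to reach and the type of virtual interface can be captured in a small lookup. This is only an illustrative sketch of the three cases described above. The target labels (`"transit_gateway"`, `"public_service"`, `"vpc"`) and the function name are hypothetical and do not correspond to AWS API values.

```python
from enum import Enum


class VifType(Enum):
    TRANSIT = "transit"
    PUBLIC = "public"
    PRIVATE = "private"


# Hypothetical target labels mapped to the VIF type described in the text.
_TARGET_TO_VIF = {
    "transit_gateway": VifType.TRANSIT,  # AWS Transit Gateways
    "public_service": VifType.PUBLIC,    # AWS public services via public IPs
    "vpc": VifType.PRIVATE,              # Amazon VPC using private IPs
}


def vif_for_target(target: str) -> VifType:
    """Return the virtual interface type that fits the given target label."""
    try:
        return _TARGET_TO_VIF[target]
    except KeyError:
        raise ValueError(f"unknown target: {target!r}") from None
```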
Device-based data transfer
Large-scale data migrations can be more efficient when you copy the data onto a device and physically transport it to a data center. AWS Snowball is a service that provides secure, rugged devices you can use to move data to the cloud. The steps are as follows:
1. AWS ships a Snowball device to your location on request.
2. Connect the device to your network and use the AWS Snowball Client or AWS OpsHub to unlock and configure the device.
3. Copy data onto the device—built-in encryption ensures security during transfer.
4. Ship the device back to AWS using the pre-paid shipping label.
5. Upon arrival, AWS automatically transfers the data to the designated S3 bucket and securely erases the Snowball device.
6. You will receive a notification when the process is complete.
Uploading sensor data streams
Streaming data collected from IoT or industrial devices and sensor networks can be transferred in real time to the cloud instead of being captured and batch processed on-site. Amazon Data Firehose allows you to set up a stream with your data source, transform the data if necessary, and then store it in a range of destination storage services on AWS.
The steps are as follows:
Step 1—Create a Firehose stream
A Firehose stream is the core entity of Amazon Data Firehose. You can create it from the AWS console and configure it to receive data directly or from an existing Amazon Kinesis data stream.
Step 2—Send data to the Firehose Stream
Data producers send records, each up to 1,000 KB in size, to the Firehose stream. Data producers can be applications, servers, or other AWS services.
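A producer usually needs to check record sizes and group records before sending them. The sketch below enforces the 1,000 KB per-record limit mentioned above (read here as 1,000 × 1,024 bytes, which is an assumption) and groups records into batches. The per-batch caps of 500 records and 4 MB are assumptions based on the limits documented for batch writes, so verify them against current service quotas. The function itself is hypothetical and sends nothing.

```python
MAX_RECORD_BYTES = 1_000 * 1024      # "1,000 KB" per record, read here as KiB (assumption)
MAX_BATCH_RECORDS = 500              # assumed per-request record cap; check current quotas
MAX_BATCH_BYTES = 4 * 1024 * 1024    # assumed per-request size cap; check current quotas


def batch_records(records: list[bytes]) -> list[list[bytes]]:
    """Group records into batches that respect the assumed count and size caps."""
    batches: list[list[bytes]] = []
    current: list[bytes] = []
    current_bytes = 0
    for record in records:
        if len(record) > MAX_RECORD_BYTES:
            raise ValueError(f"record of {len(record)} bytes exceeds the per-record limit")
        batch_is_full = (
            len(current) >= MAX_BATCH_RECORDS
            or current_bytes + len(record) > MAX_BATCH_BYTES
        )
        if current and batch_is_full:
            batches.append(current)          # close the current batch
            current, current_bytes = [], 0
        current.append(record)
        current_bytes += len(record)
    if current:
        batches.append(current)              # final partial batch
    return batches
```

Each resulting batch could then be handed to whichever client the producer uses, for example the `put_record_batch` operation of an AWS SDK.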
Step 3—Configure buffering and data processing
Amazon Data Firehose buffers incoming data before delivering it to destinations. You can configure the buffer size (in MB) and the buffer interval (in seconds).
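The interaction between the two settings can be illustrated with a small simulation in which a buffer is delivered as soon as either the size or the interval threshold is reached. This is a simplified model, not Firehose's implementation: the class name is hypothetical, timestamps are passed in explicitly, and the interval is only checked when a record arrives.

```python
from typing import Optional


class SizeOrTimeBuffer:
    """Collect records and flush when a size or age threshold is reached."""

    def __init__(self, max_bytes: int, max_seconds: float) -> None:
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self._records: list[bytes] = []
        self._bytes = 0
        self._opened_at: Optional[float] = None  # arrival time of the first buffered record

    def add(self, record: bytes, now: float) -> Optional[list[bytes]]:
        """Buffer a record arriving at time `now`; return a batch if one is flushed."""
        if self._opened_at is None:
            self._opened_at = now
        self._records.append(record)
        self._bytes += len(record)
        size_reached = self._bytes >= self.max_bytes
        interval_reached = now - self._opened_at >= self.max_seconds
        if size_reached or interval_reached:
            return self.flush()
        return None

    def flush(self) -> list[bytes]:
        """Return everything buffered so far and start a new, empty buffer."""
        batch = self._records
        self._records, self._bytes, self._opened_at = [], 0, None
        return batch
```

A larger buffer size or longer interval produces fewer, larger deliveries at the cost of higher latency, while smaller values do the opposite.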
Step 4—Choose a destination and understand data flow
Amazon Data Firehose delivers streaming data to various destinations:
- Amazon S3: data is stored in an S3 bucket, with an optional backup of the source data when transformation is enabled.
- Amazon Redshift: data is first delivered to an S3 bucket and then loaded into Redshift using the COPY command.
- Amazon OpenSearch Service: data is delivered to the OpenSearch Service domain, with an optional backup to S3.
Database migration
Database migration refers to migrating relational databases, data warehouses, NoSQL databases, and other types of data stores in database form. Migration services discover the database types and schemas and directly copy to the same infrastructure or convert to a new target engine.
The AWS Database Migration Service discovers, assesses, converts, and migrates database and analytics workloads to AWS using an automated data migration process. The service is highly available and is designed to keep downtime to a minimum during migration.
If your data migration case isn’t listed above, you can also try:
- AWS Transfer Family is a suite of secure file transfer services that support protocols such as SFTP
- AWS Storage Gateway is a suite of hybrid on-site and cloud storage solutions
- AWS Glue is a suite of services to discover, prepare, move, and integrate data from various sources
What are data migration best practices?
Some best practices in cloud data migration are given below.
Always have data backups
Always have data backups, whether you plan to move data or simply conduct day-to-day operations. Don’t delete your original data until you’re sure the cloud configuration is thoroughly tested and operates as expected, with its own backups.
Ensure all dependencies are mapped and migrated
Data often relies on other dependencies and won’t operate correctly without them. To ensure a smooth transition, map and migrate all dependencies along with the original data. User permissions and access controls should be set to the same levels as before migration and reassessed for heightened security when possible.
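Once dependencies are mapped, a topological sort gives an order in which each dataset is migrated only after everything it depends on. The following is a minimal sketch using Python's standard `graphlib` module. The dataset names in the usage example are hypothetical.

```python
from graphlib import TopologicalSorter


def migration_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return an order in which every item comes after the items it depends on.

    `depends_on` maps each item to the set of items it requires.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical example: orders need customers and products; invoices need orders.
order = migration_order({
    "orders": {"customers", "products"},
    "invoices": {"orders"},
})
print(order)
```

A circular dependency raises an error instead of producing an order, which is itself a useful signal that the mapping needs another look.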
Double-check security and compliance obligations and configurations
Before, during, and after migration, you must examine security and compliance policies and procedures to determine the right processes and controls to use in migration activities.
Include planning for decommissioning old equipment
Old hardware may still contain recoverable data, even when files and disk spaces have been deleted. To ensure the complete deletion of all data, decommission old devices securely, for example, by following the NIST 800-88 Guidelines for Media Sanitization.
How can AWS support your data migration needs?
At AWS, we’ve developed a complete suite of data migration tools and services to make importing and exporting data easy, safe, and cost-effective. Help is available at each stage of the entire data migration process. Visit AWS Cloud Migration to migrate and modernize with AWS or request a free AWS Optimization and Licensing Assessment today.