

Set Up AWS DMS for Cross-Account Migration

by Hemant Borole | in DMS, Migration

Hemant Borole is a big data consultant at Amazon Web Services.

AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS easily and securely. AWS DMS supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora.

For customers who use multiple AWS accounts, the source and target databases may exist on different accounts. For AWS DMS to work with source and target databases in different accounts, it must be able to communicate across accounts.

The following are some of the options you can use to enable cross-account AWS DMS communication:

  • Making database endpoints accessible over the Internet
  • Virtual Private Cloud (VPC) peering
  • Whitelisting the AWS DMS public IP in the security group

Exposing database endpoints over the Internet is not a secure option because it makes the databases vulnerable to security attacks.

VPC peering is a commonly adopted practice. A VPC is a virtual network, defined by a CIDR block, that is allocated to an AWS account. VPC peering allows two VPCs to connect as if they were on the same network, which requires that the two VPCs have CIDR blocks that do not overlap. Customers might also have compliance restrictions that prevent the VPCs from peering.
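
With the third option, you keep the database endpoints private and instead allow the public IP address of the AWS DMS replication instance into the security group that protects the database in the other account. As a rough sketch with the AWS CLI (the security group ID and IP address below are placeholders, and port 5439 assumes an Amazon Redshift target):

# Run in the account that owns the target database: allow the replication
# instance's public IP into the database security group (placeholder values)
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 5439 \
    --cidr 203.0.113.25/32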

Solution overview

The use case discussed in this blog post requires the customer to continuously replicate data changes from Amazon Aurora to Amazon Redshift. We use AWS DMS to achieve this replication. AWS DMS uses replication endpoints to migrate data from a source endpoint to a target endpoint. The source and target endpoints can reside in the same AWS account or in different AWS accounts. (Setting up the databases themselves is outside the scope of this post.)

AWS DMS uses a replication task to manage replication between the source and target endpoints. You can configure the replication to be a one-time migration or an ongoing replication. The task can also restrict replication to certain schemas and tables from the source database: through the task's table mapping, you configure which schemas and tables are included in or excluded from replication.

The AWS DMS task uses table mapping, a JSON definition that allows you to configure this selection. Here is an example of the mapping rules.

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-action": "include",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-name": "1"
    }
  ]
}


Migrating Oracle Database from On-Premises or Amazon EC2 Instances to Amazon Redshift

by Ballu Singh and Pubali Sen | in DMS, Migration, Redshift, Schema Conversion Tool (SCT)

Ballu Singh and Pubali Sen are solutions architects at Amazon Web Services.

AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS easily and securely. AWS DMS can migrate your data to and from most widely used commercial and open-source databases. The service supports homogeneous migrations such as Oracle to Oracle. It also supports heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora or Microsoft SQL Server to MySQL. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

Data replication with AWS Database Migration Service integrates tightly with the AWS Schema Conversion Tool (AWS SCT), simplifying heterogeneous database migration projects. Use AWS SCT for heterogeneous migrations; for homogeneous migrations, you can use the schema export tools native to the source engine.

In this blog post, we focus on migrating the data from Oracle Data Warehouse to Amazon Redshift.

In the past, AWS SCT couldn't convert custom code, such as views and functions, from Oracle Data Warehouse to a format compatible with Amazon Redshift. To migrate views and functions, you had to first convert the Oracle Data Warehouse schema to PostgreSQL and then apply a script to extract the views and functions that are compatible with Amazon Redshift.

After an update based on customer feedback, we’re happy to let you know that with AWS SCT and AWS DMS, you can now migrate Oracle Data Warehouse to Amazon Redshift along with views and functions.

The following diagram illustrates the migration process.


Events and Notifications in AWS Database Migration Service

by Eran Schitzer | in DMS

Eran Schitzer is a product manager at Amazon Web Services.

We’ve recently added a new feature in AWS Database Migration Service (AWS DMS)—the ability to receive DMS events notifications, such as email messages, text messages, or calls to HTTP endpoints, through Amazon Simple Notification Service (Amazon SNS).

You can now subscribe to and receive notifications for two types of events: events related to DMS instances and events related to replication tasks. Events related to DMS instances include availability, configuration change, creation, deletion, and maintenance events. For example, when a DMS instance goes down for maintenance, a notification is triggered.

Events related to replication tasks include start, pause, finish, Full Load completed, CDC started, and many more. For example, when a migration task finishes migrating all the data, it triggers a "Full Load completed" notification. If the task is configured to follow the full load with change data capture (CDC), that is, to replicate the changes made to the data since the full load began, a "CDC started" notification is triggered next.

In addition, AWS DMS groups events into categories that you can subscribe to using the AWS DMS console or the AWS DMS API. With a subscription, you are notified when an event occurs in the category that you subscribed to. For example, if you subscribe to the creation category for a given replication instance, you are notified whenever a creation-related event affects that replication instance, such as when the instance is being created.

The following list represents the possible categories for subscription for the DMS replication instance at this time:

  • Configuration change—a replication instance configuration is being changed
  • Creation—a replication instance is being created
  • Deletion—a replication instance is being deleted
  • Maintenance—offline maintenance of replication instance is taking place
  • Low storage—the replication instance is running low on free storage
  • Failover—failover for a Multi-AZ instance, when enabled, has begun or finished
  • Failure—the replication instance has gone into storage failure or has failed due to an incompatible network

The following list represents the possible categories for subscription for the DMS replication task at this time:

  • State change—the replication task has started or stopped
  • Creation—the replication task is being created
  • Deletion—the replication task has been deleted
  • Failure—the replication task has failed

For a list of the events and event categories provided by AWS DMS, see AWS DMS Event Categories and Event Messages in the documentation.

To subscribe to AWS DMS events, do the following:

  1. Create an Amazon SNS topic. In the topic, you specify what type of notification you want to receive and what address or number the notification will go to.
  2. Create an AWS DMS event notification subscription using the AWS Management Console, AWS CLI, or AWS DMS API. (A CLI sketch of steps 1 and 2 follows this list.)
  3. When you receive an AWS DMS approval email or SMS message at the address you submitted with your subscription, choose the link in that message to confirm your subscription.
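
For example, steps 1 and 2 might look like the following with the AWS CLI. This is only a sketch: the topic name, subscription name, email address, account ID, and instance identifier are hypothetical, and the exact event category strings are listed in the AWS DMS documentation.

# Step 1: create an SNS topic and subscribe an email address to it (hypothetical names)
aws sns create-topic --name dms-notifications
aws sns subscribe --topic-arn arn:aws:sns:us-east-1:123456789012:dms-notifications \
    --protocol email --notification-endpoint dba@example.com

# Step 2: create a DMS event subscription for one replication instance,
# limited to the failure and low storage categories
aws dms create-event-subscription \
    --subscription-name dms-instance-events \
    --sns-topic-arn arn:aws:sns:us-east-1:123456789012:dms-notifications \
    --source-type replication-instance \
    --source-ids dms-instance \
    --event-categories failure "low storage" \
    --enabled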

When you have confirmed the subscription, the status of your subscription is updated in the AWS DMS console’s Event Subscriptions section.

You then begin to receive event notifications.

For more information about DMS events and notifications, see the DMS documentation.

For more information about AWS Database Migration Service in general, see our website.

Reduce Resource Consumption by Consolidating Your Sharded System into Aurora

by Ed Murray | in Aurora, DMS

Ed Murray is a manager at Amazon Web Services.

When faced with the prospect of scaling your relational workload, you have generally had two options available: scale up or scale out. Scaling up is easy—just buy a bigger database host. Scaling out is more difficult. It requires you to shard your database into several physically independent pieces that can each run on its own host.

Despite the difficulty, it has been a recent trend for people to scale out. Availability of commodity hardware coupled with the increased need for system resources has made the effort required to shard a workload worthwhile. One drawback of a sharded system is the added administration costs: If you have four shards, you have four databases to manage. Even so, in many cases scaling out was more cost-effective than scaling up, especially with the advent of managed database services such as Amazon RDS that virtually eliminate the administrative requirements of a relational database.

But wait—what's with the word was? With the advent of Amazon Aurora, scaling up might be back on the table. Amazon Aurora is an incredibly scalable, MySQL-compatible, managed database service. Aurora offers instance sizes from 2 vCPUs and 4 GiB of memory all the way up to 32 vCPUs and 244 GiB of memory. Amazon Aurora automatically grows storage as needed, from 10 GB up to 64 TB. You can also add up to 15 low-latency read replicas across three Availability Zones to further scale read capacity. Best of all, that storage is shared with your read replicas!

You might, in fact, save bunches of money by consolidating your sharded system into a single Aurora instance or fewer shards running on Aurora. That is exactly what this blog post is all about.

Getting initial data into Aurora for sharding
Most sharded systems follow a standard pattern: Data is divided based on some key, such as customer ID, and distributed across shards by using some mapping function. There are variations—for example, some systems keep reference data in a separate system or in a single shard, and some replicate the reference data to all shards. No matter how data is divided, the complexity in sharding typically lies in the application layer, making the consolidation of shards a relatively easy exercise.

Now you're probably thinking, "Okay sure, but exactly how do I get my data into Aurora?" For discussion purposes, let's assume you are running on MySQL and have a system that consists of four clean shards. By clean, I mean no shard contains data that technically belongs in another shard. Let's also assume that one shard contains nonsharded reference data. Your sharded system looks something like the following diagram—the "map" shown represents the application mapping traffic to each individual shard.

[Diagram: four shards plus a reference-data shard, with the application shard map routing traffic]


How to Migrate Your Oracle Database to Amazon Aurora

by Ed Murray | in Aurora, DMS, Migration, Schema Conversion Tool (SCT)

Ed Murray is a manager at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to facilitate and simplify migrating your commercial database to Amazon Aurora. In this case, we focus on migrating from Oracle to the MySQL-compatible Amazon Aurora.

Changing database engines can be daunting. However, the value proposition of a highly scalable, cost-effective fully managed service such as Amazon Aurora can make the challenge worth it—especially when there are tools to simplify the process. When migrating your database from one engine to another, there are two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. Fortunately, AWS has tools to facilitate both the conversion and migration of your database.

The AWS Schema Conversion Tool helps simplify heterogeneous database migrations by automatically converting your source database schema and a majority of the custom code to a format compatible with your new target database. The custom code that the tool converts includes views, stored procedures, and functions. Any code that the tool cannot convert automatically is clearly marked so that you can convert it yourself. The AWS Database Migration Service helps you migrate your data easily and securely with minimal downtime.

Great! So where do we begin?

Working with AWS SCT
Usually, the first step in every migration is an assessment of feasibility and effort. You can use AWS SCT to generate a high-level overview of the effort required to convert your Oracle database to Aurora. The SCT runs on a number of operating systems. For the purposes of this blog, we’ll run the tool on Windows. To download a copy of the SCT, see the Installing and Updating the AWS Schema Conversion Tool documentation. To find overall documentation for SCT, start with What Is the AWS Schema Conversion Tool?

Although this blog post won’t cover the installation and configuration of the SCT, it’s important to point out that you need to install drivers for Oracle and MySQL to connect the SCT to your source and target databases. After connecting to your Oracle source database, you can right-click on any given schema and generate an assessment report. The assessment report tells you at a very high level how much of the schema can be converted automatically from Oracle to Aurora and the work that is left after conversion. You can find an example report following.

[Screenshot: example AWS SCT assessment report]


Cross-Engine Database Replication Using AWS Schema Conversion Tool and AWS Database Migration Service

by Prahlad Rao | in DMS, Migration, Schema Conversion Tool (SCT)

Prahlad Rao is a solutions architect at Amazon Web Services.

Customers often replicate databases hosted on-premises to the AWS Cloud as a means to migrate database workloads to AWS, or to enable continuous replication for database backup and disaster recovery. You can both migrate and perform continuous replication on the AWS Cloud with AWS Database Migration Service (AWS DMS).

Although moving from on-premises to the cloud using the same database engine is a common pattern, increasingly customers also use different engines between source and target when replicating. This pattern especially occurs where the source is a commercial database engine and the target is open source. Such an approach can save on licensing costs. In other use cases, the target database might be used as read-only for a reporting application that is compatible with a certain database engine, or the application team is comfortable using an open source database for their needs. Or the customer might be migrating from a commercial database to an open source database on the AWS Cloud for cost and flexibility purposes. This blog post will walk you through the process of replicating a Microsoft SQL Server database to PostgreSQL using AWS Schema Conversion Tool (AWS SCT) and AWS DMS.

AWS DMS helps you migrate databases to AWS with relative ease and security. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The service supports homogeneous migrations (like-to-like databases) and heterogeneous migrations (between database platforms such as Oracle to Amazon Aurora or SQL Server to MySQL). AWS DMS can also be used for continuous data replication with high availability.

AWS SCT makes heterogeneous database migrations easier by automatically converting source database schema. AWS SCT also converts the majority of custom code, including views and functions, to a format compatible with the target database.

Heterogeneous database replication is a two-step process:

  1. Use AWS SCT to convert the source database schema (from SQL Server) to a format compatible with the target database, in this case PostgreSQL.
  2. Replicate data between the source and target using AWS DMS. (A sketch of the endpoint definitions follows this list.)
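
After the schema conversion, the AWS DMS side of step 2 can be scripted. The following is a minimal sketch of defining the two replication endpoints with the AWS CLI; the server names, credentials, ports, and database names are placeholders, and sqlserver and postgres are the engine names AWS DMS uses for these platforms.

# Source endpoint: the SQL Server database (placeholder connection details)
aws dms create-endpoint --endpoint-identifier sqlserver-source --endpoint-type source \
    --engine-name sqlserver --server-name sqlserver.example.com --port 1433 \
    --username dms_user --password '<password>' --database-name SourceDB

# Target endpoint: the PostgreSQL database that receives the converted schema and data
aws dms create-endpoint --endpoint-identifier postgres-target --endpoint-type target \
    --engine-name postgres --server-name postgres.example.com --port 5432 \
    --username dms_user --password '<password>' --database-name targetdb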


Modifying Tasks and Table Mapping in the Console in AWS Database Migration Service

by Eran Schitzer | in DMS

Eran Schitzer is a product manager at Amazon Web Services.

We’ve recently added two features in AWS Database Migration Service (AWS DMS)—modifying tasks and table mapping through the AWS Management Console.

These two new features represent a new step in our long journey to improve and enhance our migration and replication service based on the feedback we receive from our customers.

Let’s outline some highlights of modifying tasks and table mapping through the console.

Modifying tasks
In the past, to modify a task's settings you had to use the AWS CLI, or delete and re-create the migration task in the AWS Management Console. Now you can modify most of the settings from the console while the task is in the stopped state: simply select the task and choose Modify. You can also use the AWS CLI or the AWS DMS API ModifyReplicationTask action.

This new capability gives you the flexibility to make the necessary adjustments to migration or replication projects without the need to delete and recreate a task. For example, you can now change the log level of a DMS task or change the table mapping between source and target databases.
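
As a rough illustration of the CLI path, modifying a stopped task to pick up a revised table mapping or new task settings might look like the following. The task ARN and file names are placeholders; prepare the JSON files yourself before running the command.

# Modify a stopped task: point it at updated table-mapping and task-settings files
aws dms modify-replication-task \
    --replication-task-arn arn:aws:dms:us-east-1:123456789012:task:EXAMPLETASK \
    --table-mappings file://table-mappings.json \
    --replication-task-settings file://task-settings.json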

For more information about this new feature, see the DMS documentation.

Table mapping through the console
With table mapping, you can specify the tables from a source database that you want to migrate to a target database. The simplest mapping is to migrate all the tables from a source database to a target without any modifications. You can also use table mapping to specify individual tables in the source database to migrate and the schema to use for the migration. In addition, you can use filters to specify what data from a given table column you want replicated to the target database.

Until this feature's addition, you performed table mapping by creating and modifying a JSON file. Now you can use the console to perform table mapping, including specifying table selection and transformations. In the console user interface, you use the Where section to specify the schema, table, and action (include or exclude). In addition, you can use transformations from the console to rename a schema or table, add a prefix or suffix to a schema or table, or remove a table column.

How to set up table-mapping selection rules for a table
The following procedure shows how to set up selection rules for a table called Customers in a schema called EntertainmentAgencySample. You create selection rules and transformations on the Guided tab, which only appears when you have a source endpoint that has schema and table information.

  1. Sign in to the AWS Management Console and choose AWS DMS.
  2. On the Dashboard page, choose Tasks.
  3. Choose Create Task.
  4. Type the task information, including Task name, Replication instance, Source endpoint, Target endpoint, and Migration type. Choose the Guided tab in the Table mappings section.
  5. In the Table mapping section, choose the schema name and table name. You can use “%” as a wildcard value when specifying the table name. Specify the action to be taken, to include or exclude data defined by the filter.
  6. Specify filter information using the Add column filter and the Add condition links. First, choose Add column filter to specify a column and conditions. Then, choose Add condition to add additional conditions. The following example shows a filter for the Customers table that includes AgencyID values between 01 and 85.
    [Screenshot: selection rule with a column filter on AgencyID]
  7. When you have created the selection rules you want, choose Add selection rule.
  8. After you have created at least one selection rule, you can add a transformation to the task. To do so, choose add transformation rule.
  9. Choose the target you want to transform and enter the additional information requested. The following example shows a transformation that deletes the AgencyStatus column from the Customers table.
    [Screenshot: transformation rule that removes the AgencyStatus column]
  10. Choose Add transformation rule.
  11. You can add additional selection rules or transformations by selecting add selection rule or add transformation rule. When you are finished, choose Create task. (The table-mapping JSON equivalent of these selections is sketched after this procedure.)
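
Behind the scenes, the selections in this procedure produce a table-mapping JSON document. The following is a sketch of roughly what that mapping looks like for the example above (the rule IDs are arbitrary), saved to a file so it could also be passed to a task with --table-mappings:

# Sketch: a table mapping equivalent to the console selections above
cat > table-mappings.json <<'EOF'
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "1",
      "rule-action": "include",
      "object-locator": {
        "schema-name": "EntertainmentAgencySample",
        "table-name": "Customers"
      },
      "filters": [
        {
          "filter-type": "source",
          "column-name": "AgencyID",
          "filter-conditions": [
            {
              "filter-operator": "between",
              "start-value": "01",
              "end-value": "85"
            }
          ]
        }
      ]
    },
    {
      "rule-type": "transformation",
      "rule-id": "2",
      "rule-name": "2",
      "rule-target": "column",
      "object-locator": {
        "schema-name": "EntertainmentAgencySample",
        "table-name": "Customers",
        "column-name": "AgencyStatus"
      },
      "rule-action": "remove-column"
    }
  ]
}
EOF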

For more information about table mapping using the console, see the DMS documentation.

For more information about AWS Database Migration Service in general, see our website.

Database Migration—What Do You Need to Know Before You Start?

by Ilia Gilderman | in DMS, Migration

Ilia Gilderman is a software development manager at Amazon Web Services.

Congratulations! You have convinced your boss or the CIO to move your database to the cloud. Or you are the boss, CIO, or both, and you finally decided to jump on the bandwagon. What you’re trying to do is move your application to the new environment, platform, or technology (aka application modernization), because usually people don’t move databases for fun.

Database migration is a complex, multiphase process, which usually includes assessment, database schema conversion (if you are changing engines), script conversion, data migration, functional testing, performance tuning, and many other steps.

Tools like the AWS Database Migration Service (AWS DMS) and AWS Schema Conversion Tool (AWS SCT), native engine tools, and others can help to automate some phases of the database migration process. The key to success, though, is to make your database migration project predictable. You don’t want any surprises when you are 10 terabytes deep in the data migration phase.

So you are eager to start. But do you have all the information required to finish the project successfully?

This blog post discusses common migration problems that can occur when you move your database to the cloud. In the last section, we will discuss some more detailed information related to an AWS migration.

Plan your talent
Application modernization can be challenging, because there are many moving parts and you need to understand all of them. Even if you are an expert with your source database engine, if you switch engines you need to be an expert in the target database engine also. You are going to move your data from one database to another, which will involve crossing the network, so somebody needs to know about servers, ports, and firewall rules. As previously mentioned, database migration is usually a part of a bigger application modernization project, which means the database migration (even for homogeneous migrations) will involve some changes to application code. A common scenario we’ve all heard of is a 15-year-old, 300,000 line COBOL application, developed by dozens (or even hundreds) of developers, and none of them works in the company anymore. But even if you are lucky and you have code written in a modern language by three developers still proudly working for you, somebody will need to assess what changes to the application code the database migration will require. Finally, depending on the cloud provider, you will need somebody on your team who is familiar with the particular provider ecosystem you have. To make a long story short, you will need your best people on this project.

Plan your time
Database migration projects usually include refactoring application and database code, and also the schema, which is a time-consuming, iterative process. The refactoring process can take anywhere from a few weeks to several months, depending on the complexity of your application and database. If you have hard dates (for example, if you need to vacate your data center), be realistic about what you can achieve. Even if everything goes well, data migration might be slower than anticipated. Be aware that the planning can often take more time than the actual migration!

Understand your database
Understanding your source database is crucial to the success of the migration project.

Again, we can go to the extreme and think about a 15-year-old database that can still support the first version of your product, despite the fact that support for this version was discontinued years ago. We can think about the database that has thousands of tables and dozens of schemas and still has these stored procedures which you wrote when you joined the company more than a decade ago…

But let's say you are lucky and it's not that bad. Can you answer one simple question: What is the size of the database you are trying to migrate? You might be surprised—most customers can't answer this question. The size of the database is important when determining the correct approach to use for your migration project. It also helps to estimate how long the data copy will take. What is even more important is how many schemas and tables you are going to migrate. Knowing the layout of your database helps you define the migration project and plan for parallel execution. For extra credit, you can try to find out which tables participate in which transactions. If you know this, you can migrate multiple disjoint table sets in parallel.

Another important question to ask is: How many very large tables exist in your database? “Very large” is subjective, but for migration purposes, tables larger than 200 gigabytes and with hundreds of millions of rows might end up being the long tail for your data migration.

Okay, suppose you’ve done some initial digging and now you have data about the size, schema, and tables of the database to migrate. The next question you should ask is: Do we have some “fancy” engine-specific data types in our schemas? Having those usually presents a challenge, especially if you want to switch database engines. Even if you are planning a homogenous migration, many tools won’t support all the data types involved. As usual, checking the documentation is a good start. Later, when you migrate your data, verify all your data looks correct on the other end before you shut down the power at your data center.

Now let's talk about LOBs, or large objects. If you don't have them in your database, lucky you, but keep reading to understand how lucky you are. The bad news is that migrating LOBs can be slow, and it can be slower still if you don't know the maximum size of the LOBs in your schema. Moving even a small number of them can be very painful. Another challenge you might face is having LOBs in a table without primary keys or any other unique constraints (perhaps your DBA read some cutting-edge blog 10 years ago that claimed this was a good idea). These tables will cause you some pain if you want changes to them to be part of your change data capture (CDC) stream.

Before we talk about how not to overload your source database, it's important to know that working with database roles, users, and permissions can be a huge time-waster during a migration. Most migration tools require some elevated access to the source database catalog to retrieve metadata, and to read logs and replicate changes. Knowing your database roles and permission model, or at least knowing the name of the person who knows it and can grant access, helps.

Finally, even if you get all the types, schemas, and LOBs right, it’s possible the migration will be slow. In some cases, you won’t be able to afford this slowness and it will cause your migration project to fail. A few factors can affect how long the wait will be for the last row to show up on the other end. Some of them are easy to tune and take into consideration; others can be harder.

The first is how hot or busy your source database is. This is important to understand because most migration tools will put some additional load on your source database. If your traffic is very high, it might be unrealistic to plan a live migration and you should consider alternatives. Usually, compacted databases migrate faster and with fewer issues, so you should find out when your database was last compacted. If the answer is never, think about doing it.

Phew—there are a lot of questions to answer, and a lot of people to work with to get the answers. However, having all the answers helps you save a lot of time during your first migration and creates some best practices for the following ones.

Understand your network
The sad truth is that a database migration requires extensive network knowledge from you or your co-workers. It becomes worse if you are migrating a database that is part of your corporate fleet. Most likely, network access to these databases is managed by several teams (security, networking, and so on). You might need to talk to all of them to safely open a port in the firewall or change routing so the replication server can access the database. You’ll also need information about the network setup your organization has with its cloud provider.

Finally, there is this question: Does the connection between your source and target database have enough bandwidth for the migration? Don't forget, your organization can have an awesome 10 Gbps direct pipe to your cloud provider, but that doesn't mean it's all yours. Make sure that when you start to move data you know exactly how much of this pipe the migration is going to use, and that you are not taking down mission-critical workloads by kicking the tires migrating your test database.

Understand your requirements
I know you want to see that giant server that runs your database or databases go away for good, and sooner rather than later. But do you know what you are replacing it with? Do you know all the key parameters of your new environment? Some of these are new decisions to make; others you inherit from your current environment, or your app dictates them. The simplest questions are: Can your app afford downtime, and how long can this downtime be? In most fleets, some apps can take 20 to 30 minutes of downtime – some can even have a maintenance window of a few hours. Hint: It is better to have these apps as your first project. You won't be dealing with CDC (change data capture), and you can use dump and restore or other native tools to migrate your database. Another important question that will affect your approach is this: Do you need your source database to remain available after the migration is done? If so, how do you switch back if you need to?

Even if you are doing a lift-and-shift migration, you have decisions to make about your new platform. You have even more decisions if you are switching engines. For instance, do you know why you chose one engine over another? I hope that it’s not because your lead developer told you, “Dude, this is cutting-edge, everybody is using it” and that you had a more structured conversation about why this specific engine is best for your app, your team, and your timeline.

Another important decision to make is whether all the data needs to migrate. For example, you might have an audit log table that is 97 percent of your database. Yes, you need the logs for compliance reasons, but could they be archived later instead of being part of the live migration? Another example: Do you have some dead tables in your database that supported a feature not offered on the new platform?

Even if all the data needs to move, this is a perfect time to examine alternatives for some of your data. For example, can you make use of NoSQL stores or data warehouses on the other end? In general, it’s always good to keep it simple—that is, move the data first, then transform, restructure, or rearchitect it. That said, it might be that refactoring your database during the migration is the only opportunity to do so. In any case, it doesn’t hurt to plan and analyze your data model, even if you are going to change it after migration is complete.

Finally, after all your preparations, thousands of things can go wrong during and after the migration. Having a contingency plan is always a good idea. Can you flip back? Can you run your app on the old and the new database simultaneously? Can you afford to have some data unavailable for a period of time? Answers to these and some other questions help you shape a better contingency plan.

Create your target schema
There are also questions about the target database schema. Even for homogeneous migrations, it’s better to apply the schema in parts. You need to have your tables and primary keys before you can move data. But for performance and other reasons, you might want to create secondary indexes and apply constraints later. You will also likely want to keep your triggers disabled for the initial load of the data. The point is that somebody needs to prepare your schema to be applied in parts. If you are switching engines, the AWS Schema Conversion Tool mentioned earlier can help with some of this. Also, unless there is a pressing need, it’s better to postpone all schema transformations until the end of the migration.

Migrate to AWS
Migration to every cloud has its own specific steps and requires expertise with the specific cloud ecosystem. You need to understand details of networking, permissions, roles, accounts, and so on. Not only do you need to know how the cloud works in general, but you also have to be very familiar with your specific corporate cloud implementation. This section of the post will cover some of the specifics of migrating to AWS and what help is available, and help you ask the right questions about the benefits and limitations of your new environment.

As I said before, AWS has created a number of tools and support materials to help you to be successful with your migration projects. The first and very important tool is the documentation. Please, please read the documentation for DMS and the documentation for SCT. Of course, you know how to set up supplemental logging and you remember most of the Oracle data types and how they map to the rest of the database engines. But please spend some time reading the documentation. It will save you a lot of time and frustration during the migration. The DMS and SCT forums are another place to find useful information before you start the project.

For many use cases, using Amazon RDS for your target database is a natural choice; you get the benefits of a managed service. Make sure you are familiar with those—things like backups, OS and database patches, security, high availability, scalability, elasticity, integration with the AWS ecosystem, and so on. Using RDS frees you from spending time on these activities.

But there are also limitations. Make sure you are familiar with those—things like storage limits, lack of admin privileges on your databases, and so on. You need to evaluate carefully whether you really do require access to your database host. Of course, your DBA will tell you that the world is going to end the moment SSH access is lost to the database box, but will it really? Some apps need this access, but in most cases the sun will shine the day after your DBA loses database admin privileges, as it does for other customers.

Amazon RDS also offers a variety of managed high availability (HA) features. This is a good point to review your HA requirements. Yes, your current setup includes a few MySQL slaves, but why do you have them? Maybe you can do better, and more importantly with less pain; or maybe this is why your database guy wanted host access? It's important to understand that some of these decisions can be made after the migration. For example, if you are migrating to a MySQL database, it is perfectly fine to migrate to Amazon RDS Single-AZ MySQL (it's actually faster) and then convert it to an Amazon RDS Multi-AZ instance, or go further and use Amazon Aurora.

Finally, AWS offers a wide list of migration partners that can help you to make an assessment, scope the work, and even execute your migration projects.

Summary
Database migration projects can be hard, especially your first one, but the benefits of migrating your database to the cloud are significantly greater than the challenges migrations can present. You can make these challenges predictable and less painful through diligent preparation and collecting the necessary information before you start. Thousands of AWS customers have already migrated with or without our help. Some of them took some shortcuts and hit every possible problem during their first migration; others spent some time preparing and flew through their migrations with zero pain.

The checklist below will help you to ask the right questions before you start.

Thanks for reading! Have a successful database migration!

Database migration checklist
1. What is the size of your database?
2. How many schemas and tables do you have?
3. How many really big tables do you have (larger than 200 gigabytes or more than 200 million rows)?
4. What do the transaction boundaries look like?
5. Do you have engine-specific data types that won't be migrated by your migration tool?
6. Do you have LOBs in your tables, and how large are they?
7. Do all your tables with LOBs have primary keys?
8. How hot (busy) is your source database?
9. What kind of users, roles, and permissions do you have on the source database?
10. When was the last time you vacuumed, or compacted, your database?
11. How can your database be accessed (firewalls, tunnels, VPNs)?
12. Do you know what VPC you want to use in AWS?
13. Do you know what VPC security group you can use?
14. Do you have enough bandwidth to move all your data?
15. Can you afford downtime? How much?
16. Do you need the source database to stay alive after the migration? For how long?
17. Do you know why you preferred one target database engine over another?
18. What are your high availability (HA) requirements?
19. Does all the data need to move?
20. Does it need to move to the same place?
21. Do you understand the benefits offered by Amazon RDS?
22. Do you understand any Amazon RDS limitations which might affect you?
23. What happens to your application after the migration?
24. What is your contingency plan if things go wrong?

How to Script a Database Migration

by Satheesh Subramanian | in DMS, Migration

Satheesh Subramanian is a software development engineer at Amazon Web Services.

AWS Database Migration Service (DMS) lets you migrate databases to AWS easily and securely. The process consists of setting up a replication instance, source and target endpoints, and a replication task. The replication task runs on the replication instance and migrates data from the source endpoint to the target endpoint.

You can use the AWS DMS console, the AWS CLI, or the AWS SDK to perform the database migration. In this blog post, I will focus on performing the migration with the AWS CLI.

If you are a first-time AWS DMS user, then follow these steps to set up your account.

After you have completed the setup, follow these steps to complete a simple database migration.

  1. Create a replication instance.
    Use the following command to create a replication instance with the name dms-instance.
    aws dms create-replication-instance --replication-instance-identifier dms-instance --replication-instance-class dms.t2.medium --allocated-storage 50
    This command creates the replication instance on a t2.medium instance with 50 GB of allotted storage. It uses default values for other parameters. For more configuration options to use when creating a replication instance, see create-replication-instance in the AWS CLI Command Reference.
  2. Describe the replication instance.
    Run the following command to describe the replication instance. The response includes the status of the replication instance, which helps you track the progress of instance creation.
    aws dms describe-replication-instances --filters Name=replication-instance-id,Values=dms-instance
    Run the following command to save the ReplicationInstanceArn for use in later steps.
    rep_instance_arn=$(aws dms describe-replication-instances --filters Name=replication-instance-id,Values=dms-instance --query 'ReplicationInstances[0].ReplicationInstanceArn' --output text)
    It will take a few minutes to create the replication instance. While that’s in progress, create the source and target endpoint objects.
  3. Create the source and target endpoints.
    Run the following commands to create the source and target endpoints. Provide the source and target database details, such as the engine name, server name, port, user name, and password (placeholders shown in angle brackets).
    aws dms create-endpoint --endpoint-identifier source-endpoint --endpoint-type source --engine-name <engine> --username <user> --password <password> --server-name <host> --port <port>
    aws dms create-endpoint --endpoint-identifier target-endpoint --endpoint-type target --engine-name <engine> --username <user> --password <password> --server-name <host> --port <port>
    For more advanced options for endpoint creation, see create-endpoint in the AWS CLI Command Reference.
    Run the following commands to save the endpoint ARNs for use in later steps.
    source_endpoint_arn=$(aws dms describe-endpoints --filters Name=endpoint-id,Values=source-endpoint --query 'Endpoints[0].EndpointArn' --output text)
    target_endpoint_arn=$(aws dms describe-endpoints --filters Name=endpoint-id,Values=target-endpoint --query 'Endpoints[0].EndpointArn' --output text)
  4. Test source and target endpoints from the replication instance.
    After the replication instance is active and the endpoints have been successfully created, test connectivity from the replication instance to these endpoints. The following commands will invoke connectivity tests from the replication instance to the database endpoints:
    aws dms test-connection --replication-instance-arn $rep_instance_arn --endpoint-arn $source_endpoint_arn
    aws dms test-connection --replication-instance-arn $rep_instance_arn --endpoint-arn $target_endpoint_arn
  5. Describe connections to the source and target endpoints.
    After invoking the connectivity tests, describe the connections to make sure the tests succeeded. If a test connection fails for any reason, fix the issue and retest.
    aws dms describe-connections --filters "Name=endpoint-arn,Values=$source_endpoint_arn,$target_endpoint_arn"
    Note: For a failed connection, the describe-connections response includes a failure message with the details.
  6. Create a replication task.
    If the test connections are successful, use the following command to create the task:
    aws dms create-replication-task --replication-task-identifier replication-task-1 --source-endpoint-arn $source_endpoint_arn --target-endpoint-arn $target_endpoint_arn --replication-instance-arn $rep_instance_arn --migration-type <full-load|cdc|full-load-and-cdc> --table-mappings file:///tmp/table-mappings --replication-task-settings file:///tmp/task-settings
    This command assumes that you have table mappings and task settings files in the /tmp directory (on Linux or macOS). For information about how to specify the file input for various platforms, see Loading Parameters from a File in the AWS Command Line Interface User Guide. To decide which values to use for the task settings, table mappings, and migration type, see the task settings and table mapping topics in the AWS DMS documentation.

    For more advanced options for replication task creation, see create-replication-task in the AWS CLI Command Reference.

  7. Describe the replication task.
    Task creation will take a few minutes. After the task is created, describe the task and make sure it is ready to be executed.
    aws dms describe-replication-tasks --filters "Name=replication-task-id,Values=replication-task-1"
    Run the following command to save the replication task ARN for use in later steps:
    replication_task_arn=$(aws dms describe-replication-tasks --filters "Name=replication-task-id,Values=replication-task-1" --query 'ReplicationTasks[0].ReplicationTaskArn' --output text)
    Run the following command if you want to just retrieve the status of the task:
    aws dms describe-replication-tasks --filters "Name=replication-task-arn,Values=$replication_task_arn" --query "ReplicationTasks[0].Status"
  8. Start the replication task.
    Run the following command to start the task after it is ready to be executed:
    aws dms start-replication-task --replication-task-arn $replication_task_arn --start-replication-task-type start-replication
    For all available options for the start-replication-task command, see start-replication-task in the AWS CLI Command Reference.
  9. Monitor the progress of the replication task.
    After you start the task, it’s very important to monitor its progress. Run the following commands to keep track of the task progress.
    To monitor the overall task-level statistics, run the following command:
    aws dms describe-replication-tasks --filters "Name=replication-task-arn,Values=$replication_task_arn" --query "ReplicationTasks[0].ReplicationTaskStats"
    To monitor the table-level statistics, run the following command:
    aws dms describe-table-statistics --replication-task-arn $replication_task_arn
    To monitor the task status itself, run the following command:
    aws dms describe-replication-tasks --filters "Name=replication-task-arn,Values=$replication_task_arn" --query "ReplicationTasks[0].{Status:Status,StopReason:StopReason}"
  10. Stop the replication task.
    You can stop the migration after data is completely migrated from source to target. Run the following command to stop the migration task:
    aws dms stop-replication-task --replication-task-arn $replication_task_arn
  11. Delete the replication task.
    If you don’t want to keep the task, run the following command to delete it:
    aws dms delete-replication-task --replication-task-arn $replication_task_arn
  12. Delete the source and target endpoints.
    If you no longer need the endpoints, run the following commands to delete them:
    aws dms delete-endpoint --endpoint-arn $source_endpoint_arn
    aws dms delete-endpoint --endpoint-arn $target_endpoint_arn
  13. Delete the replication instance.
    After your migration is complete, run the following command to delete the replication instance:
    aws dms delete-replication-instance --replication-instance-arn $rep_instance_arn

This was a simple migration. For more advanced database migrations, see the AWS Database Migration Service User Guide.

For migration best practices, see AWS Database Migration Service Best Practices.

And in case something goes wrong, see Troubleshooting AWS Database Migration Service Tasks.

Sign in to the AWS DMS console to easily create and manage database migration tasks here: https://console.aws.amazon.com/dms/home

Migrate PostgreSQL Databases and Perform Ongoing Replication with the AWS Database Migration Service

by Jeff Levine | in DMS, Migration, RDS PostgreSQL

Jeff Levine is a solutions architect for Amazon Web Services.

As a solutions architect for Amazon Web Services, I help our customers migrate workloads to the AWS Cloud, a key part of which includes their databases. The AWS Database Migration Service (AWS DMS) enables organizations to migrate data to and from a variety of databases located in Amazon Relational Database Service (Amazon RDS), running on Amazon EC2, or running on-premises. The service supports homogeneous migrations such as Oracle to Oracle, and also heterogeneous migrations such as Oracle to Amazon Aurora. You can also customize table mappings and perform translations. In this post, I talk about two things: first, using an Amazon RDS PostgreSQL source database with DMS, and second, the recently announced continuous data replication feature, which keeps the target database up to date after the initial migration.

Using an RDS PostgreSQL Source
You might think that migration means just moving data from one location to another, but the process involves far more. The process must also keep up with transactions that take place during the migration, so that the database is left in a consistent state upon completion. Accordingly, when we look at migrating from an Amazon RDS PostgreSQL source database, we need to take into account both how data is moved and also environmental factors that can affect the ongoing processing of transactions while the migration is happening.

Let’s take the case of a PostgreSQL to MySQL migration. I’ll begin with a simple table definition.

CREATE TABLE subscribers (
  subscriber_id SERIAL NOT NULL PRIMARY KEY,
  subscriber_info json NOT NULL
);

I’m going to use pgAdmin, a popular open source administration tool for PostgreSQL, to create the table and display the specification in the source database.

After I run the DMS migration task, here’s how the newly created subscribers table appears in the target database using the program DbVisualizer.

Notice how the serial column is now an integer and the subscriber_info column has been converted to longtext. To understand what happened, we need to consider how DMS operates. DMS functions as an intermediate broker to a variety of different database engines. To make this conversion happen, DMS first maps the source data type (JSON in this case) to an intermediate DMS type and then maps that type to the target data type (longtext). It’s very important to take into account these mappings when designing your migration strategy. Note that I used the default mapping. DMS also allows you to customize the mappings if you want.

We also need to look at source-related performance factors such as replication activity and the use of write-ahead logging (WAL) that can affect the migration. Using Amazon RDS, you can change database parameters using parameter groups. One adjustment we suggest is changing the wal_sender_timeout parameter to 0. To do this, create a parameter group for your RDS PostgreSQL database and adjust the parameter accordingly. Because this is a dynamic parameter, the adjustment will take effect immediately. You can read about all the nuances of using PostgreSQL DMS sources in the RDS documentation.
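
As a sketch, the same adjustment can be scripted with the AWS CLI. The parameter group name is a placeholder, and the parameter group family should match your engine version (postgres9.6 is only an example):

# Create a custom parameter group (if you don't already have one) and set
# wal_sender_timeout to 0; the parameter is dynamic, so it applies immediately
aws rds create-db-parameter-group \
    --db-parameter-group-name my-postgres-dms-params \
    --db-parameter-group-family postgres9.6 \
    --description "Parameters for a DMS PostgreSQL source"

aws rds modify-db-parameter-group \
    --db-parameter-group-name my-postgres-dms-params \
    --parameters "ParameterName=wal_sender_timeout,ParameterValue=0,ApplyMethod=immediate"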

Using Continuous Data Replication
Additionally, AWS recently announced continuous data replication for DMS, which enables you to keep your database up to date after the initial migration. With this feature, you can now have high availability during the replication process by specifying Multi-AZ when you create your replication instances.

When Multi-AZ is selected, two replication hosts will be created, each of which has its own IP address. Remember to adjust the underlying source and target security groups (or their on-premises equivalents) to grant both instances the appropriate level of access.
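
If you are scripting the setup, requesting a Multi-AZ replication instance is a single flag on creation. The identifier, instance class, and storage size below are placeholders:

# Create a Multi-AZ replication instance for continuous replication (placeholder values)
aws dms create-replication-instance \
    --replication-instance-identifier dms-replication \
    --replication-instance-class dms.t2.medium \
    --allocated-storage 50 \
    --multi-az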

The AWS Database Migration Service offers a powerful set of migration capabilities. It is important to consider how the service works behind the scenes as part of the development of an overall migration plan. With continuous data replication, you can now keep your databases up to date after the initial load.  It’s easy to get started. Follow our Getting Started Guide to migrate your database using AWS Database Migration Service with just a few clicks.