Creating Database Migration Waves using AWS Schema Conversion Tool
Enterprises are eager to migrate from commercial relational databases (such as Microsoft SQL Server or Oracle) to open-source relational databases, such as MySQL or PostgreSQL, to save costs, improve data portability, and take advantage of cloud-managed databases. They often find that moving relational data between tables is relatively easy. However, migrating stored procedures, indexes, views, and functions can be complex.
The AWS Schema Conversion Tool (AWS SCT) identifies the database attributes that must be changed, assesses the complexity of the required changes, and provides migration recommendations. Oftentimes, the administrator selects all of the database objects when building the AWS SCT report, which captures the complexity of migrating the entire database at once. That view can overwhelm the database architect, who may have constrained development resources or face significant migration costs. Prioritizing migration activities empowers your team to focus on top priorities, with clear incremental steps toward completing your database migration.
The first objective is to identify the target database that best serves the business needs of the enterprise. With the target database identified, the customer can run the AWS SCT process. The AWS SCT output provides a comprehensive list of incompatibilities between the two databases, the recommended method of reconciling each incompatibility, and the difficulty of implementing the proposed method. An AWS database specialist can then define a timeline for migrating the data from the source database to the target database.
We’re going to discuss how AWS SCT helps you plan tasks and timelines in waves, set migration priorities, and plan execution. Then we’ll discuss how you can scope the database attributes to specific object sets using AWS SCT.
Database wave planning
Databases often serve multiple applications due to data-centric critical mass. Once a database is initialized and populated with some data, it becomes the reference point for future projects. These projects often have discrete data sets leading to unique tables, stored procedures, and indexes. When AWS SCT parses the source database, it has no knowledge of the applications and their correlation to tables, stored procedures, and indexes. Instead, it merely presents the composite set of attributes that must be converted.
The enterprise database architect can define the correlation of the tables, stored procedures, and indexes with specific applications. The AWS database specialist can then organize the AWS SCT input selections according to those application dependencies, so that each SCT output represents a subset of migration activities specific to one application. This process is repeated for each application. Although some tables, indexes, and stored procedures may overlap between applications, the majority of the attributes can be migrated independently. The goal is to identify the applications with the highest value, lowest risk, and least complex migration pattern according to the different views of the AWS SCT output. Applying these application-specific AWS SCT output views enables the enterprise architect to build waves of table migration, along with the associated stored procedures and indexes. Although the total time required to complete the entire migration may not be compressed, the business value of the migration can be accelerated by strategically picking the most relevant wave of data.
The selective AWS SCT input yields a report that enables the team to decide which set of the database applications should be prioritized for migration. The following table provides a simple framework for decision making:
Migration execution based on chosen priority
Once the prioritization methods have been decided, it’s time to start the hands-on work. The manual schema conversions and incompatible types require developers and database specialists to translate those attributes into the language of the new database engine. The AWS SCT gives the database migration specialist the specifics regarding what must be converted, and a recommendation on how that conversion should take place.
Using AWS SCT to selectively limit object sets
Leveraging naming conventions to scope the analysis
AWS SCT provides the ability to limit which objects you would like to assess, convert, or migrate through the use of a database tree view.
One of the first methods of defining the segmentation is by interpreting the naming conventions and patterns used to establish a degree of isolation of database objects. A database that serves multiple applications may leverage a schema design pattern that establishes unique schemas for each application. For example, an application that performs employee management may reference a schema with a prefix such as EMP. Another application dedicated to security badges may reference tables called BADGEINFO and BADGEUSE, where the naming convention for badge application tables uses BADGE in the table name.
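As a minimal sketch of this approach on a DB2 LUW source (the BADGE naming pattern is taken from the example above; adjust the predicate to your own conventions), the system catalog can be queried for matching objects:

```sql
-- Find base tables whose names follow the badge application's naming
-- convention, using the DB2 LUW system catalog view SYSCAT.TABLES.
SELECT TABSCHEMA, TABNAME
FROM SYSCAT.TABLES
WHERE TYPE = 'T'                -- 'T' = base tables only
  AND TABNAME LIKE 'BADGE%'     -- naming convention for the badge application
ORDER BY TABSCHEMA, TABNAME;
```

The resulting list can then be matched against the corresponding check boxes in the AWS SCT source tree view.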
The diagram shows how multiple schemas may be selected in the AWS SCT tree view from the source database.
When naming conventions aren’t enough
It’s entirely possible that through years of development, staff changes, or business needs, the delineation of database objects through the use of dedicated schemas or unique object identifiers wasn’t utilized consistently.
If the database was defined to leverage foreign key/parent-child relationships, then this relationship pattern can be leveraged to provide insight into object dependencies. The following example uses DB2 LUW system tables to identify the relationship tree:
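A query in this spirit can be built with a recursive common table expression over the DB2 LUW SYSCAT.REFERENCES catalog view (a sketch; the depth cap is an arbitrary cycle guard):

```sql
-- Build a referential integrity tree from DB2 LUW foreign-key metadata.
-- SYSCAT.REFERENCES lists child tables (TABNAME) and the parent tables
-- they reference (REFTABNAME).
WITH fk_tree (parent_table, child_table, depth) AS (
    -- Anchor: top-level parents (referenced, but referencing nothing themselves)
    SELECT r.REFTABNAME, r.TABNAME, 1
    FROM SYSCAT.REFERENCES r
    WHERE r.REFTABNAME NOT IN (SELECT TABNAME FROM SYSCAT.REFERENCES)
    UNION ALL
    -- Recursive step: descend to children of children
    SELECT r.REFTABNAME, r.TABNAME, t.depth + 1
    FROM SYSCAT.REFERENCES r
    JOIN fk_tree t ON r.REFTABNAME = t.child_table
    WHERE t.depth < 10          -- arbitrary cap to guard against cycles
)
SELECT parent_table, child_table, depth
FROM fk_tree
ORDER BY depth, parent_table, child_table;
```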
By adjusting the query to focus on the DB2INST2 schema for this database, the following referential integrity tree is revealed:
Furthermore, by using this list of objects, we can isolate which objects AWS SCT will assess, convert, or provide migration options for.
The above query is provided as an example for DB2, while the corresponding methods for DB2, SQL Server, and Oracle are represented as follows:
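The exact catalog objects differ by engine. As hedged sketches (the schema filter is a placeholder), the SQL Server and Oracle equivalents of the foreign-key lookup look roughly like:

```sql
-- SQL Server: child/parent table pairs from the sys.foreign_keys catalog view
SELECT OBJECT_SCHEMA_NAME(fk.parent_object_id)     AS child_schema,
       OBJECT_NAME(fk.parent_object_id)            AS child_table,
       OBJECT_SCHEMA_NAME(fk.referenced_object_id) AS parent_schema,
       OBJECT_NAME(fk.referenced_object_id)        AS parent_table
FROM sys.foreign_keys AS fk;

-- Oracle: child/parent table pairs from the ALL_CONSTRAINTS dictionary view
-- (constraint_type 'R' marks referential constraints; 'MYSCHEMA' is a placeholder)
SELECT c.table_name AS child_table,
       p.table_name AS parent_table
FROM all_constraints c
JOIN all_constraints p
  ON c.r_constraint_name = p.constraint_name
 AND c.r_owner = p.owner
WHERE c.constraint_type = 'R'
  AND c.owner = 'MYSCHEMA';
```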
But wait! Not all dependencies are table-based!
Identifying dependencies with objects, such as views, stored procedures, and functions, is also achievable. The following query helps to expand the dependency review to include objects that aren’t based on tables:
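On DB2 LUW, one way to sketch such a query is with the SYSCAT.TABDEP catalog view, which records objects (views, materialized query/summary tables, and so on) that depend on a given base table; routine dependencies are tracked separately in SYSCAT.ROUTINEDEP. The EMPLOYEE table and DB2INST2 schema are the running examples from above:

```sql
-- List non-table objects that depend on the EMPLOYEE base table.
-- DTYPE indicates the dependent object's type, e.g. 'V' = view,
-- 'S' = materialized query (summary) table.
SELECT TABSCHEMA, TABNAME, DTYPE
FROM SYSCAT.TABDEP
WHERE BSCHEMA = 'DB2INST2'      -- schema of the base table
  AND BNAME   = 'EMPLOYEE'      -- top-level parent identified earlier
ORDER BY DTYPE, TABSCHEMA, TABNAME;
```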
For the query above, the table specified as EMPLOYEE was found to be a top-level parent in the prior outputs. Executing this query shows the following:
The output shows that EMPLOYEE has 14 dependencies ranging from database views to database summary tables.
Using the above insights in conjunction with the AWS SCT source tree view, it’s possible to isolate database objects that are tied to a specific application or database dependency. Migrating just these objects instead of the entire database can simplify the move to AWS by breaking up large databases into functional components.
Dealing with shared data
Another situation may arise where a table object is shared across multiple business applications. For example, the aforementioned EMPLOYEE table may be used both by HR-related applications and by a BADGE application. The analysis thus far has shown that moving the BADGE application to AWS will be immediately beneficial.
For situations like this, establishing a data synchronization strategy may become necessary. AWS Database Migration Service (AWS DMS), foreign data wrappers, or vendor-provided data replication are potential options.
The goal of this post was to demonstrate a pragmatic method of creating waves of database migration based on sets of database attributes associated with applications. Although AWS SCT is often used to provide a composite representation of the work required to migrate an entire database, segmenting the work into waves enables teams to prioritize the migration based on difficulty, cost, or business impact. We’ve demonstrated how AWS SCT can be used to select database attributes along with their dependencies, and recommended coordinating the waves using AWS DMS to execute the transition.