AWS Partner Network (APN) Blog

Migrate and Modernize Db2 Databases to Amazon EKS Using IBM’s Click to Containerize Tool

By George Baklarz, Principal Solution Architect – IBM
By Phil Downey, Principal Data Solutions Architect – IBM
By Mark Brown, Sr. Partner Solutions Architect – AWS

Many IBM Db2 customers are exploring modernizing their databases by moving them into a containerized environment such as Red Hat OpenShift, Amazon Elastic Kubernetes Service (Amazon EKS), or IBM’s Cloud Pak for Data.

What usually makes a database administrator (DBA) nervous is the complexity involved in moving a database into this environment. Major steps include:

  • Upgrading the Db2 database to version 11.5.x since that is the release level of the Db2U OpenShift operator
  • Learning how to work in the containerization environment
  • Rebuilding the database with export and import commands

Getting to the end goal of containerizing your database takes a lot of time and effort. To streamline this process, IBM has developed a new utility called Db2 Click to Containerize (also known as Db2 Shift) to help customers move their databases quickly and easily from an on-premises Db2 instance running on Linux to OpenShift, Amazon EKS, Red Hat OpenShift on AWS (ROSA), and IBM’s Cloud Pak for Data products on AWS.

Some benefits of Db2 Shift are:

  • Automated, fast, and secure movement of Linux databases to a modernized cloud environment
  • Reduction in time to containerize database workloads
  • Ability to move a database without the need to unload, export, decrypt, or back up the database
  • Automatic upgrades from Db2 version 10.5 and 11.1 to the latest version (11.5.7)
  • Shifting of all database settings and objects, including external functions located in the Db2 library path
  • Row, columnar, and encrypted databases can be moved
  • OLTP, SMP, and MPP databases can be moved (excluding pureScale installations at this time)
  • Easy setup of HADR servers for staged migration

In addition to directly shifting the database from one location to another, the Db2 Shift program provides the ability to clone a database for future deployment. This is useful for environments where the target server is air-gapped, or unavailable for direct connection from the source server.

Finally, the Db2 Shift program has two modes of operation. For expert users, the Db2 Shift command can be issued with the appropriate options and run directly from a command line or script. For users requiring more help, the program can be run in an interactive mode, with detailed instructions and help for the various shift scenarios.

Database Requirements

A Db2 database can be moved if the following conditions are met:

  • Database resides on a Linux server (X64 or Power Linux LE)
  • Database was created with automatic storage
  • Database is an OLTP, SMP, or MPP system
  • Row or column mode storage, including encrypted databases
  • Mirror, archive, and overflow logs use disk only
  • User-defined functions/procedures located in the SQL library directory
  • Db2 version 10.5, 11.1, or 11.5 servers can be moved and upgraded at the same time

The following features are not currently supported:

  • pureScale feature (not available yet in the Db2U container)
  • Text extender

The following configurations cannot be shifted as-is or need manual steps:

  • Databases that were not created with automatic storage are not supported
  • External procedures that are not in the standard Db2 library are not moved; these will need to be manually recreated and cataloged
  • The LOGARCHMETH1/2 settings support only DISK as a target in Db2U
  • Database encryption keys will be moved to the new location, but if the target already has encrypted databases then you need to manually migrate the encryption key to the target location
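
The requirements and restrictions above can be read as a pre-flight checklist. The Python sketch below is illustrative only: the `SourceDatabase` fields and the `shift_blockers` function are hypothetical names invented for this example and are not part of Db2 Shift, which runs its own checks when it starts.

```python
from dataclasses import dataclass

# Hypothetical description of a source database. The field names are
# assumptions made for this sketch, not Db2 Shift options.
@dataclass
class SourceDatabase:
    platform: str               # e.g. "linux-x64" or "linux-ppc64le"
    db2_version: str            # e.g. "11.1"
    automatic_storage: bool
    uses_purescale: bool
    log_archive_targets: tuple  # e.g. ("DISK",) or ("DISK", "TSM")

SUPPORTED_PLATFORMS = {"linux-x64", "linux-ppc64le"}
SUPPORTED_VERSIONS = {"10.5", "11.1", "11.5"}

def shift_blockers(db: SourceDatabase) -> list:
    """Return the listed requirements that the database does not meet."""
    problems = []
    if db.platform not in SUPPORTED_PLATFORMS:
        problems.append("source must be Linux on x64 or Power LE")
    if db.db2_version not in SUPPORTED_VERSIONS:
        problems.append("source must be Db2 10.5, 11.1, or 11.5")
    if not db.automatic_storage:
        problems.append("database must use automatic storage")
    if db.uses_purescale:
        problems.append("pureScale is not supported")
    if any(target.upper() != "DISK" for target in db.log_archive_targets):
        problems.append("log archiving must target disk only")
    return problems

candidate = SourceDatabase("linux-x64", "11.1", True, False, ("DISK",))
print(shift_blockers(candidate))  # an empty list means no blockers were found
```

An empty result only means that none of the conditions listed in this section were violated; it does not replace the analysis that Db2 Shift itself performs.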

Program Overview

The Db2 Shift program is a single Linux executable (about 8 MB in size) that can be installed in any directory. The program is self-contained and does not require any additional libraries to run. It can be removed by simply deleting the file.

The following Linux environments have been tested as source systems:

  • Linux x64: CentOS (6, 7, 8), CentOS Stream, Red Hat (6, 7, 8), Ubuntu 18.04 and 20.04, SUSE 11.4
  • Power Linux LE RHEL 8 source and target

The program provides options to perform the following operations:

  • Shift a Db2 database to OpenShift, Amazon EKS, or CP4D
  • Shift a Db2 database to another Db2 instance
  • Create a cloned copy of the Db2 database for later deployment
  • Deploy a clone into an OpenShift, Amazon EKS, or CP4D container
  • Deploy a clone into another Db2 instance
  • Initialize HADR between source and target pod
  • Initialize HADR between source and target instance
  • Initialize DMC and LDAP authentication for pod
  • Copy cloned databases to a pod

Shifting a database from one environment to another requires connectivity between the servers. Db2 Shift moves data over one of three connection types: a connection to an OpenShift, Amazon EKS, or CP4D cluster; a passwordless Secure Shell (SSH) connection; or a local connection.

The Db2 Shift operation must run under the user ID of the instance owner of the database being shifted. For a Db2-to-Db2 instance shift, the instance owner must also have passwordless SSH connectivity to the target system.

To access the target pod in a cluster, the user must have authenticated to OpenShift or Kubernetes and have access to the pod that Db2 is running on.

Db2 Shift User Interface

Db2 Shift can be run either as a traditional command line utility, or as an application with a menu system. The Db2 Shift menu system provides an easy-to-use interface for generating the appropriate shift commands and includes extensive help on the various parameters that need to be supplied.

The Db2 Shift user interface (UI) is a character-based terminal display, similar to VT100 or 3270 technologies. Using a character-based display format eliminates the need for a graphical display environment (such as GDM) and significantly reduces the memory requirements for the program.

When using a terminal-based UI, only keyboard entries can be used to navigate the screen (instead of mouse movements).

Main Screen

The main Db2 Shift screen provides access to all scenarios mentioned in the previous section.

Figure 1 – Main screen with major option categories.

The Db2 Shift Help provides a general overview of the Db2 Shift program and details on every Db2 Shift scenario. The keyboard help provides a quick guide on how the keys work, and the syntax details provide information on every setting that Db2 Shift uses.

Help Panels

Help screens are available throughout the program to help guide you through the shift scenarios. On most panels, pressing ^F (CTRL+F) on a field will display detailed information about that field.

Shift to Db2U on OpenShift or EKS

The following screen is used to shift a Db2 database to a Db2U pod that is running on OpenShift, Amazon EKS, or IBM Cloud Pak for Data.

Figure 2 – Specification screen for shift to OpenShift or EKS.

When the user has entered all of the required information, pressing the review key will display the command line version of Db2 Shift that you could use instead.

Summary Screen

Once the user hits the execute key on the summary screen, the Db2 Shift utility will begin the process of shifting your database.

Figure 3 – Summary of specifications for the shift.

Db2 Shift Execution

Various messages will be displayed during the execution of the Db2 Shift command with progress bars indicating the current step in the process. When the execution completes, the UI will be displayed with a success indicator and the contents of the log file.

Figure 4 – Shift results shown with diagnostic test results.

The log file is displayed below the run status. You can scroll through this list to determine what steps failed during the Db2 Shift execution.

A successful run will also display the log file.

Figure 5 – Log file for the Shift run.

Database Analysis

Shifting a database from an instance to a pod requires the Db2 Shift program to check that the source database meets certain criteria, including:

  • Source database must be at Db2 version 10.5, 11.1, or 11.5
  • Database must have been created with automatic storage
  • External procedures must reside in the standard Db2 library; any that do not will need to be manually recreated and cataloged
  • LOGARCHMETH1/2 must use disk, which is the only target supported in Db2U
  • Database encryption keys will be moved to the new location, but if the target already has encrypted databases then you need to manually migrate the encryption key to the target location

This checking is done when the Db2 Shift program begins execution. Even though the database may meet these requirements, the source database environment may have certain settings which need to be present at the target location.

By default, all database settings are moved to the target location. However, none of the instance settings are moved during the shift step unless you explicitly name them.

When the analyze function is selected, the Db2 Shift program gathers information from the source and target databases and presents a report containing the settings that are different between the environments.

Figure 6 – Comparison of settings, with differences in red.

This example shows many of the errors that can be reported by the analysis step. The items in red above will stop a shift from occurring, while those in yellow are features which might cause an issue when the database is started in the target location.
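
Conceptually, the analysis report is a comparison of two sets of configuration values. The following Python sketch shows that idea in its simplest form. It is not how Db2 Shift is implemented: `diff_settings` is a hypothetical helper, the sample values are made up, and the sketch does not reproduce the red and yellow severity levels described above.

```python
def diff_settings(source: dict, target: dict) -> dict:
    """Map each parameter whose value differs to a (source, target) pair.

    A parameter present on only one side is reported with None for the
    side where it is missing.
    """
    differences = {}
    for name in sorted(set(source) | set(target)):
        source_value = source.get(name)
        target_value = target.get(name)
        if source_value != target_value:
            differences[name] = (source_value, target_value)
    return differences

# Made-up sample values, for illustration only.
source_cfg = {"LOGARCHMETH1": "DISK:/db2/archive", "DFT_TABLE_ORG": "ROW", "LOGFILSIZ": "4096"}
target_cfg = {"LOGARCHMETH1": "OFF", "DFT_TABLE_ORG": "ROW"}

for name, (src, tgt) in diff_settings(source_cfg, target_cfg).items():
    print(f"{name}: source={src} target={tgt}")
```

Settings that match on both sides (here `DFT_TABLE_ORG`) are left out, so the output contains only the values a DBA would need to review.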

Details of a setting are available by pressing ^F while the cursor is on the line of the configuration parameter, as shown in Figure 7.

Figure 7 – Showing a setting detail text explanation.

Some fields will have additional help available through a web link.

Figure 8 – Web link for help shows in blue.

If you have access to a mouse, you’ll be able to click on the link in the field help display and have a web page display with more details on the parameter.

Performance

The Db2 Shift utility uses multiple threads to parallelize the movement of the data to the target location. Following are some general observations from early testing.

The number of threads used by the Db2 Shift utility can be set by the user. A higher number of threads is useful when the Db2 database has multiple paths defined for a storage group. If the database was created with a single storage path, the threads compete for I/O on the same device.

Increasing the number of threads will not necessarily result in more throughput. Testing has shown that four threads is a good starting point for parallelism, with small incremental benefits as you increase the number. You need to balance the CPU usage by the shift utility and the impact on workloads running on the server.

The following test system was used to measure the throughput of the Db2 Shift utility.

Test Scenario

The Db2 Shift command was run from a standard Amazon Elastic Compute Cloud (Amazon EC2) instance running Db2 on Linux and shifting the database to an EKS cluster with Db2U installed.

Figure 9 – Scenario showing move from instance to container.

The database size was approximately 50 GB, with more than 260 million rows of transactional data. The elapsed times of the shift are plotted against the number of threads used for the copying process.

Figure 10 – Threads versus Shift execution time.

The best elapsed time was 237 seconds (131 seconds copy time only). Increasing the number of threads did not make a difference to the total elapsed time.

Examining the CPU usage, the average utilization of the threads was almost identical between four and eight threads.

Figure 11 – CPU usage with four versus eight threads.

Overlaying the four-thread CPU utilization and network throughput demonstrates that the CPUs are busy transmitting as much data onto the network as possible.

Figure 12 – Four-thread CPU utilization overlaid with network throughput.

The results showed that the maximum throughput was capped by the network limit of 5 Gb/s. The measured network throughput was:

  • 1 thread: 1240 Mb/s
  • 4 threads: 4890 Mb/s

A CPU thread has a limit on how much data it can push onto the network. By running a test on a single thread, you can determine how many cores you can effectively use during a shift run.

Dividing the network capacity by the single core performance will determine the optimal number of threads to use.

Example:

  • Throughput of one thread is approximately 1.2 Gb/s
  • Network limit is 5 Gb/s
  • 5/1.2 is approximately 4 threads
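
The thread estimate above is a single division, sketched here in Python. The function name `estimate_threads` is a hypothetical helper for this example, not a Db2 Shift option, and rounding to the nearest whole thread is an assumption of the sketch.

```python
def estimate_threads(network_limit_gbps: float, single_thread_gbps: float) -> int:
    """Estimate how many copy threads are needed to saturate the network.

    Both arguments are in gigabits per second. The result is rounded to the
    nearest whole number and is never less than one.
    """
    if network_limit_gbps <= 0 or single_thread_gbps <= 0:
        raise ValueError("throughput values must be positive")
    return max(1, round(network_limit_gbps / single_thread_gbps))

# Values from the example: 5 Gb/s network, about 1.2 Gb/s per thread.
print(estimate_threads(5.0, 1.2))  # 4
```

Treat the result as a starting point for testing rather than a fixed setting, since storage layout and other workloads on the server also affect throughput.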

This result can also be used to determine your total copy time:

  • Database size/(Network limit/8) = Elapsed time
  • 50 GB/(5 Gb/s ÷ 8) = 50 GB/(0.625 GB/s) = 80 s under ideal conditions
  • At the measured 4.89 Gb/s: 50 GB/(4.89 Gb/s ÷ 8) = 50 GB/(0.611 GB/s) ≈ 82 s ideal (131 s observed copy time)
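
The copy-time formula can be written as a small Python function. This is a sketch under stated assumptions: `ideal_copy_seconds` is a hypothetical helper, sizes are in decimal gigabytes, and the network rate is in gigabits per second, so dividing by 8 converts bits to bytes.

```python
def ideal_copy_seconds(database_gb: float, network_gbps: float) -> float:
    """Ideal time to copy a database when the network is the only limit.

    database_gb is the size in gigabytes; network_gbps is the usable
    network rate in gigabits per second.
    """
    if database_gb < 0 or network_gbps <= 0:
        raise ValueError("size must be non-negative and rate must be positive")
    gigabytes_per_second = network_gbps / 8  # 8 bits per byte
    return database_gb / gigabytes_per_second

# 50 GB database over a 5 Gb/s network.
print(ideal_copy_seconds(50, 5))  # 80.0
```

The result is a lower bound for the copy step only. It does not include the other steps of a shift that contribute to the total elapsed time.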

The result was that after 237 seconds, a 50 GB database was moved from a traditional Db2 instance into a Db2U pod running on Amazon EKS (Kubernetes).

The amount of effort required to use the database after moving to the new location on EKS was zero. That’s why using Db2 Shift will make it easier for you to modernize your existing Db2 databases onto containerized environments like OpenShift, Kubernetes, and Cloud Pak for Data.

Summary

The Db2 Shift utility provides the ability to quickly and easily shift your Db2 Linux database into a containerized environment with minimal effort. The elapsed time to move a database depends on the network bandwidth available, but early tests suggest a transfer rate of approximately 1.37 TB per hour on a shared 5 Gb/s network.
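
The hourly rate quoted above appears to follow from the test figures reported earlier, roughly 50 GB copied in 131 seconds. The Python sketch below reproduces that arithmetic. It assumes decimal units (1 TB = 1000 GB) and that the rate is based on the copy time rather than the total elapsed time; `terabytes_per_hour` is a hypothetical helper name.

```python
def terabytes_per_hour(copied_gb: float, copy_seconds: float) -> float:
    """Convert an observed copy (gigabytes in seconds) to terabytes per hour."""
    if copy_seconds <= 0:
        raise ValueError("copy time must be positive")
    gigabytes_per_hour = copied_gb / copy_seconds * 3600
    return gigabytes_per_hour / 1000  # decimal units: 1 TB = 1000 GB

# Test scenario figures: about 50 GB copied in 131 seconds.
print(round(terabytes_per_hour(50, 131), 2))  # 1.37
```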


IBM – AWS Partner Spotlight

IBM Software and Technology is an AWS Competency Partner and leading global provider of enterprise technology and services.