AWS Big Data Blog
Stream mainframe data to AWS in near real time with Precisely and Amazon MSK
This is a guest post by Supreet Padhi, Technology Architect, and Manasa Ramesh, Technology Architect at Precisely in partnership with AWS.
Enterprises rely on mainframes to run mission-critical applications and store essential data, enabling real-time operations that help achieve business objectives. These organizations face a common challenge: how to unlock the value of their mainframe data in today’s cloud-first world while maintaining system stability and data quality. Modernizing these systems is critical for competitiveness and innovation.
The digital transformation imperative has made mainframe data integration with cloud services a strategic priority for enterprises worldwide. Organizations that can seamlessly bridge their mainframe environments with modern cloud platforms gain significant competitive advantages through improved agility, reduced operational costs, and enhanced analytics capabilities. However, implementing such integrations presents unique technical challenges that require specialized solutions. These include converting EBCDIC data to ASCII and handling data types unique to the mainframe, such as binary and COMP fields. Data stored in Virtual Storage Access Method (VSAM) files can also be complex because of the common practice of storing multiple record types in a single file. To address these challenges, Precisely—a global leader in data integrity, serving over 12,000 customers—has partnered with Amazon Web Services (AWS) to enable real-time synchronization between mainframe systems and Amazon Relational Database Service (Amazon RDS). For more on this collaboration, check out our previous blog post: Unlock Mainframe Data with Precisely Connect and Amazon Aurora.
In this post, we introduce an alternative architecture to synchronize mainframe data to the cloud using Amazon Managed Streaming for Apache Kafka (Amazon MSK) for greater flexibility and scalability. This event-driven approach provides additional possibilities for mainframe data integration and modernization strategies.
A key enhancement in this solution is the use of the AWS Mainframe Modernization – Data Replication for IBM z/OS Amazon Machine Image (AMI) available in AWS Marketplace, which simplifies deployment and reduces implementation time.
Real-time processing and event-driven architecture benefits
Real-time processing makes data actionable within seconds rather than waiting for batch processing cycles. For example, financial institutions such as Global Payments have leveraged this solution to modernize mission-critical banking operations, including payments processing. By migrating these operations to the AWS Cloud, they enhanced the user experience and improved scalability and maintainability while enabling advanced fraud detection, all without impacting the performance of existing mainframe systems. Change data capture (CDC) enables this by identifying database changes and delivering them in real time to cloud environments.
CDC offers two key advantages for mainframe modernization:
- Incremental data movement – Eliminates disruptive bulk extracts by streaming only changed data to cloud targets, minimizing system impact and ensuring data currency
- Real-time synchronization – Keeps cloud applications in sync with mainframe systems, enabling immediate insights and responsive operations
Solution overview
In this post, we provide a detailed implementation guide for streaming mainframe data changes from DB2z through the AWS Mainframe Modernization – Data Replication for IBM z/OS AMI to Amazon MSK, and then applying those changes to Amazon RDS for PostgreSQL using MSK Connect with the Confluent JDBC Sink Connector.
By introducing Amazon MSK into the architecture and streamlining deployment through the AWS Marketplace AMI, we create new possibilities for data distribution, transformation, and consumption that expand upon our previously demonstrated direct replication approach. This streaming-based architecture offers several additional benefits:
- Simplified deployment – Accelerate implementation using the preconfigured AWS Marketplace AMI
- Decoupled systems – Separate the concern of data extraction from data consumption, allowing both sides to scale independently
- Multi-consumer support – Enable multiple downstream applications and services to consume the same data stream according to their own requirements
- Extensibility – Create a foundation that can be extended to support additional mainframe data sources such as IMS and VSAM, as well as additional AWS targets using MSK Connect sink connectors
The following diagram illustrates the solution architecture.
- Capture/Publisher – Connect CDC Capture/Publisher captures Db2 changes from Db2 logs using IFI 306 Read and communicates captured data changes to a target engine through TCP/IP.
- Controller Daemon – The Controller Daemon authenticates all connection requests, managing secure communication between the source and target environments.
- Apply Engine – The Apply Engine is a multifaceted and multifunctional component in the target environment. It receives the changes from the Publisher agent and applies the changed data to the target Amazon MSK topic.
- Connect CDC Single Message Transform (SMT) – Performs all necessary data filtering, transformation, and augmentation required by the sink connector.
- JDBC Sink Connector – As data arrives, an instance of the JDBC Sink Connector running on MSK Connect writes the data to target tables in Amazon RDS.
This architecture provides a clean separation between the data capture process and the data consumption process, allowing each to scale independently. The use of MSK as an intermediary enables multiple systems to consume the same data stream, opening possibilities for complex event processing, real-time analytics, and integration with other AWS services.
Prerequisites
To complete the solution, you need the following prerequisites:
- Install AWS Mainframe Modernization – Data Replication for IBM z/OS
- Have access to Db2z on the mainframe from AWS using your approved connectivity between AWS and your mainframe
Solution walkthrough
The following code content shouldn’t be deployed to production environments without additional security testing.
Configure the AWS Mainframe Modernization Data Replication with Precisely AMI on Amazon EC2
Follow the steps defined at Precisely AWS Mainframe Modernization Data Replication. Upon the initial launch of the AMI, use the following command to connect to the Amazon Elastic Compute Cloud (Amazon EC2) instance:
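For example, the connection can look like the following; the key pair name and user name are hypothetical (the user depends on the AMI's base operating system), so substitute your own values.

```bash
# Hypothetical key pair, user, and host; replace with your own values
ssh -i ~/keys/mainframe-demo.pem ec2-user@<ec2-instance-public-dns>
```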
Configure the serverless cluster
To create an Amazon Aurora PostgreSQL-Compatible Edition Serverless v2 cluster, complete the following steps:
- Create a DB cluster by using the following AWS Command Line Interface (AWS CLI) command. Replace the placeholder strings with values that correspond to your cluster’s subnet and subnet group IDs.
- Verify the status of the cluster by using the following command:
- Add a writer DB instance to the Aurora cluster:
- Verify the status of the writer instance:
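The following is a combined sketch of these four steps. The cluster and instance identifiers, master user handling, and scaling range are assumptions to adapt to your environment.

```bash
# 1. Create the Aurora PostgreSQL Serverless v2 cluster (hypothetical identifiers and scaling range)
aws rds create-db-cluster \
  --db-cluster-identifier aurora-mainframe-cdc \
  --engine aurora-postgresql \
  --serverless-v2-scaling-configuration MinCapacity=0.5,MaxCapacity=8 \
  --master-username postgres \
  --manage-master-user-password \
  --db-subnet-group-name <your-db-subnet-group> \
  --vpc-security-group-ids <your-security-group-id>

# 2. Verify the cluster status (wait until it reports "available")
aws rds describe-db-clusters \
  --db-cluster-identifier aurora-mainframe-cdc \
  --query 'DBClusters[0].Status'

# 3. Add a Serverless v2 writer instance to the cluster
aws rds create-db-instance \
  --db-instance-identifier aurora-mainframe-cdc-writer \
  --db-cluster-identifier aurora-mainframe-cdc \
  --db-instance-class db.serverless \
  --engine aurora-postgresql

# 4. Verify the writer instance status
aws rds describe-db-instances \
  --db-instance-identifier aurora-mainframe-cdc-writer \
  --query 'DBInstances[0].DBInstanceStatus'
```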
Create a database in the PostgreSQL cluster
After your Aurora Serverless v2 cluster is running, you need to create a database for your replicated mainframe data. Follow these steps:
- Install the psql client:
- Retrieve the password from AWS Secrets Manager:
- Create a new database in PostgreSQL:
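A minimal sketch of these steps follows, assuming an Amazon Linux-based instance, a master password managed in Secrets Manager by the cluster, and a hypothetical database name.

```bash
# Install the PostgreSQL client (package name varies by OS release)
sudo yum install -y postgresql15

# Retrieve the Aurora master password from AWS Secrets Manager (hypothetical secret ID)
aws secretsmanager get-secret-value \
  --secret-id <your-aurora-master-user-secret-arn> \
  --query 'SecretString' \
  --output text

# Create a database for the replicated mainframe data (hypothetical name)
psql -h <aurora-cluster-writer-endpoint> -U postgres -c "CREATE DATABASE mainframe_cdc;"
```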
Configure the serverless MSK cluster
To create a serverless MSK cluster, complete the following steps:
- Copy the following JSON and paste it into a new file create-msk-serverless-cluster.json. Replace the placeholder strings with values that correspond to your cluster’s subnet and security group IDs.
- Invoke the following AWS CLI command in the folder where you saved the JSON file in the previous step:
- Verify cluster status by invoking the following AWS CLI command:
- Get the bootstrap broker address by invoking the following AWS CLI command:
- Define environment variables to store the bootstrap servers of the MSK cluster and to add the locally installed Kafka to the PATH environment variable:
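A combined sketch of these steps follows; the cluster name, subnet and security group IDs, and installation path are assumptions.

```bash
# create-msk-serverless-cluster.json (IAM authentication enabled; replace the placeholder IDs)
cat > create-msk-serverless-cluster.json <<'EOF'
{
  "VpcConfigs": [
    {
      "SubnetIds": ["subnet-xxxxxxxx", "subnet-yyyyyyyy", "subnet-zzzzzzzz"],
      "SecurityGroupIds": ["sg-xxxxxxxx"]
    }
  ],
  "ClientAuthentication": { "Sasl": { "Iam": { "Enabled": true } } }
}
EOF

# Create the serverless cluster
aws kafka create-cluster-v2 \
  --cluster-name msk-mainframe-cdc \
  --serverless file://create-msk-serverless-cluster.json

# Verify the cluster status (wait for "ACTIVE")
aws kafka list-clusters-v2 \
  --query 'ClusterInfoList[?ClusterName==`msk-mainframe-cdc`].State'

# Get the IAM bootstrap broker address
aws kafka get-bootstrap-brokers --cluster-arn <your-cluster-arn>

# Store the bootstrap servers and add the local Kafka installation to PATH
# (Kafka is downloaded into this folder in the next section)
export BOOTSTRAP_SERVERS=<bootstrap-broker-string-from-previous-command>
export PATH=$PATH:$HOME/kafka/bin
```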
Create a topic on the MSK cluster
To create a Kafka topic, you need to install the Kafka CLI first. Follow these steps:
- Download the binary distribution of Apache Kafka and extract the archive in folder kafka:
- To use IAM to authenticate with the MSK cluster, download the Amazon MSK Library for IAM and copy it to the local Kafka library directory as shown in the following code. For complete instructions, refer to Configure clients for IAM access control.
- In the directory, create a file to configure a Kafka client to use IAM authentication for the Kafka console producer and consumers:
- Create the Kafka topic, which you defined in the connector config:
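The following sketch shows one way to do this. The Kafka and aws-msk-iam-auth versions, download URLs, and topic name are assumptions to adjust to your setup and to the topic you define in the connector configuration.

```bash
# Download and extract Apache Kafka into a folder named kafka (hypothetical version)
mkdir -p ~/kafka
curl -sL https://archive.apache.org/dist/kafka/3.6.1/kafka_2.13-3.6.1.tgz \
  | tar -xz --strip-components=1 -C ~/kafka

# Download the Amazon MSK Library for IAM into Kafka's libs directory (hypothetical version)
curl -sL -o ~/kafka/libs/aws-msk-iam-auth-2.2.0-all.jar \
  https://github.com/aws/aws-msk-iam-auth/releases/download/v2.2.0/aws-msk-iam-auth-2.2.0-all.jar

# Client properties for IAM authentication
cat > ~/kafka/config/client.properties <<'EOF'
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
EOF

# Create the topic referenced by the connector configuration (hypothetical topic name)
~/kafka/bin/kafka-topics.sh --create \
  --bootstrap-server "$BOOTSTRAP_SERVERS" \
  --command-config ~/kafka/config/client.properties \
  --topic DEPT --partitions 3
```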
Configure the MSK Connect plugin
Next, create a custom plugin using the archive available in the AMI at /opt/precisely/di/packages/sqdata-msk_connect_1.0.1.zip, which contains the following:
- JDBC Sink Connector from Confluent
- MSK Config provider
- AWS Mainframe Modernization – Data Replication for IBM z/OS Custom SMT
Follow these steps:
- Invoke the following to upload the .zip file to an S3 bucket to which you have access:
- Copy the following JSON and paste it into a new file create-custom-plugin.json. Replace the placeholder strings with values that correspond to your bucket.
- Invoke the following AWS CLI command in the folder where you saved the JSON file in the previous step:
- Verify plugin status by invoking the following AWS CLI command:
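These steps can be sketched as follows; the bucket name, object key, and plugin name are assumptions.

```bash
# Upload the plugin archive from the AMI to an S3 bucket you own (hypothetical bucket)
aws s3 cp /opt/precisely/di/packages/sqdata-msk_connect_1.0.1.zip \
  s3://<your-plugin-bucket>/sqdata-msk_connect_1.0.1.zip

# create-custom-plugin.json (replace the bucket ARN with your own)
cat > create-custom-plugin.json <<'EOF'
{
  "name": "sqdata-jdbc-sink-plugin",
  "contentType": "ZIP",
  "location": {
    "s3Location": {
      "bucketArn": "arn:aws:s3:::<your-plugin-bucket>",
      "fileKey": "sqdata-msk_connect_1.0.1.zip"
    }
  }
}
EOF

# Create the MSK Connect custom plugin
aws kafkaconnect create-custom-plugin --cli-input-json file://create-custom-plugin.json

# Verify the plugin status (wait for "ACTIVE")
aws kafkaconnect list-custom-plugins \
  --query 'customPlugins[?name==`sqdata-jdbc-sink-plugin`].customPluginState'
```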
Configure the JDBC Sink Connector
To configure the JDBC Sink Connector, follow these steps:
- Copy the following JSON and paste it into a new file create-connector.json. Replace the placeholder strings with appropriate values:
- Invoke the following AWS CLI command in the folder where you saved the JSON file in the previous step:
- Verify connector status by invoking the following AWS CLI command:
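The following is a sketch of the connector definition. Every name, ARN, subnet, JDBC URL, and credential in it is a placeholder to replace with your own values, and the transforms entry is a stand-in for the Precisely Connect CDC SMT settings documented with the AMI.

```bash
# create-connector.json (hypothetical values throughout)
cat > create-connector.json <<'EOF'
{
  "connectorName": "db2z-dept-jdbc-sink",
  "kafkaConnectVersion": "2.7.1",
  "capacity": { "provisionedCapacity": { "mcuCount": 1, "workerCount": 1 } },
  "plugins": [
    { "customPlugin": { "customPluginArn": "<your-custom-plugin-arn>", "revision": 1 } }
  ],
  "serviceExecutionRoleArn": "<your-msk-connect-role-arn>",
  "kafkaCluster": {
    "apacheKafkaCluster": {
      "bootstrapServers": "<your-bootstrap-servers>",
      "vpc": { "subnets": ["subnet-xxxxxxxx", "subnet-yyyyyyyy"], "securityGroups": ["sg-xxxxxxxx"] }
    }
  },
  "kafkaClusterClientAuthentication": { "authenticationType": "IAM" },
  "kafkaClusterEncryptionInTransit": { "encryptionType": "TLS" },
  "connectorConfiguration": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "DEPT",
    "connection.url": "jdbc:postgresql://<aurora-writer-endpoint>:5432/mainframe_cdc",
    "connection.user": "postgres",
    "connection.password": "<retrieved-from-secrets-manager>",
    "insert.mode": "upsert",
    "pk.mode": "record_value",
    "pk.fields": "<key-columns>",
    "auto.create": "true",
    "transforms": "<precisely-smt-settings-per-ami-documentation>"
  }
}
EOF

# Create the connector
aws kafkaconnect create-connector --cli-input-json file://create-connector.json

# Verify the connector status (wait for "RUNNING")
aws kafkaconnect list-connectors \
  --query 'connectors[?connectorName==`db2z-dept-jdbc-sink`].connectorState'
```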
Set up Db2 Capture/Publisher on Mainframe
To establish the Db2 Capture/Publisher on the mainframe for capturing changes to the DEPT table, follow these structured steps that build upon our previous blog post, Unlock Mainframe Data with Precisely Connect and Amazon Aurora:
- Prepare the source table. Before configuring the Capture/Publisher, ensure the DEPT source table exists on your mainframe Db2 system. The table definition should match the structure defined at $SQDATA_VAR_DIR/templates/dept.ddl. If you need to create this table on your mainframe, use the DDL from this file as a reference to ensure compatibility with the replication process.
- Access the Interactive System Productivity Facility (ISPF) interface. Sign in to your mainframe system and access the AWS Mainframe Modernization – Data Replication for IBM z/OS ISPF panels through the supplied ISPF application menu. Select option 3 (CDC) to access the CDC configuration panels, as demonstrated in our previous blog post.
- Add source tables for capture:
- From the CDC Primary Option Menu, choose option 2 (Define Subscriptions).
- Choose option 1 (Define Db2 Tables) to add source tables.
- On the Add DB2 Source Table to CAB File panel, enter a wildcard value (%) or the specific table name DEPT in the Table Name field.
- Press Enter to display the list of available tables.
- Type S next to the DEPT table to select it for replication, then press Enter to confirm.
This process is similar to the table selection shown in figures 3 and 4 of our previous post, but now focuses specifically on the DEPT table structure.
With the completion of both the Db2 Capture/Publisher setup on the mainframe and the AWS environment configuration (Amazon MSK, Apply Engine, and MSK Connect JDBC Sink Connector), you now have a fully functional pipeline ready to capture data changes from the mainframe and stream them to the MSK topic. Inserts, updates, or deletions to the DEPT table on the mainframe will be automatically captured and pushed to the MSK topic in near real time. From there, the MSK Connect JDBC Sink Connector and the custom SMT will process these messages and apply the changes to the PostgreSQL database on Amazon RDS, completing the end-to-end replication flow.
Configure Apply Engine for Amazon MSK integration
Configure the AWS-side components to receive data from the mainframe and forward it to Amazon MSK. Follow these steps to define and manage a new CDC pipeline from Db2 z/OS to Amazon MSK:
- Use the following command to switch to the connect user:
- Create the apply engine directories:
- Copy the sample script from dept.ddl:
- Copy the following content and paste it in a new file $SQDATA_VAR_DIR/apply/DB2ZTOMSK/scripts/DB2ZTOMSK.sqd. Replace the placeholder strings with values that correspond to the DB2z endpoint:
- Create the working directory:
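The shell portions of the preceding steps can be sketched as follows. The connect user, the DB2ZTOMSK engine name, and the directory paths are taken from this section, but verify them against your AMI installation.

```bash
# Switch to the connect user provided by the AMI
sudo su - connect

# Create the apply engine script directory for the DB2ZTOMSK engine
mkdir -p $SQDATA_VAR_DIR/apply/DB2ZTOMSK/scripts

# Copy the sample DDL shipped with the AMI into the engine's scripts directory
cp $SQDATA_VAR_DIR/templates/dept.ddl $SQDATA_VAR_DIR/apply/DB2ZTOMSK/scripts/

# Create the working directory (assumed here to be the sqdata_logs path referenced later in this section)
mkdir -p /var/precisely/di/sqdata_logs/apply/DB2ZTOMSK
```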
- Add the following to $SQDATA_DAEMON_DIR/cfg/sqdagents.cfg:
- After the preceding code is added to the sqdagents.cfg section, reload for the changes to take effect:
- Validate the apply engine job script by using the SQData parse command to create the compiled file expected by the SQData engine:
The following is an example of the output that you get when you invoke the command successfully:
- Copy the following content and paste it in a new file /var/precisely/di/sqdata_logs/apply/DB2ZTOMSK/sqdata_kafka_producer.conf. Replace the placeholder strings with values that correspond to your bootstrap server and AWS Region.
- Start the apply engine using the controller daemon by using the following command:
- Monitor the apply engine through the controller daemon by using the following command:
The following is an example of the output that you get when you invoke the command successfully:
Logs can also be found at /var/precisely/di/sqdata_logs/apply/DB2ZTOMSK.
Verify data in the MSK topic
Invoke the Kafka CLI command to verify the JSON data in the MSK topic:
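For example, using the client properties file created earlier and the hypothetical DEPT topic name:

```bash
# Consume the replicated change records from the beginning of the topic
~/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server "$BOOTSTRAP_SERVERS" \
  --consumer.config ~/kafka/config/client.properties \
  --topic DEPT --from-beginning
```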
Verify data in the PostgreSQL database
Invoke the following command to verify the data in the PostgreSQL database:
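For example, assuming the hypothetical database and table names used earlier:

```bash
# Query the replicated table in Aurora PostgreSQL
psql -h <aurora-cluster-writer-endpoint> -U postgres -d mainframe_cdc -c "SELECT * FROM dept;"
```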
With these steps completed, you’ve successfully set up end-to-end data replication from DB2z to RDS for PostgreSQL, using AWS Mainframe Modernization – Data Replication for IBM z/OS AMI, Amazon MSK, MSK Connect, and the Confluent JDBC Sink Connector.
Cleanup
When you’re finished testing this solution, you can clean up the resources to avoid incurring additional charges. Follow these steps in sequence to ensure proper cleanup.
Step 1: Delete the MSK Connect components
Follow these steps:
- List existing connectors:
- Delete the sink connector:
- List custom plugins:
- Delete the custom plugin:
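A sketch of these cleanup commands follows; the ARNs are placeholders from the earlier steps.

```bash
# List existing connectors and note the ARN of the sink connector
aws kafkaconnect list-connectors

# Delete the sink connector
aws kafkaconnect delete-connector --connector-arn <your-connector-arn>

# List custom plugins and note the ARN of the plugin created earlier
aws kafkaconnect list-custom-plugins

# Delete the custom plugin
aws kafkaconnect delete-custom-plugin --custom-plugin-arn <your-custom-plugin-arn>
```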
Step 2: Delete the MSK cluster
Follow these steps:
- List MSK clusters:
- Delete the MSK serverless cluster:
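For example, with the cluster ARN noted from the list command:

```bash
# List MSK clusters (serverless and provisioned) and note the cluster ARN
aws kafka list-clusters-v2

# Delete the serverless cluster
aws kafka delete-cluster --cluster-arn <your-cluster-arn>
```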
Step 3: Delete the Aurora resources
Follow these steps:
- Delete the Aurora DB instance:
- Delete the Aurora DB cluster:
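For example, using the hypothetical identifiers from the cluster creation sketch earlier:

```bash
# Delete the writer instance first
aws rds delete-db-instance \
  --db-instance-identifier aurora-mainframe-cdc-writer \
  --skip-final-snapshot

# Then delete the cluster
aws rds delete-db-cluster \
  --db-cluster-identifier aurora-mainframe-cdc \
  --skip-final-snapshot
```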
Conclusion
By capturing changed data from DB2z and streaming it to AWS targets, organizations can modernize their legacy mainframe data stores, enabling operational insights and AI initiatives. Businesses can use this solution to take advantage of cloud-based applications with mainframe data to provide scalability, cost-efficiency, and enhanced performance.
The integration of AWS Mainframe Modernization – Data Replication for IBM z/OS AMI with Amazon MSK and RDS for PostgreSQL provides an enhanced framework for real-time data synchronization that maintains data integrity. This architecture can be extended to support additional mainframe data sources such as VSAM and IMS, as well as other AWS targets. Organizations can then tailor their data integration strategy to specific business needs. Data consistency and latency challenges can be effectively managed through AWS and Precisely’s monitoring capabilities. By adopting this architecture, organizations keep their mainframe data continually available for analytics, machine learning (ML), and other advanced applications. Streaming mainframe data to AWS in near real time, with subsecond data transfers, represents a strategic step toward modernizing legacy systems while unlocking new opportunities for innovation. With Precisely and AWS, organizations can effectively navigate their modernization journey and maintain their competitive advantage.
Learn more about AWS Mainframe Modernization – Data Replication for IBM z/OS AMI in the Precisely documentation. AWS Mainframe Modernization Data Replication is available for purchase in AWS Marketplace. For more information about the solution or to see a demonstration, contact Precisely.