Category: DMS


Categorizing and Prioritizing a Large-Scale Move to an Open Source Database

Wendy Neu is a big data architect at Amazon Web Services.

The AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) help you migrate your commercial database to a variety of engines on Amazon RDS. AWS SCT specifically simplifies proprietary database migrations to open source. It does this by automatically converting your source database schema and a majority of the custom code to a format that is compatible with your new target database. The custom code that the tool converts includes views, stored procedures, and functions. Any code that the tool cannot convert automatically is clearly marked so that you can convert it yourself. AWS DMS helps you migrate your data easily and securely with minimal downtime.

Changing database engines can be daunting. However, the value proposition of a highly scalable, cost-effective, fully managed service such as Amazon Aurora can make the challenge worth it. Assessing and planning a single database migration to open source is straightforward with AWS SCT and AWS DMS. You can generate an AWS SCT assessment report to see how much of the database converts automatically and how much manual effort remains.

Many articles and videos are available about making the transition to open source for your databases.

What if you have hundreds or even thousands of databases?
Building an assessment report and planning a move for one, two, or even ten databases to open source is a straightforward process. You probably have enough familiarity with applications that attach to those databases that you can identify which ones to move first. What do you do if you have hundreds or thousands of database instances deployed in and around your organization that you don’t have intimate application knowledge about?

If you’re in a centralized IT department, you might know how many databases you manage, have backups scheduled for them, and be tracking them in your inventory. But you might not have enough detail to plan a large-scale move or prioritize them for analysis. This post talks about how to assess your database portfolio and break it down into a manageable pipeline of migrations to make the process go smoothly.

A collaboration of smaller projects
A detailed project plan for a single heterogeneous database migration consists of 12 steps, as detailed in the following table.

Phase  Description
1      Assessment
2      Database schema conversion
3      Application conversion/remediation
4      Scripts conversion
5      Integration with third-party applications
6      Data migration
7      Functional testing of the entire system
8      Performance tuning
9      Integration and deployment
10     Training and knowledge transfer
11     Documentation and version control
12     Post-production support

At a high level, and for the purposes of this blog, the process for planning a migration project can be grouped into five areas of work: Analysis and Planning, Schema Migration, Data Migration, Application Migration, and Cutover. How long your migration takes generally depends on how long you spend iterating over the migration and testing phases. If you are planning several migrations in tandem, you should know at the outset which ones are likely to take the most time and which ones can be tackled first or in rapid succession.

SCT Workflow

Prioritizing which databases to analyze first when moving to open source often hinges on how much proprietary database code each one uses. The volume of proprietary code affects how long it takes to translate and test during the migration phases of your project. After you establish the categories and criteria, you can categorize your databases by workload using dictionary queries, as in the sketch that follows.
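
For example, the following Oracle dictionary query is a minimal sketch of that approach: it counts the lines of stored PL/SQL per schema and sorts each schema into a workload bucket. The thresholds and bucket names are illustrative assumptions to replace with your own criteria.

-- Lines of stored PL/SQL per schema; more proprietary code generally means
-- more conversion and testing effort during the migration phases.
SELECT owner,
       COUNT(*) AS plsql_lines,
       CASE
         WHEN COUNT(*) < 1000  THEN 'light'     -- illustrative thresholds
         WHEN COUNT(*) < 20000 THEN 'medium'
         ELSE 'heavy'
       END AS workload_category
FROM dba_source
WHERE owner NOT IN ('SYS','SYSTEM')
  AND type IN ('PROCEDURE','FUNCTION','PACKAGE','PACKAGE BODY','TRIGGER')
GROUP BY owner
ORDER BY plsql_lines DESC;

Schemas at the light end of the scale are good candidates for the first migration waves; heavy ones warrant a deeper AWS SCT assessment before you schedule them.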

(more…)

Integrating Teradata with Amazon Redshift Using the AWS Schema Conversion Tool

David Gardner is a solutions architect and Pratim Das is a specialist solutions architect for Analytics at Amazon Web Services.

Teradata provides long-standing data warehouse solutions, with many customers and applications running on its platforms. As companies migrate to the cloud, they are using Amazon Redshift as part of their cloud adoption. Recently AWS announced support for Teradata as a source for the AWS Schema Conversion Tool (AWS SCT). With this capability, you can easily integrate Teradata schemas into an Amazon Redshift model. Once both source and target schemas are in place, you can use AWS SCT to set up agents that collect and migrate data to the target Amazon Redshift schema. In this post, we provide an example of how to integrate these two technologies easily and securely.

Introduction to the use case
At a high level, the architecture looks like the following diagram:

Figure 1: Teradata to Amazon Redshift migration using AWS SCT agents

(more…)

Database Migration Gaining Momentum

Ilia Gilderman is a senior software development manager at Amazon Web Services.

Only five months have passed since our most recent post, in March 2017, about the momentum of AWS Database Migration Service (AWS DMS). So far, we’ve watched our customers migrate more than 35,000 unique databases to AWS using AWS DMS.

Here is a quick recap for anyone who is reading about AWS DMS for the first time.

AWS Database Migration Service helps anyone migrate databases to AWS easily and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. AWS DMS can migrate your data to and from most widely used commercial and open-source databases and data warehouses.

In addition to AWS DMS, we developed the AWS Schema Conversion Tool (AWS SCT). AWS SCT makes heterogeneous database migrations easy by automatically converting the source database schema and a majority of the custom code—including views, stored procedures, and functions—to a format that is compatible with the target database. Any code that can’t be automatically converted is clearly marked so that it can be manually converted.

AWS SCT can also scan your application source code for embedded SQL statements and convert them as part of a database schema conversion project. During this process, AWS SCT performs cloud-native code optimization by converting legacy Oracle and SQL Server functions to their equivalent AWS service, thus helping you modernize the applications during the migration. AWS SCT can also help convert schema for a range of data warehouses into Amazon Redshift. When schema conversion is complete, you can migrate the data to Amazon Redshift using built-in data migration agents.

Expanded migration capabilities
We also significantly expanded our data warehouse migration capabilities and now support schema conversion and data extraction for six leading commercial platforms: Teradata, IBM Netezza, Microsoft SQL Server, Oracle, Micro Focus Vertica, and Greenplum. You now can move workloads from these systems into Amazon Redshift using AWS SCT and AWS SCT migration agents. For more details, see Working with Data Warehouses.

Since the March post, we’ve expanded AWS DMS migration and replication capabilities significantly by adding NoSQL sources and targets, so you can migrate your databases to Amazon DynamoDB (among other options).

These new NoSQL connectors and migration agents provide a variety of interesting use cases that you can now implement. Most notably:

(more…)

Strategy for Migrating Partitioned Tables from Oracle to Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL Compatibility

Sandeep Kariro is a big data architect and Akm Raziul is a database architect at Amazon Web Services.

Enterprises have been using closed-source database systems for more than four decades. Managing data with these systems has been a critical part of running a successful business in every organization. You have to make a considerable investment up front to obtain the required hardware and enterprise licenses before you can even start building the system. With cloud computing gaining momentum in recent years and technology maturing in the open source system world, interest has been growing in moving critical traditional online transaction processing (OLTP) database workloads to open-source systems such as PostgreSQL, MySQL, and others.

Migrating one database system to another requires significant effort and planning. Even though the basics around the database system generally stay the same, implementation and operational activity on each of them can vary quite a bit.

The AWS Schema Conversion Tool (AWS SCT) helps migrate Oracle schemas and custom stored procedure code to a PostgreSQL-compatible target with minimal effort. AWS SCT generates a migration assessment report that clearly identifies objects and code that require manual conversion or rework. One critical need for high-volume OLTP applications is to have the data partitioned for performance reasons. During the initial assessment, AWS SCT highlights the Oracle partitions that require manual intervention. This is due to the way current versions of PostgreSQL (as of 9.6.2) handle table partitions. PostgreSQL manages partitioned tables differently from Oracle, SQL Server, or MySQL.

In this post, we discuss how to convert Oracle Database native range and list partitions to PostgreSQL native range and list partitions.

Range and list partition definitions in Oracle
Range: Each partition holds a range of values based on the partition key. For example, you can partition data based on a date field. Each partition can hold data for a date range such as a week, a month, or a quarter. The first partition of the table is always open-ended toward lower values. You can optionally make the last partition open-ended as well by defining its upper bound as MAXVALUE. A range partition key can include up to 16 columns.

The following is an example of a table created with a range partition on create date. The table is created with three partitions. The first partition holds all data created with a create date earlier than 31-DEC-2016. The second partition holds all data created with a create date between 01-JAN-2017 and 31-DEC-2017. The third partition holds all other data created with a create date after 31-DEC-2017.

CREATE TABLE person 
	(id NUMBER(10) NOT NULL
	,name VARCHAR2(50)
	,gender VARCHAR2(1)
	,create_dt DATE DEFAULT sysdate
	)
PARTITION BY RANGE(create_dt) (
  -- Explicit TO_DATE keeps the partition bounds independent of the session NLS_DATE_FORMAT.
  partition p1 values less than (TO_DATE('01-JAN-2017','DD-MON-YYYY')),
  partition p2 values less than (TO_DATE('01-JAN-2018','DD-MON-YYYY')),
  partition p3 values less than (MAXVALUE)
);
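
For comparison, here is a minimal sketch of the equivalent table on a PostgreSQL target, assuming a version that supports declarative partitioning (introduced in PostgreSQL 10; the MINVALUE/MAXVALUE bound syntax used below is available in PostgreSQL 11 and later). On 9.6 you would instead emulate partitions with inheritance, CHECK constraints, and triggers, which is why AWS SCT flags these tables for manual work. The table and partition names simply mirror the Oracle example.

CREATE TABLE person (
    id        numeric(10) NOT NULL,
    name      varchar(50),
    gender    varchar(1),
    create_dt date DEFAULT current_date
) PARTITION BY RANGE (create_dt);

-- Lower bounds are inclusive and upper bounds are exclusive, which matches
-- Oracle's VALUES LESS THAN semantics.
CREATE TABLE person_p1 PARTITION OF person
    FOR VALUES FROM (MINVALUE) TO ('2017-01-01');
CREATE TABLE person_p2 PARTITION OF person
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');
CREATE TABLE person_p3 PARTITION OF person
    FOR VALUES FROM ('2018-01-01') TO (MAXVALUE);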

(more…)

Oracle Migration Automated – How to Automate Generation of Selection and Transformation Rules for AWS Database Migration Service

Akm Raziul Islam is a database architect at Amazon Web Services.

You can use AWS Database Migration Service (AWS DMS) to copy data from multiple schemas in an Oracle source to a single Amazon RDS for Oracle target. You can also migrate data to different schemas on a target. But to do this, you need to use schema transformation rules on the AWS DMS table mappings.

If you are migrating multiple Oracle schemas, it can be challenging to manually create and track multiple selection and transformation rules. This post describes how you can use simple dynamic SQL in Oracle to generate custom JSON selection and transformation rules for all your user schemas.

Creating selection and transformation JSON rules
For this example, the Oracle target endpoint connects as the RDSMASTER user and the source endpoint connects as DMS_USER, so you must create a transformation rule like the following. The rule ensures that the data lands in the DMS_USER schema on the target and that no RDSMASTER schema is created.

{
  "rule-type": "selection",
  "rule-id": "1",
  "rule-name": "1",
  "object-locator": {
    "schema-name": "DMS_USER",
    "table-name": "%"
  },
  "rule-action": "include"
},
{
  "rule-type": "transformation",
  "rule-id": "2",
  "rule-name": "2",
  "rule-action": "rename",
  "rule-target": "schema",
  "object-locator": {
    "schema-name": "DMS_USER"
  },
  "value": "DMS_USER"
}

You can create the same rules in the AWS DMS console.

If you don’t create transformation rules for the schemas, then by default DMS_USER and all other schemas selected for migration are copied under the target endpoint user, RDSMASTER. In AWS DMS, you can create these selection and transformation rules using the console, the AWS CLI, or JSON. But creating and tracking many selection and transformation rules by hand becomes challenging when you migrate many Oracle schemas.

For example, if you have to migrate 300 or more Oracle schemas from each database, you would need roughly 600 rules (300 selection rules plus 300 transformation rules). Typing every schema into each JSON selection and transformation rule is tedious and error-prone. A better approach is to use simple dynamic SQL in Oracle to generate the JSON selection and transformation rules for all the user schemas.

Generating JSON rules for multiple schemas
Here is the dynamic SQL that you can run on the Oracle source to generate the JSON selection and transformation rules:

-- Sequences provide unique rule IDs: odd numbers for selection rules,
-- even numbers for transformation rules. The drop statements fail harmlessly
-- on the first run, when the sequences don't exist yet.
drop sequence my_number1;
drop sequence my_number2;
create sequence my_number1 start with 1 increment by 2;
create sequence my_number2 start with 2 increment by 2;

-- Emit one selection rule and one transformation rule per application schema.
select '
        {
            "rule-type": "selection",
            "rule-id": "'||my_number1.nextval||'",
            "rule-name": "'||my_number1.currval||'",
            "object-locator": {
                "schema-name": "'||username||'",
                "table-name": "%"
            },
            "rule-action": "include"
        },
        {
            "rule-type": "transformation",
            "rule-id": "'||my_number2.nextval||'",
            "rule-name": "'||my_number2.currval||'",
            "rule-action": "rename",
            "rule-target": "schema",
            "object-locator": {
                "schema-name": "'||username||'"
            },
            "value": "'||username||'"
        },' FROM DBA_USERS where default_tablespace = 'USERS' and ORACLE_MAINTAINED = 'N';

A few things to note about this code: I ran it using SQL Developer connected to an Oracle 12c instance. Oracle 12c comes with an ORACLE_MAINTAINED column in DBA_USERS that isn’t available in earlier versions. Filtering on this column ensures that you consider only non-SYSTEM schemas. If the source Oracle version is earlier than 12c, simply remove the following part of the code:

and ORACLE_MAINTAINED = 'N';

And replace the FROM clause with something like this:

FROM DBA_USERS where default_tablespace='USERS' and USERNAME NOT in ('DIP','AUDSYS','GSMUSER','XS$NULL','GSMCATUSER');

You can add more schemas to the USERNAME exclusion list if you don’t want them included in the generated JSON.

To generate the transformation rules:

  1. Run the code from SQL Developer (press F5 or choose Run Script).
  2. Scroll down to the last line of the generated code. It ends with a comma (,) after the closing curly brace (}). Delete the comma and leave the curly brace.
  3. Copy the rest of the JSON code (starting from the curly brace before the first “rule-type” through the last curly brace in the output window), and paste it into the following wrapper where indicated:
    {
        "rules": [
     
    ----- code goes here
    ]
    }
    
  4. Assemble the code in any text editor (for example, TextWrangler or Notepad++), and paste it into the AWS DMS task’s Table mappings section on the JSON tab (making sure that Enable JSON editing is selected).
    After you copy the JSON code, if you choose the Guided tab, you can see that all the schemas are auto-populated for both selection and transformation rules.

Summary
You can use simple dynamic SQL in Oracle to generate AWS DMS custom selection and transformation rules for filtering and table mappings on the target. For more information about AWS DMS selection and transformation table mapping, see the AWS DMS documentation.

How to Migrate Your Oracle Database to PostgreSQL

Knievel Co is a database engineer at Amazon Web Services.

This blog post provides an overview of how you can migrate your Oracle database to PostgreSQL. The two biggest parts of a database migration are the schema conversion and the data replication. We go over how to use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to tackle these two parts.

Before we cover AWS SCT and AWS DMS, you should take some preliminary steps that have proven helpful in every migration. One way to make the migration easier is to have what is usually called a modernization phase before the migration. This phase involves taking an inventory of objects in your Oracle database and then making a few decisions.

First, deprecate any objects that are no longer needed. Don’t waste time migrating objects that no one cares about. Also, purge any historical data that you no longer need. You don’t want to waste time replicating data you don’t need, for example, temporary tables and backup copies of tables from past maintenance. Second, move flat files and long strings stored in LOBs, CLOBs, LONGs, and so on into Amazon S3 or Amazon DynamoDB. This process requires client software changes, but it reduces the complexity and size of the database and makes the overall system more efficient. Last, try to move any PL/SQL packages and procedures, especially those containing business logic, back to the client software. These objects need to be manually changed anyway if AWS SCT cannot translate them.
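
As one illustrative starting point for that inventory, dictionary queries like the following sketch (the schema exclusions are assumptions to adjust for your environment) list object counts per schema and the volume of LOB data that might be better served from Amazon S3:

-- Object counts per schema and type: a quick inventory of what would have to move.
SELECT owner, object_type, COUNT(*) AS objects
FROM dba_objects
WHERE owner NOT IN ('SYS','SYSTEM')
GROUP BY owner, object_type
ORDER BY owner, objects DESC;

-- LOB segment sizes per schema: large CLOB/BLOB data is a candidate for offloading.
SELECT owner, ROUND(SUM(bytes)/1024/1024/1024, 2) AS lob_gb
FROM dba_segments
WHERE segment_type = 'LOBSEGMENT'
  AND owner NOT IN ('SYS','SYSTEM')
GROUP BY owner
ORDER BY lob_gb DESC;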

The following steps are suggested for migrating to a different database engine (Oracle to PostgreSQL in this case). If you aren’t moving to a different platform, you might find more appropriate native tools or other techniques to move your database.

  1. Create your schema in the target database.
  2. Drop foreign keys and secondary indexes on the target database, and disable triggers (a sketch follows this list).
  3. Set up a DMS task to replicate your data – full load and change data capture (CDC).
  4. Stop the task when the full load phase is complete, and recreate foreign keys and secondary indexes.
  5. Enable the DMS task.
  6. Migrate tools and software, and enable triggers.
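
For step 2 and its counterparts in steps 4 and 6, a minimal sketch on the PostgreSQL target might look like the following. The table, index, and constraint names are hypothetical stand-ins for whatever AWS SCT created in your converted schema.

-- Step 2: before the full load, drop foreign keys and secondary indexes and
-- disable user triggers so the bulk copy runs without interference.
ALTER TABLE orders DROP CONSTRAINT orders_customer_id_fkey;  -- hypothetical foreign key
DROP INDEX IF EXISTS idx_orders_created_at;                  -- hypothetical secondary index
ALTER TABLE orders DISABLE TRIGGER USER;

-- Step 4: after the full load completes, recreate the foreign keys and indexes.
CREATE INDEX idx_orders_created_at ON orders (created_at);
ALTER TABLE orders
  ADD CONSTRAINT orders_customer_id_fkey
  FOREIGN KEY (customer_id) REFERENCES customers (id);

-- Step 6: re-enable triggers at cutover, once ongoing replication has caught up.
ALTER TABLE orders ENABLE TRIGGER USER;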

Create your schema in the target database

Begin your migration by taking a look at the schema that you want to migrate. In our case, we use the AWS Schema Conversion Tool (AWS SCT) to perform the analysis. When you start the application, you need to create a new project, with the source being Oracle and the target being PostgreSQL. When you’re connected, select on the left side the name of the schema that you want to migrate. Right-click the schema name and choose Convert Schema. Then choose View / Assessment Report View.

The AWS SCT assessment report is a high-level overview of the effort required to convert your Oracle database to PostgreSQL. The following is an example of what the assessment report looks like:

(more…)

Using the AWS Database Migration Service, Amazon S3, and AWS Lambda for Database Analytics

Jeff Levine is a solutions architect for Amazon Web Services.

The AWS Database Migration Service (AWS DMS) supports Amazon S3 as a migration target. This capability enables you to extract information from any database supported by AWS DMS and write it to Amazon S3 in a format that can be used by almost any application. You can extract the entire database and replicate ongoing changes, including additions, deletions, and updates, using change data capture (CDC) technology. You can even process the changes with AWS Lambda or Amazon Kinesis Firehose. By using Lambda or Firehose, you can extend the capabilities of AWS services to existing database environments, both those within AWS and in other locations.

Overview of the example

Let’s consider an example that shows how to bring these services together. We use AWS Database Migration Service to migrate Twitter statistics produced by the City of Seattle, Washington, from Amazon RDS for PostgreSQL to Amazon S3, and we use AWS Lambda for analysis. We see how DMS handles both the migration of the initial database contents and ongoing change data capture. Here’s a diagram of what we plan to do.

We will do the following:

  1. Create two datasets containing Twitter account statistics. One initializes a database table. The other one performs subsequent updates to that table, processed by the change data capture feature.
  2. Create an Amazon RDS for PostgreSQL database instance, define a table, and load the first dataset (a sketch follows this list).
  3. Create an Amazon S3 bucket to serve as the target.
  4. Create an AWS Lambda function to analyze S3 objects upon creation.
  5. Configure and invoke the AWS Database Migration Service to migrate the initial dataset that we just loaded into RDS to the S3 bucket. Doing this demonstrates the initial load phase of DMS and shows AWS Lambda processing the dataset.
  6. Copy the second dataset into the database. Doing this demonstrates the change data capture feature of DMS and shows AWS Lambda processing the dataset.
  7. Examine the contents of the Amazon S3 bucket to see the results of both the initial migration and the change data capture.
  8. View the results of the analysis in Amazon CloudWatch.
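
For step 2, a minimal sketch of the table definition and initial load in psql might look like this. The table name, columns, and CSV file name are hypothetical stand-ins for the Seattle Twitter statistics dataset.

-- Hypothetical table for the Twitter account statistics (step 2).
CREATE TABLE twitter_stats (
    account     varchar(100) NOT NULL,
    observed_on date         NOT NULL,
    followers   integer,
    tweets      integer,
    PRIMARY KEY (account, observed_on)
);

-- Load the first dataset from a local CSV file with psql's client-side \copy.
\copy twitter_stats FROM 'initial_dataset.csv' CSV HEADER

The primary key also gives AWS DMS a way to apply the updates from the second dataset during change data capture.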

Prerequisites and assumptions

You will need the following:

  • An AWS account that provides access to the services shown in the diagram.
  • Working knowledge of these services.
  • A utility that connects to a PostgreSQL database, such as psql.

Additionally:

  • We will configure all services in the same VPC and region to simplify networking considerations.
  • The VPC must have an S3 endpoint.

(more…)

How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS

Shree Kenghe is a solutions architect at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to help you migrate your existing Oracle data warehouse to Amazon Redshift. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Migrating your Oracle data warehouse to Amazon Redshift can substantially improve query and data load performance, increase scalability, and save costs.

When migrating your database from one engine to another, you have two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. AWS has tools to help do both. You can convert schema and code with AWS SCT and migrate data with AWS DMS. AWS DMS helps you migrate your data easily and securely with minimal downtime.

Let’s take a look at the architecture if you are migrating your data warehouse from Amazon RDS for Oracle to Amazon Redshift.

AWS SCT is a client application that can be installed on a workstation, laptop, or Amazon EC2 instance. AWS DMS creates a replication instance that runs on an Amazon EC2 instance in a VPC. You use this replication instance to perform the database migration. The replication instance to be used by DMS, the RDS Oracle database, and Amazon Redshift should all be in the same region. Ideally, they should all be in the same account within the same Availability Zone.

Now that we know how the architecture looks, let’s take a look at the steps to take.

(more…)

Set Up AWS DMS for Cross-Account Migration

Hemant Borole is a big data consultant at Amazon Web Services.

AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS easily and securely. AWS DMS supports homogeneous migrations such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora.

For customers who use multiple AWS accounts, the source and target databases may exist on different accounts. For AWS DMS to work with source and target databases in different accounts, it must be able to communicate across accounts.

The following are some of the options you can use to enable cross-account AWS DMS communication:

  • Making database endpoints accessible over the Internet
  • Virtual Private Cloud (VPC) peering
  • Whitelisting the AWS DMS public IP in the security group

Exposing database endpoints over the Internet is not a secure option because it makes the databases vulnerable to security attacks.

VPC peering is a commonly adopted practice. A VPC is a virtual network (a CIDR block of IP addresses) allocated to an AWS account. VPC peering allows two VPCs to connect as if they are on the same network. However, peering requires that the two VPCs have CIDR blocks that do not overlap. Customers might also have compliance restrictions that prevent the VPCs from peering.

Solution overview

The use case discussed in this blog post requires the customer to continuously replicate data changes on Amazon Aurora to Amazon Redshift. We use AWS DMS to achieve this replication. AWS DMS uses replication endpoints to migrate data from a source endpoint to a target endpoint. The source and target endpoints can reside in the same AWS account or in different AWS accounts. (Setting up the databases is outside the scope of this post.)

AWS DMS uses a replication task to manage replication between the source and the target endpoint. You can configure the replication as a one-time migration or as ongoing replication. The task can also restrict replication to certain schemas and tables from the source database. With the AWS DMS task settings, you can configure which schemas and tables are included in or excluded from replication.

The AWS DMS task uses mapping rules, a JSON definition that allows you to configure this mapping. Here is an example of the mapping rules.

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-action": "include",
      "object-locator": {
        "schema-name": "%",
        "table-name": "%"
      },
      "rule-name": "1"
    }
  ]
}

(more…)

Migrating Oracle Database from On-Premises or Amazon EC2 Instances to Amazon Redshift

Ballu Singh and Pubali Sen are solutions architects at Amazon Web Services.

AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS easily and securely. AWS DMS can migrate your data to and from most widely used commercial and open-source databases. The service supports homogeneous migrations such as Oracle to Oracle. It also supports heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora or Microsoft SQL Server to MySQL. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

Data replication with AWS Database Migration Service integrates tightly with the AWS Schema Conversion Tool (AWS SCT), simplifying heterogeneous database migration projects. You can use AWS SCT for heterogeneous migrations, and you can use the schema export tools native to the source engine for homogeneous migrations.

In this blog post, we focus on migrating the data from Oracle Data Warehouse to Amazon Redshift.

In the past, AWS SCT couldn’t convert custom code, such as views and functions, from Oracle Data Warehouse to a format compatible with Amazon Redshift. To migrate views and functions, you had to first convert the Oracle Data Warehouse schema to PostgreSQL. Then you’d apply a script to extract views and functions that are compatible with Amazon Redshift.

After an update based on customer feedback, we’re happy to let you know that with AWS SCT and AWS DMS, you can now migrate Oracle Data Warehouse to Amazon Redshift along with views and functions.

The following diagram illustrates the migration process.

(more…)