Category: Schema Conversion Tool (SCT)


How to Migrate Your Data Warehouse to Amazon Redshift Using the AWS Schema Conversion Tool Data Extractors

Michael Soo is a database engineer at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) data extractors to migrate your existing data warehouse to Amazon Redshift. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Migrating your data warehouse to Amazon Redshift can substantially improve query and data load performance, increase scalability, and save costs. AWS provides the Schema Conversion Tool free of charge to help you migrate your legacy data warehouse to the cloud.

Customers usually use SCT to migrate their schema objects (table definitions, indexes, triggers, and other execution logic) from legacy data warehouse platforms. SCT has also been enhanced to migrate the actual data records. This feature, called the SCT data extractors, can migrate your data from these legacy platforms (Oracle, Greenplum, Netezza, Teradata, Microsoft SQL Server, or Vertica) using a parallel, secure, and fault-tolerant architecture.

Typically, there are two stages to migrating a data warehouse. In the first stage, a full copy of the legacy data is replicated to the target system. We call this stage the “full load.” Because the full load can take a considerable amount of time, depending on how much data there is to move, any changes that occur on the source in the meantime are usually captured in a subsequent change data capture stage.

The SCT extractors are designed to facilitate the full load process and include two key features to support it:

  • Quickly move large numbers of records using multi-threaded data extraction and load processes
  • In case of failure, an extraction process can restart from the point of failure, rather than from the beginning of the entire data extract

For the second stage, SCT supports point-in-time data extracts so that “change deltas” since the full load can be captured and migrated separately.

Let’s take a look at the architecture of the SCT extractors and how they can help you accomplish the full data load process.

Architecture
You can install the SCT extractors on any server machines with connectivity to your source data warehouse. The SCT client application issues commands to the extractors to pull data from the source system, upload the data to Amazon S3, and copy the data into Amazon Redshift. You can install multiple extractors on the same machine, or separate them as needed across many machines. The best arrangement depends on your available server resources and network capacity. In general, a best practice is to locate the data extractors close to your data warehouse to reduce network traffic and latency.

The extractors provide a fully managed framework for your data warehouse migration. The data extractors write their extract files to local storage. The extract files are compressed, using lossless LZO compression, then copied to an S3 bucket. The files are compressed to reduce network transfer time and to reduce storage costs on S3. After the files are uploaded to S3, the extractor triggers Amazon Redshift to copy the data into the target tables using the standard Amazon Redshift COPY command. Each of these steps can be repeated if necessary with no unexpected consequences on the target database.
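
As a rough illustration of the last step, the following is a minimal sketch of the kind of COPY statement that loads compressed extract files from Amazon S3 into a Redshift target table. The schema and table name, bucket, prefix, IAM role, and file format options are placeholders; the exact options the extractor issues depend on its configuration.

-- Load LZO-compressed extract files from S3 into a Redshift table
-- (placeholder object names, bucket, and IAM role)
COPY dw_schema.sales_fact
FROM 's3://example-extract-bucket/sct-extracts/sales_fact/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
DELIMITER '|'
LZOP                       -- the extract files are LZO-compressed
REGION 'us-east-1';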

For security purposes, SCT supports encrypted communication using Secure Sockets Layer (SSL) between the SCT client and the extractors. If you want to learn more about using SSL with the extractors, a companion blog post, How to Use SSL with the AWS Schema Conversion Tool Data Extractors, explains that process. However, before moving on to SSL, we recommend that you read this post first to understand the overall extractor architecture and setup.

Now that we know what the architecture looks like, let’s take a look at the steps involved in setting up the data extractors.

The scenario we work through in this post is migrating an Oracle data warehouse to Amazon Redshift. We assume that you’ve already converted your source schema to Amazon Redshift and applied it on the target database. We won’t try to cover all possible migration scenarios—our intent is to give you a deep dive into the technology so you can confidently manage your migration projects.

Our Oracle database is hosted in Amazon RDS, but the same migration considerations apply to on-premises sources. We’ll install two extractor agents running on Red Hat Linux on Amazon EC2 instances. We’ll only work through the configuration of one extractor; the process is the same for both. The servers are configured as EC2 t2.medium instances (two virtual CPUs, 4 GB of memory, and 100 GB of local storage). The instance size values can be tuned for your particular data volume, latency, and performance requirements.

The extractors are controlled by an SCT client running on a Mac laptop. Because the client offloads data processing chores to the extractors, a small desktop or laptop is perfectly suited to run the client.

A great feature of the SCT client is that it doesn’t need to be online while the data extractors are running. The SCT client can reconnect to the extractors at any time by opening the specific migration project. One client can manage many extraction projects. The data extractors continue to operate even if the SCT client is unreachable for any reason.

(more…)

Strategy for Migrating Partitioned Tables from Oracle to Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL Compatibility

Sandeep Kariro is a big data architect and Akm Raziul is a database architect at Amazon Web Services.

Enterprises have been using closed-source database systems for more than four decades. Managing data with these systems has been a critical part of running a successful business in every organization. You have to make a considerable investment up front to obtain the required hardware and enterprise licenses before you can even start building the system. With cloud computing gaining momentum in recent years and technology maturing in the open-source world, interest has been growing in moving critical traditional online transaction processing (OLTP) database workloads to open-source systems such as PostgreSQL and MySQL.

Migrating one database system to another requires significant effort and planning. Even though the basics around the database system generally stay the same, implementation and operational activity on each of them can vary quite a bit.

The AWS Schema Conversion Tool (AWS SCT) helps migrate Oracle schemas and custom stored procedure code to a PostgreSQL-compatible target with minimal effort. AWS SCT generates a migration assessment report that clearly identifies objects and code that require manual conversion or rework. One critical need for high-volume OLTP applications is to have the data partitioned for performance reasons. During the initial assessment, AWS SCT highlights the Oracle partitions that require manual intervention, because current versions of PostgreSQL (as of 9.6.2) manage partitioned tables differently from Oracle, SQL Server, or MySQL.

In this post, we discuss how to convert Oracle Database native range and list partitions to PostgreSQL native range and list partitions.

Range and list partition definitions in Oracle
Range: Each partition holds a range of values based on the partition key. For example, you can partition data based on a date field, with each partition holding data for a date range such as a week, a month, or a quarter. The first partition of the table is always open-ended toward lower values. You can optionally make the last partition open-ended as well by defining its upper bound as MAXVALUE. A range partition key can include up to 16 columns.

The following is an example of a table created with a range partition on create date. The table is created with three partitions. The first partition holds all data with a create date earlier than 01-JAN-2017. The second partition holds all data with a create date between 01-JAN-2017 and 31-DEC-2017. The third partition holds all remaining data, with a create date after 31-DEC-2017.

CREATE TABLE person 
	(id NUMBER(10) NOT NULL
	,name VARCHAR2(50)
	,gender VARCHAR2(1)
	,create_dt DATE DEFAULT sysdate
	)
PARTITION BY RANGE(create_dt) (
  partition p1 values less than (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')), 
  partition p2 values less than (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')), 
  partition p3 values less than (MAXVALUE)
);
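
For comparison, the following is a minimal sketch of how a similar layout can be built in PostgreSQL 9.6, where declarative partitioning is not yet available, using table inheritance, CHECK constraints, and an insert trigger. The object names mirror the Oracle example but are otherwise illustrative.

-- Parent table and child "partitions" (PostgreSQL 9.6, inheritance-based)
CREATE TABLE person (
    id        NUMERIC(10) NOT NULL,
    name      VARCHAR(50),
    gender    VARCHAR(1),
    create_dt DATE DEFAULT CURRENT_DATE
);

CREATE TABLE person_p1 (CHECK (create_dt < DATE '2017-01-01')) INHERITS (person);
CREATE TABLE person_p2 (CHECK (create_dt >= DATE '2017-01-01' AND create_dt < DATE '2018-01-01')) INHERITS (person);
CREATE TABLE person_p3 (CHECK (create_dt >= DATE '2018-01-01')) INHERITS (person);

-- Trigger function that routes each insert to the matching child table
CREATE OR REPLACE FUNCTION person_insert_trigger() RETURNS TRIGGER AS $$
BEGIN
    IF NEW.create_dt < DATE '2017-01-01' THEN
        INSERT INTO person_p1 VALUES (NEW.*);
    ELSIF NEW.create_dt < DATE '2018-01-01' THEN
        INSERT INTO person_p2 VALUES (NEW.*);
    ELSE
        INSERT INTO person_p3 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- the row is stored in a child table, not the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER person_insert
    BEFORE INSERT ON person
    FOR EACH ROW EXECUTE PROCEDURE person_insert_trigger();

With constraint exclusion enabled, queries that filter on create_dt skip the child tables whose CHECK constraints rule them out.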

(more…)

How to Migrate Your Oracle Database to PostgreSQL

Knievel Co is a database engineer at Amazon Web Services.

This blog post provides an overview on how you can migrate your Oracle database to PostgreSQL. The two biggest parts of a database migration are the schema conversion and the data replication. We go over how to use the AWS Schema Conversion Tool (AWS SCT) and the AWS Database Migration Service (AWS DMS) to tackle these two parts.

Before we cover SCT and DMS, you should take some preliminary steps that have proven helpful in every migration. One way to make the migration easier is to have what is usually called a modernization phase before the migration. This phase involves taking an inventory of objects in your Oracle database and then making a few decisions.

First, deprecate any objects that are no longer needed. Don’t waste time migrating objects that no one cares about. Also, purge any historical data that you no longer need. You don’t want to waste time replicating data you don’t need, for example temporary tables and backup copies of tables from past maintenance. Second, move flat files and long strings stored in LOBs, CLOBs, LONGs, and so on into Amazon S3 or Amazon DynamoDB. This process requires client software changes, but it reduces the complexity and size of the database and makes the overall system more efficient. Last, try to move any PL/SQL packages and procedures, especially those containing business logic, back to the client software. These objects need to be changed manually anyway if SCT cannot translate them.

The following steps are suggested for migrating to a different database engine (Oracle to PostgreSQL in this case). If you aren’t moving to a different platform, you might find more appropriate native tools or other techniques to move your database.

  1. Create your schema in the target database.
  2. Drop foreign keys and secondary indexes on the target database, and disable triggers (a sketch of these statements follows this list).
  3. Set up a DMS task to replicate your data – full load and change data capture (CDC).
  4. Stop the task when the full load phase is complete, and recreate foreign keys and secondary indexes.
  5. Enable the DMS task.
  6. Migrate tools and software, and enable triggers.
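
As an illustration of steps 2, 4, and 6, the following is a minimal sketch of the kind of statements you might run on the PostgreSQL target. The table, constraint, and index names are placeholders; in practice you would generate these statements from the catalog for every table being migrated.

-- Step 2: drop a foreign key and a secondary index, and disable triggers
-- on one table before the DMS full load (placeholder object names)
ALTER TABLE orders DROP CONSTRAINT orders_customer_id_fkey;
DROP INDEX IF EXISTS orders_order_date_idx;
ALTER TABLE orders DISABLE TRIGGER ALL;

-- Step 4: recreate the constraint and index after the full load completes
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_id_fkey
    FOREIGN KEY (customer_id) REFERENCES customers (id);
CREATE INDEX orders_order_date_idx ON orders (order_date);

-- Step 6: re-enable triggers at cutover
ALTER TABLE orders ENABLE TRIGGER ALL;

Dropping these objects during the full load avoids the per-row cost of constraint and index maintenance and prevents triggers from firing against replicated changes.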

Create your schema in the target database

Begin your migration by taking a look at the schema that you want to migrate. In our case, we use the AWS Schema Conversion Tool (AWS SCT) to perform the analysis. When you start the application, you need to create a new project, with the source being Oracle and the target being PostgreSQL. When you’re connected, select the name of the schema that you want to migrate on the left side. Right-click the schema name and choose Convert Schema. Then choose View / Assessment Report View.

The AWS SCT assessment report is a high-level overview of the effort required to convert your Oracle database to PostgreSQL. The following is an example of what the assessment report looks like:

(more…)

Migrate Your Procedural SQL Code with the AWS Schema Conversion Tool

Wendy Neu is a big data architect at Amazon Web Services.

Database administrators and developers rely on relational databases to store data for applications. As Forbes noted in 2016, the development of open source offerings for database management systems like PostgreSQL is causing a growing number of enterprises to migrate to lower-cost solutions for their data storage. The move to the cloud often provides an excellent opportunity to make such a change.

If your system contains a lot of business logic in the database, it can be quite a large task to translate between SQL dialects line by line. Fortunately, Amazon Web Services offers the AWS Schema Conversion Tool (AWS SCT), a standalone application with a project-based user interface that does the heavy lifting of translating SQL dialects for you.

Using AWS SCT
This post demonstrates how easy it is to migrate procedural code from Oracle to PostgreSQL using AWS SCT. It also shows how to spot the differences between the two dialects and how to modify the code when needed. You can use AWS SCT to generate a high-level overview of the effort required to convert an Oracle database to Amazon RDS for PostgreSQL. In the example, the tool is running on Windows, but it can run on a number of different operating systems. To download AWS SCT, see Installing and Updating the AWS Schema Conversion Tool. For general information about the tool, start with What Is the AWS Schema Conversion Tool?

The example uses an Oracle database on Amazon EC2 and the well-known SCOTT/TIGER database as a starting point. Three stored procedures are added to demonstrate how AWS SCT handles common dialect translations. For information about installing the base SCOTT schema objects, see the Oracle FAQ wiki. You can download additional Oracle procedures and the translated PostgreSQL examples from this repository.
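
To give a flavor of the dialect differences involved, here is a small hypothetical procedure against the SCOTT EMP table (it is not one of the three procedures from the repository), shown first in Oracle PL/SQL and then in an equivalent PostgreSQL PL/pgSQL form.

-- Oracle PL/SQL (hypothetical example)
CREATE OR REPLACE PROCEDURE get_emp_count (
    p_deptno IN  NUMBER,
    p_count  OUT NUMBER)
AS
BEGIN
    SELECT COUNT(*)
      INTO p_count
      FROM emp
     WHERE deptno = p_deptno
       AND NVL(comm, 0) >= 0;
END get_emp_count;
/

-- A PostgreSQL equivalent: the procedure becomes a function,
-- NUMBER maps to numeric, and NVL becomes COALESCE
CREATE OR REPLACE FUNCTION get_emp_count (
    p_deptno numeric,
    OUT p_count bigint)
AS $$
BEGIN
    SELECT COUNT(*)
      INTO p_count
      FROM emp
     WHERE deptno = p_deptno
       AND COALESCE(comm, 0) >= 0;
END;
$$ LANGUAGE plpgsql;

AWS SCT produces this kind of translation automatically where it can and flags the statements that it cannot convert.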

Prerequisites
The example in this post requires the following:

  • A source Oracle database with SCOTT schema objects installed
  • Three additional stored procedures:
    Procedure         Source file
    emp_by_job        oracle-emp-by-job.sql
    get_emp_info      oracle-get-emp-info.sql
    employee_report   oracle-employee-report.sql
  • Target Amazon RDS for PostgreSQL database
  • AWS SCT installed

Generate an assessment report
Typically, the first step in a migration is to understand its feasibility and the effort it will require. You can use AWS SCT to generate a high-level overview of the work ahead. In this example, open the context (right-click) menu for the SCOTT schema, and choose Create Report. The assessment report provides a high-level summary of how much of the schema can be converted automatically from Oracle to PostgreSQL, and the work that will be left after conversion.

The following screenshot shows the results of the assessment.

(more…)

How to Migrate Your Oracle Data Warehouse to Amazon Redshift Using AWS SCT and AWS DMS

Shree Kenghe is a solutions architect at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to help you migrate your existing Oracle data warehouse to Amazon Redshift. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Migrating your Oracle data warehouse to Amazon Redshift can substantially improve query and data load performance, increase scalability, and save costs.

When migrating your database from one engine to another, you have two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. AWS has tools to help do both. You can convert schema and code with AWS SCT and migrate data with AWS DMS. AWS DMS helps you migrate your data easily and securely with minimal downtime.

Let’s take a look at the architecture if you are migrating your data warehouse from Amazon RDS for Oracle to Amazon Redshift.

AWS SCT is a client application that can be installed on a workstation, laptop, or Amazon EC2 instance. AWS DMS creates a replication instance that runs on an Amazon EC2 instance in a VPC. You use this replication instance to perform the database migration. The replication instance used by DMS, the RDS Oracle database, and Amazon Redshift should all be in the same region. Ideally, they should all be in the same account and the same Availability Zone.

Now that we know how the architecture looks, let’s take a look at the steps to take.

(more…)

Introducing Data Extractors in AWS Schema Conversion Tool Version 1.0.602

Eran Schitzer is a product manager at Amazon Web Services.

We are pleased to announce the recent release of version 1.0.602 of the AWS Schema Conversion Tool (AWS SCT). The new version includes a new key capability—data migration agents that extract data from Oracle, Microsoft SQL Server, Teradata, IBM Netezza, Greenplum, and HPE Vertica data warehouses, then prepare, optimize, and upload it to Amazon Redshift.

The new data migration agents are locally installed agents designed to extract data from data warehouses based on Oracle version 11, Microsoft SQL Server version 2008 and later, Teradata version 14.10 and later, IBM Netezza version 7.0.3 and later, Greenplum version 4.3 and later, and HPE Vertica version 7.2.2 and later. When data is extracted, the AWS Schema Conversion Tool optimizes it for Amazon Redshift and saves it in local files. AWS SCT can automatically upload these files to Amazon Redshift, or you can transfer them to an Amazon Simple Storage Service (Amazon S3) bucket, where they can be loaded into Amazon Redshift manually. For large datasets, you can use AWS Snowball to ship the data to AWS.

The migration agents are designed to extract data in parallel and independently. For example, with a large Oracle data warehouse, SCT automatically distributes work among all available migration agents. SCT manages the available agents to extract the data from the different partitions and tables in the schema in the way that is most optimized for Amazon Redshift, consolidate the data, and save it in local files. The migration agents work completely independently of SCT, and you can replace them, if needed, without losing any work. This unique independent parallel execution capability not only accelerates data extraction, but can also withstand the failure of one of the agents.

The new migration agents are supported on Red Hat and Debian Linux distributions, and the SCT client can run on Windows, Mac, Fedora, or Ubuntu. For more information on installation, security settings, and configuration, see Installing Migration Agents in the AWS SCT User Guide.

When the migration agents are installed and configured, you manage them by using the AWS Schema Conversion Tool. The migration agents act as listeners. When they receive instructions from the AWS Schema Conversion Tool, they extract data from your data warehouse. You can find instructions on how to register the migration agents with your AWS Schema Conversion Tool project in Managing Migration Agents in the AWS SCT User Guide.

After your migration extraction tasks complete, your data is ready. The migration agents save your data as files that are compressed, but not encrypted. The migration agents can copy the data files to your Amazon S3 bucket over an Internet connection. Alternatively, you can copy the files to an Amazon S3 bucket using AWS Snowball. Copy the data from the output folder, under the working folder that you specified for your agents.

After you upload your data files to Amazon S3, the data is ready to be loaded into Amazon Redshift. The extraction task can upload the data to Amazon Redshift automatically, or you can use the Amazon Redshift COPY command to load the data yourself. For more information, see Loading Data from Amazon S3 in the Amazon Redshift documentation.

For more information on migration agents, see Using Data Migration Agents in the AWS SCT User Guide.

For more information about version 1.0.602, see the AWS Schema Conversion Tool User Guide. You can download SCT from here.

Migrating Oracle Database from On-Premises or Amazon EC2 Instances to Amazon Redshift

Ballu Singh and Pubali Sen are solutions architects at Amazon Web Services.

AWS Database Migration Service (AWS DMS) helps you migrate databases to AWS easily and securely. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases. The service supports homogenous migrations such as Oracle to Oracle. It also supports heterogeneous migrations between different database platforms, such as Oracle to Amazon Aurora or Microsoft SQL Server to MySQL. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database.

Data replication with AWS Database Migration Service integrates tightly with the AWS Schema Conversion Tool (AWS SCT), simplifying heterogeneous database migration projects. You can use AWS SCT for heterogeneous migrations, and the schema export tools native to the source engine for homogeneous migrations.

In this blog post, we focus on migrating the data from Oracle Data Warehouse to Amazon Redshift.

In the past, AWS SCT couldn’t convert custom code, such as views and functions, from Oracle Data Warehouse to a format compatible with Amazon Redshift. To migrate views and functions, you had to first convert the Oracle Data Warehouse schema to PostgreSQL. Then you’d apply a script to extract views and functions that are compatible with Amazon Redshift.

After an update based on customer feedback, we’re happy to let you know that with AWS SCT and AWS DMS, you can now migrate Oracle Data Warehouse to Amazon Redshift along with views and functions.

The following diagram illustrates the migration process.

(more…)

How to Migrate Your Oracle Database to Amazon Aurora

Ed Murray is a manager at Amazon Web Services.

This blog post gives you a quick overview of how you can use the AWS Schema Conversion Tool (AWS SCT) and AWS Database Migration Service (AWS DMS) to facilitate and simplify migrating your commercial database to Amazon Aurora. In this case, we focus on migrating from Oracle to the MySQL-compatible Amazon Aurora.

Changing database engines can be daunting. However, the value proposition of a highly scalable, cost-effective, fully managed service such as Amazon Aurora can make the challenge worth it—especially when there are tools to simplify the process. When migrating your database from one engine to another, there are two major things to consider: the conversion of the schema and code objects, and the migration and conversion of the data itself. Fortunately, AWS has tools to facilitate both the conversion and migration of your database.

The AWS Schema Conversion Tool helps simplify heterogeneous database migrations by automatically converting your source database schema and a majority of the custom code to a format compatible with your new target database. The custom code that the tool converts includes views, stored procedures, and functions. Any code that the tool cannot convert automatically is clearly marked so that you can convert it yourself. The AWS Database Migration Service helps you migrate your data easily and securely with minimal downtime.

Great! So where do we begin?

Working with AWS SCT
Usually, the first step in every migration is an assessment of feasibility and effort. You can use AWS SCT to generate a high-level overview of the effort required to convert your Oracle database to Aurora. The SCT runs on a number of operating systems. For the purposes of this blog, we’ll run the tool on Windows. To download a copy of the SCT, see the Installing and Updating the AWS Schema Conversion Tool documentation. To find overall documentation for SCT, start with What Is the AWS Schema Conversion Tool?

Although this blog post won’t cover the installation and configuration of the SCT, it’s important to point out that you need to install drivers for Oracle and MySQL to connect the SCT to your source and target databases. After connecting to your Oracle source database, you can right-click on any given schema and generate an assessment report. The assessment report tells you at a very high level how much of the schema can be converted automatically from Oracle to Aurora and the work that is left after conversion. You can find an example report following.


(more…)

Introducing AWS Schema Conversion Tool Version 1.0.502

Eran Schitzer is a product manager at Amazon Web Services.

We are pleased to announce the recent release of version 1.0.502 of the AWS Schema Conversion Tool (AWS SCT). Two key capabilities included with this release are the use of Amazon Aurora with PostgreSQL compatibility as a target and integration with AWS Database Migration Service (AWS DMS).

Amazon Aurora with PostgreSQL compatibility as a target
Following the announcement of the Amazon Aurora with PostgreSQL compatibility preview at re:Invent, you can now use AWS SCT to convert your schema for use with a PostgreSQL-compatible Aurora DB cluster.

The PostgreSQL-compatible edition of Amazon Aurora delivers up to two to three times the performance of PostgreSQL, along with other PostgreSQL capabilities valuable to customers—user-defined functions and data types, complex SQL support, NoSQL and JSON support, and broad application development language support. At the same time, it takes advantage of the scalability, durability, and security capabilities of Amazon Aurora, all for one-tenth the cost of commercial-grade databases such as Oracle Database and Microsoft SQL Server. To sign up for the preview, see the preview webpage.

For more information about converting database schemas, see Converting Database Schema to Amazon RDS by Using the AWS Schema Conversion Tool.

Integration with DMS
After converting a schema with SCT, you can have the associated endpoints and tasks created automatically on a DMS instance. You can run and monitor tasks from either the DMS console or SCT.

The following procedure shows how to create a DMS task from SCT. Once the schema conversion is complete, the DMS task created through the SCT will migrate the data to the target database engine.

To create and access DMS tasks from SCT, you need to define at least one service profile. You can define one or more service profiles to let SCT access your resources running on AWS, including DMS, Amazon RDS, and AWS Lambda.

  1. Open SCT, choose Settings, and then choose Global settings.
  2. On the AWS Service Profiles tab, type a name for Profile name, enter your AWS access key and AWS secret key, and choose the region where the profile will be created.
  3. Choose Save to save the profile.
  4. Choose Current project settings, and then choose the Project Environment tab. For AWS Service Profile, choose the default profile for the current project, and then choose OK. Once you have defined an AWS service profile, you can define, access, and execute DMS tasks from SCT.
  5. Choose the converted schema you’d like to migrate, open the context (right-click) menu for it, and choose Create DMS Task.
  6. In the Create DMS task window, enter the task details—task name, replication instance, and source and target endpoints (you can also create new endpoints). You can change the defaults, if needed, for the rest of the parameters.
  7. Choose Create.
    At this point, a DMS task is being created in AWS, as shown in the DMS console.

    You can view, start, stop, resume, delete, refresh, and show logs of the new DMS task in the DMS tasks list in both SCT and DMS.


For more information about version 1.0.502, see the AWS Schema Conversion Tool User Guide. You can download SCT from here.

Cross-Engine Database Replication Using AWS Schema Conversion Tool and AWS Database Migration Service

Prahlad Rao is a solutions architect at Amazon Web Services.

Customers often replicate databases hosted on-premises to the AWS Cloud as a means to migrate database workloads to AWS, or to enable continuous replication for database backup and disaster recovery. You can both migrate and perform continuous replication on the AWS Cloud with AWS Database Migration Service (AWS DMS).

Although moving from on-premises to the cloud using the same database engine is a common pattern, increasingly customers also use different engines between source and target when replicating. This pattern especially occurs where the source is a commercial database engine and the target is open source. Such an approach can save on licensing costs. In other use cases, the target database might be used as read-only for a reporting application that is compatible with a certain database engine, or the application team is comfortable using an open source database for their needs. Or the customer might be migrating from a commercial database to an open source database on the AWS Cloud for cost and flexibility purposes. This blog post will walk you through the process of replicating a Microsoft SQL Server database to PostgreSQL using AWS Schema Conversion Tool (AWS SCT) and AWS DMS.

AWS DMS helps you migrate databases to AWS with relative ease and security. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The service supports homogeneous migrations (like-to-like databases) and also heterogeneous migrations (between database platforms such as Oracle to Amazon Aurora or SQL Server to MySQL). AWS DMS can also be used for continuous data replication with high availability.

AWS SCT makes heterogeneous database migrations easier by automatically converting source database schema. AWS SCT also converts the majority of custom code, including views and functions, to a format compatible with the target database.

Heterogeneous database replication is a two-step process:

  1. Use AWS SCT to convert the source database schema (from SQL Server) to a format compatible with the target database, in this case PostgreSQL (a small type-mapping sketch follows this list).
  2. Replicate data between source and target using AWS DMS.
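
As a small, hypothetical illustration of the kind of type mapping involved in step 1, the following shows a SQL Server table and a PostgreSQL-compatible equivalent. The table is made up for this sketch, and the exact DDL that AWS SCT generates may differ.

-- SQL Server source table (hypothetical)
CREATE TABLE dbo.customer (
    customer_id INT IDENTITY(1,1) PRIMARY KEY,
    full_name   NVARCHAR(100) NOT NULL,
    created_at  DATETIME2 DEFAULT SYSDATETIME()
);

-- A PostgreSQL-compatible equivalent: IDENTITY becomes serial,
-- NVARCHAR becomes varchar, and DATETIME2/SYSDATETIME() become timestamp/now()
CREATE TABLE customer (
    customer_id serial PRIMARY KEY,
    full_name   varchar(100) NOT NULL,
    created_at  timestamp DEFAULT now()
);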

(more…)