AWS Database Blog

Automate the migration of Microsoft SSIS packages to AWS Glue with AWS SCT

When you migrate Microsoft SQL Server workloads to AWS, you might want to automate migration and minimize changes to existing applications, but still use a cost-effective option without commercial licenses and reduce operational overhead. For example, SQL Server workloads often use SQL Server Integration Services (SSIS) to extract, transform, and load (ETL) data. In this case, you can consider AWS Glue Studio as a cloud alternative for your data integration operations. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue Studio is a graphical interface that makes it possible to create, run, and monitor ETL jobs in AWS Glue. To accelerate the migration from SSIS to AWS Glue, you can use the AWS Schema Conversion Tool (AWS SCT).

AWS offers different paths for your migration journey from Microsoft SQL Server to AWS. You can migrate to Amazon Relational Database Service (Amazon RDS) and choose an open-source engine or Amazon Aurora to avoid commercial licenses, backup administration, and data center maintenance. Alternatively, you can use Babelfish for Aurora PostgreSQL, which understands the SQL Server wire protocol, and continue using the existing queries and drivers for your applications. You can also use AWS Database Migration Service (AWS DMS), a managed migration and replication service that helps you move your databases and analytics workloads to AWS quickly and securely.

In this post, we demonstrate how AWS SCT can automate the conversion of your existing SSIS packages to AWS Glue. You can also use AWS SCT to convert SQL Server databases to, for example, Amazon RDS for PostgreSQL. If you're migrating to Babelfish (refer to Migrate from SQL Server to Amazon Aurora using Babelfish for more detailed steps), you can use the SQL/DDL code exported from Microsoft SQL Server together with the Babelfish Compass tool to assess which features aren't compatible with Babelfish for PostgreSQL. After you convert your database schemas and ETL scripts, you can use AWS Glue jobs to perform the same data operations.

Solution overview

We focus on the conversion of SSIS packages that you use to perform ETL operations against the source SQL Server database. The target database for this post is Babelfish for Aurora PostgreSQL, which provides native, semantically correct execution over the TDS wire protocol. This allows you to connect to the target database using the Microsoft SQL Server JDBC driver, which we use for illustration purposes as the connector for AWS Glue. We assume that you have already converted your SQL Server database schemas and applied the converted code to your target database. With the target database configured, you can use AWS SCT to convert your SSIS jobs to AWS Glue Studio jobs that run against your Babelfish target database.

Solution Architecture

An SSIS package includes the necessary components, such as the connection manager, tasks, control flow, data flow, parameters, event handlers, and variables, to run a specific ETL task.

You don’t need an installed SSIS server. AWS SCT reads the project files (files with the extensions .dtsx, .conmgr, or .params) from a local folder and parses them. The conversion process works entirely on these local SSIS files.
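As an illustration of the file types AWS SCT scans, the following sketch lists the SSIS project files in a local folder; the helper name and folder layout are assumptions, not part of AWS SCT:

```python
from pathlib import Path

# File extensions of SSIS project files that AWS SCT parses.
SSIS_EXTENSIONS = {".dtsx", ".conmgr", ".params"}

def ssis_project_files(folder: str) -> list[str]:
    """Return the SSIS project files in a local folder, as AWS SCT would find them."""
    return sorted(
        p.name for p in Path(folder).iterdir()
        if p.suffix.lower() in SSIS_EXTENSIONS
    )
```

Running a check like this against your SSIS project folder before starting a conversion is a quick way to confirm that AWS SCT will find files to parse.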

Example of SSIS Package conversion to AWS Glue

At a high level, the steps are as follows:

  1. Create an AWS Glue custom connector for SQL Server.
  2. Create an AWS Secrets Manager secret.
  3. Create a connection and test your connector.
  4. Add your SSIS packages as the ETL source and AWS Glue as the ETL target to your AWS SCT project.
  5. Set up a mapping rule and configure connections.
  6. Convert your source SSIS packages.
  7. Configure and deploy an AWS Glue Studio job in AWS SCT.
  8. Run the AWS Glue Studio job.


For this walkthrough, you should have the following prerequisites:

Create an AWS Glue custom connector for SQL Server

In AWS Glue, a connector is an optional code package that assists with accessing data stores in AWS Glue Studio. You can use a natively supported data store, a connector from AWS Marketplace, or your own custom connectors.

You can create custom connectors for Spark, Athena, and JDBC data stores. AWS SCT requires a custom connector to replace the existing SSIS connections; we use the Microsoft SQL Server JDBC connector for illustration purposes.

Complete the following steps:

  1. Download the JAR file.
  2. Create an Amazon Simple Storage Service (Amazon S3) bucket or use an existing bucket.
  3. Upload the downloaded JAR file to your S3 bucket.
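Steps 2 and 3 can be sketched with boto3; the bucket name, driver JAR version, and helper function below are assumptions:

```python
# Hypothetical bucket and driver JAR names; replace with your own.
bucket = "my-glue-connectors"
jar_file = "mssql-jdbc-12.4.2.jre11.jar"
key = f"drivers/{jar_file}"

def connector_s3_url(bucket: str, key: str) -> str:
    """Build the S3 URL to enter in the Connector S3 URL field."""
    return f"s3://{bucket}/{key}"

# Uncomment to upload the JAR (requires AWS credentials and boto3):
# import boto3
# boto3.client("s3").upload_file(jar_file, bucket, key)

print(connector_s3_url(bucket, key))
```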

After you upload the JDBC connector to Amazon S3, you can set up a custom connector, which holds the information required to set up a connection, such as the class name and JDBC URL format. After you set up your custom connector, you can create connections to your data sources and data targets in AWS Glue jobs.

  1. On the AWS Glue console, choose Data connections in the navigation pane.
  2. Choose Create custom connector.

AWS Glue Create custom connector

  1. Under Connector S3 URL, enter the S3 path of your JAR file.
  2. Enter the name for your custom connector, in this example SQLServer.
  3. For Connector type, choose JDBC.
  4. Enter com.microsoft.sqlserver.jdbc.SQLServerDriver for Class name.
  5. Enter jdbc:sqlserver://${host}:${port};databaseName=${dbname};user=${username};password=${password} for JDBC URL base.

Adding ${key} placeholders to the URL base allows these values to be replaced when you create the connection.

  1. Enter ; for URL parameter delimiter.
  2. Choose Save changes.

Editor connector - connector properties
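To illustrate how the ${key} placeholders in the JDBC URL base expand, here is a small sketch using Python's string.Template, whose ${name} syntax matches the placeholder format; the host, database, and credential values are hypothetical:

```python
from string import Template

# JDBC URL base as entered in the custom connector.
url_base = Template(
    "jdbc:sqlserver://${host}:${port};databaseName=${dbname}"
    ";user=${username};password=${password}"
)

# Hypothetical values; AWS Glue fills these in from the connection's secret.
resolved = url_base.substitute(
    host="myserver.example.com",
    port="1433",
    dbname="SalesDB",
    username="etl_user",
    password="change-me",
)
print(resolved)
```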

Create a Secrets Manager secret

To safely store sensitive information of your connection string, you can use Secrets Manager, which you can reference when creating your connection in AWS Glue.

  1. On the Secrets Manager console, choose Store a new secret.
  2. For Secret type, choose Other type of secret.
  3. Enter the key-value pairs for host, port, dbname, username, and password.
  4. Choose Next.
  5. For Secret name, enter the name for your secret, in this example SQL_SERVER_2014.
  6. Choose Next.
  7. On the Review page, review your secret details, then choose Store.

Secret Manager - secret details
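The same secret can be sketched with boto3; the host, database, and credential values below are hypothetical:

```python
import json

# Hypothetical connection values for the SQL_SERVER_2014 secret.
secret_value = {
    "host": "myserver.example.com",
    "port": "1433",
    "dbname": "SalesDB",
    "username": "etl_user",
    "password": "change-me",
}
secret_string = json.dumps(secret_value)

# Uncomment to store the secret (requires AWS credentials and boto3):
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="SQL_SERVER_2014", SecretString=secret_string)

print(secret_string)
```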

Create a connection for your custom connector

After you create your custom connector, you can use it to create connections to your specific data sources and targets. To connect to Babelfish for Aurora PostgreSQL with the SQL Server custom connector, use port 1433 instead of the PostgreSQL port 5432.

  1. On the AWS Glue Studio console, choose Connectors in the navigation pane.
  2. Choose the connector you created (SQLServer).

connector - create new connection

  1. Choose Create connection.
  2. Enter the Name for your connection, in this case SQLServer2014-connection.
  3. For AWS Secret, choose the secret created earlier, in this example SQL_SERVER_2014.
  4. Specify all necessary parameters in the Network options section, and make sure your network configuration allows access to your SQL Server database.
  5. Choose Save changes.

connectors - edit connection

You can test your connection by creating a new AWS Glue job and choosing your custom connector as the source.

AWS Glue New Job
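In such a test job, the read through the custom connector might look like the following sketch; the connection name matches the one created earlier, while the table name and driver options are assumptions:

```python
# Options for reading through the custom JDBC connector.
connection_options = {
    "connectionName": "SQLServer2014-connection",  # connection created earlier
    "className": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "dbTable": "dbo.Orders",  # hypothetical source table
}

# Inside an AWS Glue job script, the read would look like:
# from pyspark.context import SparkContext
# from awsglue.context import GlueContext
# glueContext = GlueContext(SparkContext.getOrCreate())
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="custom.jdbc",
#     connection_options=connection_options,
# )

print(sorted(connection_options))
```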

Add your SSIS packages as the ETL source and AWS Glue as the ETL target to your AWS SCT project

After you have created and tested your custom connections, you can now focus on migrating your SSIS packages using AWS SCT.

First, you need to create a new AWS SCT project. Complete the following steps:

  1. In AWS SCT, choose File, then New project.
  2. Provide a name and location to save the new project.

AWS SCT New Project

Now you can add your source ETL engine and target ETL engine.

  1. Choose Add source.
  2. Select SQL Server Integration Services from the ETL category.

AWS SCT Add Source

  1. Choose Next.
  2. Choose Browse, select your local folder with SSIS packages (files with the extensions .dtsx, .conmgr, or .params) and choose Connect.

AWS SCT Chose ssis source

  1. Choose Add target and select AWS Glue Studio from the category ETL.
  2. Specify your AWS profile. To create a profile, refer to Using the AWS SCT user interface and Storing AWS service profiles in AWS SCT.
  3. Choose Connect.

AWS SCT Add Target

Set up a mapping rule and configure connections

To start a conversion, you need to create a mapping rule. A mapping rule defines the target platform for the conversion of your source database schema. To create a mapping rule for your project, specify SSIS as the source and AWS Glue Studio as the target, then choose Create mapping.

After the mapping view is displayed, select Main View.

If the source SSIS package has connections, use the wizard Configure connections to create a mapping between SSIS and AWS Glue Studio connections. To run the wizard, open the context menu (right-click) on the category Connection managers and choose Configure connections.

To configure the connection mapping, in the Connection managers category, specify the SSIS package that contains the connection, then specify the AWS Glue Studio connection for each corresponding SSIS connection. Use the connection that you created earlier in Create a connection for your custom connector.

Convert your source SSIS packages

AWS SCT converts your selected SSIS packages to JSON files. Each JSON object represents a node in a directed acyclic graph (DAG). You can find your converted files under the Package DAGs node in the target tree.

In the source tree view, find the Packages category, open the context (right-click) menu, and choose Convert package.

AWS SCT Convert Package

The SSIS package is transformed into a DAG JSON file. You can save these files to Amazon S3 by choosing Package DAGs, opening the context menu (right-click), and choosing Save to Amazon S3.

AWS SCT Save to S3

Configure and deploy your AWS Glue Studio job in AWS SCT

To configure and deploy your AWS Glue Studio job, complete the following steps:

  1. In the target tree view, find the category DAGs.
  2. Open the context menu (right-click) for your converted DAG, then choose Configure AWS Glue Studio Job.

AWS SCT Configure job

You can choose multiple converted DAGs.

  1. Modify the list of DAGs that will be used to create AWS Glue Studio jobs.
  2. Specify AWS Glue Studio job basic properties, then choose Next.
    1. Name – Enter the name of your AWS Glue Studio job. The default is the name of the converted DAG.
    2. Script filename – Enter the name of your converted script. The default is the name of the converted DAG.
    3. Job parameters – Enter run-time values that are necessary to run a job.

AWS SCT Basic job properties

  1. Specify advanced job properties (all required properties have default values), then choose Next.
    1. IAM Role – Choose the IAM role that is used for authorization to resources used to run the job and access data stores.
    2. Script file S3 path – Enter the Amazon S3 path to your converted script. The default is the Job folder under the Amazon S3 folder of your AWS Glue Studio connection.
    3. Encrypt script using SSE-S3 – Choose this option to protect data using server-side encryption with Amazon S3-managed encryption keys (SSE-S3).
    4. Temporary directory – Enter the Amazon S3 path to a temporary directory for intermediate results depending on your job.
    5. Generated Python library path – AWS SCT uses this path to automatically generate the path for Python libraries. You can’t edit this automatically generated path. To use additional Python libraries, enter the path in User Python library path.
    6. Worker type – Set the type of predefined worker that is used when a job runs. The default is G.1X.
    7. Requested number of workers – The number of workers you want AWS Glue to allocate to this job. The default is 2.
    8. Max concurrency – Enter the maximum number of concurrent runs that are allowed for this job. The default is 1. AWS Glue returns an error when this threshold is exceeded.
    9. Job timeout (minutes) – Enter the timeout value on your ETL job as a safeguard against runaway jobs. The default is 2880 minutes (48 hours) for batch jobs. If the job exceeds this limit, the job run state changes to TIMEOUT.
    10. Delay notification threshold (minutes) – Enter the threshold in minutes before AWS SCT sends a delay notification.
    11. Number of retries – Enter the number of times, from 0 to 10, that AWS Glue should automatically restart the job if it fails. Jobs that reach the timeout limit are not restarted. The default is 0.

AWS SCT Advanced job properties

  1. Choose Finish.
  2. In the target tree view, find and expand ETL Jobs.
  3. Open the context menu (right-click) for the ETL job that you configured and choose Create AWS Glue Studio Job.

AWS SCT Create Glue Studio Job

The Create AWS Glue Studio Job option is available on the root category ETL Jobs as well as on the single object. You can create multiple AWS Glue Studio jobs at the same time.
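The job properties above correspond to parameters of the AWS Glue CreateJob API. The following boto3 sketch shows the default values from the walkthrough; the job name, role ARN, and script location are assumptions:

```python
# Hypothetical job name, role ARN, and script location.
job_args = {
    "Name": "ssis-package-job",
    "Role": "arn:aws:iam::123456789012:role/GlueJobRole",
    "Command": {
        "Name": "glueetl",
        "ScriptLocation": "s3://my-glue-connectors/Job/ssis-package-job.py",
        "PythonVersion": "3",
    },
    "WorkerType": "G.1X",                           # default worker type
    "NumberOfWorkers": 2,                           # requested number of workers
    "Timeout": 2880,                                # job timeout in minutes
    "MaxRetries": 0,                                # number of retries
    "ExecutionProperty": {"MaxConcurrentRuns": 1},  # max concurrency
}

# Uncomment to create the job (requires AWS credentials and boto3):
# import boto3
# boto3.client("glue").create_job(**job_args)

print(job_args["Name"])
```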

Run the AWS Glue Studio job

After your job has been created, it’s listed in your jobs list on the AWS Glue Studio console.

From there you can run your converted AWS Glue job and apply any required changes, or, if you prefer, convert and use it as a Spark script.

  1. On the AWS Glue console, choose ETL Jobs in the navigation pane.
  2. Choose the job that was converted.

AWS Glue Select job to run

  1. Run the job by choosing Run.

AWS Glue Studio job run

  1. On the Runs tab, verify the status of the job and access its logs by choosing the run Id.

AWS Glue job status
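You can also start the job and poll its status programmatically. The following sketch uses the boto3 Glue client's get_job_run call; the job name and helper function are assumptions:

```python
import time

def wait_for_run(glue, job_name: str, run_id: str, poll_seconds: int = 10) -> str:
    """Poll an AWS Glue job run until it reaches a terminal state."""
    terminal = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR"}
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)
        state = run["JobRun"]["JobRunState"]
        if state in terminal:
            return state
        time.sleep(poll_seconds)

# Usage (requires AWS credentials and boto3):
# import boto3
# glue = boto3.client("glue")
# run_id = glue.start_job_run(JobName="ssis-package-job")["JobRunId"]
# print(wait_for_run(glue, "ssis-package-job", run_id))
```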


In this post, we showed how to convert SSIS packages to AWS Glue Studio jobs using AWS SCT. This solution can help you automate your ETL package conversion, and is usually part of a database migration from SQL Server to Amazon RDS.

For more information about the AWS SCT, refer to the AWS SCT User Guide.

About the Authors

Felipe Lopes Marins is a Partner Solutions Architect in the Public Sector team at AWS. He helps partners and customers translate their business challenges into technical solutions using AWS. Data analytics and artificial intelligence are some of his new passions, after previously working more on application development with technologies such as containers, serverless, and DevOps.

Oleksandr Fedorenko is a Database Engineer in the AWS DMA team. He helps design and implement successful database migration solutions for AWS SCT projects. He currently leads the ETL conversion area. Oleksandr and his team delivered the SSIS to AWS Glue and AWS Glue Studio conversions as well as the Informatica Redirect project.

Hugo Mineiro is a Senior Analytics Specialist Solutions Architect based in Geneva. He focuses on helping customers across various industries build scalable and high-performant analytics solutions. At AWS he focuses on strategic accounts in the DACH region and helps customers to accelerate their journey to the cloud and innovate using AWS analytics and machine learning services. He loves playing football, and spending time with friends.