AWS Database Blog
Deep dive into Babelfish Compass
Babelfish for Aurora PostgreSQL is a capability for Amazon Aurora PostgreSQL-Compatible Edition that enables Amazon Aurora to understand commands from applications written for Microsoft SQL Server. When migrating from SQL Server to Babelfish for Aurora PostgreSQL, the first step is often a feasibility and compatibility assessment. You can use the Babelfish Compass tool to generate a compatibility assessment report which is useful to migrate your SQL Server-based application to Amazon Aurora PostgreSQL-Compatible Edition using Babelfish. In this post, we dive deep into Babelfish Compass and show how you can install it and run assessment reports.
With Babelfish Compass, you can quickly analyze T-SQL SQL/DDL scripts for compatibility with Babelfish. Babelfish Compass identifies the SQL features that aren’t supported by the current version of Babelfish and creates an assessment report on the SQL/DDL. The purpose of such analysis is to inform a go/no-go decision about whether it makes sense to consider starting a migration project from SQL Server to Babelfish and to estimate the efforts required for the migration.
Babelfish Compass is a standalone tool that comes with the Babelfish product and can be used on-premises. It is a separate tool from Babelfish and has its own GitHub repository. The version being used for this post is Babelfish Compass v.2022-12 and a Windows version, although there may be a more recent version by now. It’s important to note that for any new SQL functionality introduced in a Babelfish version, a new Compass version supporting it, will also be available.
Prerequisites
Before installing the Babelfish Compass tool, complete the following prerequisites:
- Install a 64-bit Java Runtime Environment (JRE) version 8 or higher.
- Install a browser of your preference to view the HTML output of the assessment reports.
- On Mac/Linux, make sure you can run a bash script (for example, with #!/bin/bash).
- Include Java JRE in the environmental path.
For this post, we assume that you have a source SQL Server and the ability to generate DDL files for your databases. Otherwise, you can follow along with the provided .sql scripts and download them from the provided links. However, if you prefer to use your own .sql files, simply replace the file names with your own.
Install and run Babelfish Compass
In the following sections we show you the different methods to install Babelfish Compass, run the tool, and create an assessment report depending on your operating system.
Install Babelfish Compass
To install Babelfish Compass on Windows you must:
- Download and unzip
BabelfishCompass.zip
to an installation directory likeC:\BabelfishCompass
- Rename the existing version folder if necessary
By default, Compass reports are located atC:\Users\username\Documents\BabelfishCompass
The following is an example of verifying Babelfish Compass installation on Windows by running the BabelfishCompass.bat -help
command in the C:\BabelfishCompass
directory:
To install Babelfish Compass on Mac or Linux systems you must:
- Download and unzip
BabelfishCompass.zip
to an installation directory like/home/username/BabelfishCompass
- Rename the existing version folder if necessary
By default, Compass reports are located at/home/username/BabelfishCompassReports
- Verify the BabelfishCompass.sh shell script is executable
If it’s not, grant permissions by running the commandchmod + x BabelfishCompass.sh
Running Babelfish Compass
To run Babelfish Compass on Windows systems:
- Open a cmd prompt and navigate to the Babelfish Compass installation directory
- Run the
BabelfishCompass.bat -help
command
To run Babelfish Compass on Mac or Linux systems:
- Open a bash shell and navigate to the Babelfish Compass installation directory
- Run the
BabelfishCompass.sh -help
command
Creating an assessment report
The following command generates a report called Myreport
for the application Northwind from a single input file.
Windows:
C:\BabelfishCompass>BabelfishCompass.bat Myreport C:\Northwind.sql
It automatically opens the generated assessment report in the default browser.
The following is an example of running an assessment report in Windows for the Northwind application. To generate a report for the Northwind Sample Database, run the following command in Windows:
Mac or Linux:
./BabelfishCompass.sh Myreport /tmp/Northwind.sql
Mac: It automatically opens the generated assessment report in the default browser.
Linux: Manually open the report file mentioned in the standard output.
Babelfish Compass generates a report in the form of an HTML file, which is saved in a directory located at C:\Users\username\Documents\BabelfishCompass\MyReport
in Windows operating system and /home/username/BabelfishCompassReports/MyReport
in Mac/Linux. The root directory of the report is automatically determined by the System.getProperty("user.home")
value in Java. It’s not currently possible to set a user-defined location for the report.
The assessment report named Myreport
contains the analysis for the SQL/DDL script or file named Northwind.sql
. After running the assessment, the report opens automatically in your default browser (except Linux). The HTML file generated would have the Executive Summary section at the top of the HTML file provides a quick view of supported and unsupported features and potential workarounds. It’s recommended to store your SQL/DDL files outside the Babelfish installation directory.
This report has a summary of SQL features found that are supported and unsupported by Babelfish.
Reports are located in C:\Users\Administrator\Documents\BabelfishCompass\
under the Myreport
directory.
In addition to the HTML document, Babelfish Compass also generates two other file types: CSV and text files. The .txt file has the same contents as the .html file but without hyperlinks. The CSV file is a feature for advanced users of Compass that provides a less-formatted subset of the information in the HTML file and can be opened as a spreadsheet, helping users quantify the migration size and effort. The following is a sample CSV file.
Exploring key options in Babelfish Compass
In this section, we explore some of the key features in Babelfish Compass.
Add SQL/DDL scripts to the assessment report
When analyzing Babelfish for an application with multiple dependent databases, it’s possible that you may miss analyzing some of the databases. In such cases, you can use the -add
option to import an additional SQL/DDL script to an existing report, perform an analysis, and generate a report.
For instance, suppose you have already analyzed a database named Northwind and created a report called Myreport
. However, there is another dependent database called Adventure Works that you forgot to analyze and would like to add to the existing report. You can run the following command to import the instawdb.sql
script and add it to the Myreport
report:
Replace an existing input file
The -replace
option allows you to replace an already-imported SQL/DDL script in an existing report and generate a new report after performing an analysis. If a report with the same name already exists, the option overwrites it. See the following code:
Report options in Babelfish Compass
The -reportoption
flag in Babelfish Compass allows for the customization of the final assessment report. Options can be specified using a comma-separated list or multiple -reportoption
flags. By default, the cross-reference feature is turned off to prevent an excessively long report. Options include xref, xref=all, xref=feature,
and xref=object
, which generate cross-references for unsupported or review items ordered by SQL feature or objects. It’s important to note that enabling the xref option for large schemas may result in longer load times for the report. Refer to the documentation for further information on each option.
To generate a cross-reference with the exact location of each non-supported feature, run the following command line:
You can review these using the cross-reference report, which links to the exact location of each non-supported feature. The following screenshot shows the cross-reference section. You can choose the line number, which takes you to the original SQL for the feature, so you can see the exact statement and the context in which it occurs. For example, let’s review the line 5247.
When you choose line 5247 in the report, it highlights the SQL code where PIVOT
table issue is.
Multi-input file run
With Babelfish Compass, you can perform a comprehensive analysis of all databases associated with an application or business unit. This option supports the combined analysis of multiple input scripts and applications, enabling you to generate a report that provides a holistic view of your database landscape. By using the multi-input file run option, you can assess several scripts or SQL files together and identify which applications contribute to each line item.
Additionally, with the -reportoption apps
option, the generated report will show the contribution of each input file.
To run multiple files together using Babelfish Compass, run the following command, which generates a report for multiple input files Northwind.sql
(Northwind) and instawdb.sql
(Adventure Works):
Group supported and unsupported features by app name
With the -reportoption apps
option, you can group multiple files given for assessment by the supported or unsupported features with the app name. This option is helpful when creating a report for multiple applications and wanting to find commonalities in all the applications. See the following code:
For example, in the following report, seven cases of DISABLE TRIGGER
were found, which was common in both applications. This appeared one time in AdventureWorks
and six times in the Northwind
database.
Automatic rewriting of unsupported features
Using the -rewrite
option in Babelfish Compass can help address some SQL features that aren’t currently supported by Babelfish. For instance, you can rewrite the MERGE statement in a way that Babelfish can process it. By specifying the -rewrite
option, the assessment report will include a section with the specific rewritten features, and a subdirectory called rewritten
will be created in the report directory containing the original SQL source file with the rewritten features. When using the -reportoption xref
, the cross-reference links in the rewritten
sections point to the rewritten SQL file.
However, note that using -rewrite
may cause Babelfish Compass to run slower, especially for large files with many rewritten features. In that case, first run an analysis without -rewrite
and then rerun with the -analyze -rewrite
flags when Compass identifies rewrite opportunities in the report. Using the -rewrite
option can be a useful tool to address unsupported SQL features and improve the overall processing of SQL files in Babelfish Compass.
To better understand how the Babelfish Compass -rewrite
feature works, let’s walk through an example using a procedure from the wide-world-importers
sample database available on GitHub. Instead of using the entire database, we focus on this specific feature.
To follow along, you can download the sample SQL file using the following curl command:
First, let’s run Babelfish Compass without the -rewrite
flag to see how it handles the MERGE statement:
As shown in the screenshot, the MERGE statement is categorized as Not Supported
.
Let’s now run Babelfish Compass again, this time with the -analyze
and -rewrite
flags (note that we have excluded the input file name):
The latest run with the specified flags shows that the MERGE statement has been successfully rewritten a series of INSERT, UPDATE or DELETE statements, which Babelfish does support, as depicted in the following screenshot.
Scroll down to the Rewritten by Babelfish Compass
section.
You can also see the rewritten
directory for the SQL rewritten by Babelfish Compass.
Open the SQL file and scroll down to the section where the original MERGE statement was found, and you can see the MERGE statement has been rewritten by Compass.
What is the Babelfish features configuration file?
The BabelfishFeatures.cfg
file is located in the Babelfish Compass installation directory. The compatibility assessment performed by Babelfish Compass tool is driven by this file. This file contains definitions of features that are and aren’t supported in specific Babelfish versions. The contents of this file may sometimes look cryptic and may not always be obvious.
For each Babelfish release, a new BabelfishFeatures.cfg
is also released. In case Babelfish Compass is already installed, the existing version of BabelfishFeatures.cfg
will be overwritten by the newer version of this file.
Don’t edit, modify, or rename the BabelfishFeatures.cfg
file. If you make changes, Babelfish Compass will detect changes and stop immediately.
When you open the BabelfishFeatures.cfg
file, note the following:
- The
rule=
lines are for internal use and don’t have any relevance to the user. Also, you’re not expected to be able to use this .cfg file directly. It’s the Compass tool that produces user-consumable output. - If a feature is listed in this file, by default it is not supported.
- Features like SQL-graph aren’t supported.
- The
complexity_score =
lines in the .cfg file refers to the complexity score that helps Compass users estimate the amount of work required to address unsupported items. - It contains a value LOW, MEDIUM, or HIGH, reflecting a very rough complexity estimate for resolving the item in question.
The features that are not supported may fall into one of these categories:
- Not Supported – This feature is not currently supported by Babelfish
- Review Semantics – This feature involves aspects that can’t be addressed by Babelfish but require review to determine whether or not it requires changes to be made as part of the migration process
- Review Performance – This feature involves a performance-related aspect in SQL Server and therefore it needs to be reviewed carefully to see if performance is impacted when running on Babelfish
- Review Manually – This feature needs to be manually examined, because it can’t be assessed by Babelfish Compass
- Ignored – This feature is currently ignored by Babelfish
Override the default classification and report group
In version 2022-12 of Babelfish Compass, you can specify a user-defined .cfg file to be used with the -userconfigfile
option, while the default file remains BabelfishCompassUser.cfg
. However, using BabelfishCompassUser.cfg
is not recommended for new users. Experienced users can use it to customize their assessment reports by focusing more or less on specific SQL features. BabelfishCompassUser.cfg
contains all the sections present in BabelfishFeatures.cfg
, such as [Datatypes]
or [Built-in functions]
, and you should not modify these section headers, but can add certain items to a section. It’s important to note that any modifications made to the BabelfishCompassUser.cfg
will not be saved or stored by Babelfish Compass, so it must be properly backed up.
If the BabelfishCompassUser.cfg
file doesn’t exist, Babelfish Compass will create it. If new sections have been defined in BabelfishFeatures.cfg
that aren’t yet in BabelfishCompassUser.cfg
, the new sections will be appended. However, if you manually delete sections from BabelfishCompassUser.cfg
, those sections will be appended again the next time Babelfish Compass runs.
It’s essential to keep in mind that user-defined overrides are applied during analysis, and any overridden values are recorded in the captured items. The assessment report is generated from these captured items. Therefore, if you want to apply modified override entries in BabelfishCompassUser.cfg
to a report, the -analyze
flag should be used. When a user-defined override is applied, the captured items will reflect the values after the overrides have been applied, and the original values will be lost.
To illustrate, Babelfish Compass categorizes CLUSTERED
indexes and constraints as Review Semantics
by default. If you don’t consider these aspects and prefer to exclude them from the assessment report, you can modify the classification by adding "default_classification=Ignored"
in BabelfishCompassUser.cfg
:
[CLUSTERED index]
default_classification=Ignored
It’s important to note that the changes made to the classification of SQL features in the Babelfish Compass report have no impact on how Babelfish processes those features. The changes only affect how the features are classified in the report. Therefore, if you decide to ignore certain SQL features in the assessment report by overriding their classification, Babelfish will still process those features as usual, and their impact on your database may still need to be considered.
Import the assessment results to a PostgreSQL database for further analysis
In Babelfish Compass, the -pgimport
option is responsible for connecting to a PostgreSQL instance and loading captured items into a database table. When this option is run, Babelfish Compass creates pg_import.bat
(on Mac/Linux, pg_import.sh
) and pg_impoGenerating
an Assessment Report using Babelfish Compassrt.psql files in the captured
directory. The pg_import.bat
or pg_import.sh
file runs pg_import.psql
, which runs a CREATE TABLE and COPY statement in PostgreSQL. Babelfish Compass runs this function through a subprocess.
To establish a connection to the PostgreSQL instance, you must provide connection attributes on the command line, including the PostgreSQL user name and password. These attributes are not saved in any file but are supplied as environment variables in the short-lived subprocess. These environment variables are not accessible from outside the spawned subprocess, but it’s worth noting that the connection attributes might be accessible through the command-line history in the command-line session that runs Babelfish Compass.
Note that the user is responsible for owning the PostgreSQL instance and granting access to the uploaded data. To use the -pgimport
function, your system must have the PostgreSQL psql client installed and accessible through the PATH
. The default table for data import is public.BBFCompass
, but you can specify a different name using the -pgimporttable
option.
Table structure and its attribute details can be found in Babelfish Compass documentation. To better understand how the Babelfish Compass -pgimport
feature works, let’s walk through an example using a sample databases available on GitHub.
To follow along, you can download the sample database SQL file using the following curl command:
First, let’s run Babelfish Compass using the following command:
Next, import all captured items into the PostgreSQL database table:
To replace with your own database credentials, use the following format: host, port, username, password, dbname
. In this example, we use Amazon Aurora Serverless V2 (PostgreSQL-compatible edition). To create a serverless database, refer to Creating a cluster that uses Aurora Serverless v2.
Query example
To locate specific information, you can run SQL queries against the imported items. For instance, if you want to find all stored procedures that have at least two parameters and at least one of the parameters is of type MONEY
, you can use the following SQL query:
Conclusion
Babelfish Compass can help those considering migrating from SQL Server to Babelfish for Aurora PostgreSQL. It helps you quickly analyze T-SQL SQL/DDL scripts for compatibility with Babelfish, identifying supported and unsupported SQL features, and providing guidance on modifications needed during the migration process. In this post, we showed you the process of installing and generating a basic compatibility assessment report using Babelfish Compass for PostgreSQL. From there, we dove into more advanced topics such as uploading the analysis details to PostgreSQL, customizing the Compass reports by overriding default classifications, and grouping reports using BabelfishCompassUser.cfg
. Overall, Babelfish Compass offers a practical solution for anyone looking to migrate their SQL Server-based application to Amazon Aurora PostgreSQL-Compatible Edition using Babelfish.
For more information about the Babelfish Compass, see the Babelfish Compass documentation.
If you have any questions, comments, or suggestions, leave a comment below.
About the authors
Ramesh Kumar Venkatraman is a Senior Solutions Architect at AWS who is passionate about containers and databases. He works with AWS customers to design, deploy and manage their AWS workloads and architectures. In his spare time, he loves to play with his two kids and follows cricket.
Neha Gupta is a Solutions Architect at AWS and have 16 years of experience as a Database architect/ DBA. Apart from work, she’s outdoorsy and loves to dance.