AWS Database Blog

Federated query support for Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL – Part 2

This post is a continuation of Federated query support for Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL – Part 1. To recap, in 2003, the SQL standard added a new specification called SQL/MED (SQL Management of External Data). In 2011, PostgreSQL 9.1 was released with read-only support of this standard, and PostgreSQL 9.3 added write support in 2013. PostgreSQL implements SQL/MED through an interface called foreign data wrappers (FDW). There are several FDWs that let PostgreSQL connect to different external data sources, ranging from SQL database engines to flat files.

FDWs are helpful if you organization uses external data sources that you’d like to query from your PostgreSQL database. For example, you may want to analyze data using Amazon QuickSight and use an Amazon Relational Database Service (Amazon RDS) for PostgreSQL instance as the data source. If some of your data set resides in a separate Amazon RDS for MySQL database instance, you can use the MySQL FDW (mysql_fdw) from your RDS for PostgreSQL instance to keep the source database the same for Amazon QuickSight.

In part 1, we discussed how you can use one implementation of PostgreSQL FDW via the extension postgres_fdw to interact with data residing in other PostgreSQL-based databases from within Amazon Aurora PostgreSQL-Compatible Edition and Amazon RDS for PostgreSQL. In this post, we review other FDW extensions that let you use your Amazon RDS for PostgreSQL database instance to query MySQL-compatible engines, Microsoft SQL Server and Sybase engines that support the tabular data stream (TDS) protocol, and Oracle databases.

Solution overview

In June 2021, Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL compatible started supporting the oracle_fdw extension, which allows your PostgreSQL database to connect and retrieve data stored in separate Oracle databases.

In March 2022, Amazon RDS for PostgreSQL added support for mysql_fdw, which allows your PostgreSQL database to connect and retrieve data stored in separate databases in Amazon Aurora MySQL-Compatible Edition, Amazon RDS for MySQL, and Amazon RDS for MariaDB, as well as self-managed MySQL/MariaDB instances on Amazon Elastic Compute Cloud (Amazon EC2). Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL compatibility also added support for tds_fdw, which allows your PostgreSQL database to connect and retrieve data stored in separate SQL Server databases.

The mysql_fdw extension is supported on Amazon RDS for PostgreSQL version 14.2, 13.6 and higher releases while tds_fdw is supported by both Amazon RDS for PostgreSQL and Amazon Aurora with PostgreSQL compatibility.

The following table represents the extension name, supported DML and the versions supported for both RDS for PostgreSQL and Aurora PostgreSQL Compatible-edition.

Extension Name Supported DML RDS for PostgreSQL Support Aurora PostgreSQL-compatible Support
postgres_fdw Reads and writes All available versions All available versions
mysql_fdw Reads and writes 14.2, 13.6 and higher releases. Not supported
oracle_fdw Reads and writes 12.7 or higher releases. 12.7 or higher releases.
tds_fdw Reads only 14.2, 13.6 and higher releases. 14.2, 13.6 and higher releases.

In the following sections, we learn how to set up the three forms of FDWs and use them for tasks such as performing complex joins and aggregates on data residing in different data sources like MySQL-compatible, SQL Server, and Oracle databases with data residing in PostgreSQL to derive insights.

Prerequisites

Before getting started, make sure you have the following prerequisites:

  • A primary Aurora PostgreSQL-compatible or RDS for PostgreSQL instance as your source machine.
  • A remote database instance with information like user name, password, and database name. You can use any of the following databases:
    • MySQL (Aurora, Amazon RDS, or self-managed on Amazon EC2 or on premises) for mysql_fdw
    • SQL Server (Amazon RDS or self-managed on Amazon EC2 or on premises) for tds_fdw
    • Oracle (Amazon RDS or self-managed on Amazon EC2 or on premises) for oracle_fdw
  • Network connectivity between the primary and remote database. For more information, see How can I troubleshoot connectivity to an Amazon RDS DB instance that uses a public or private subnet of a VPC?
  • Tables that you query in the foreign (remote) server. For this post, we create one table in the source database pgfdwsource and one table in target database (MySQL, Oracle, or SQL Server).

In pgfdwsource database, create a salary table with the columns emailid and salary with dummy content. See the following script:

psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w

CREATE TABLE salary(emailid varchar, salary int);

INSERT INTO salary VALUES('abc@example.com',10000),('pqr@example.com',100000),('xyz@example.com',1000000);

For the target database, we create a table that corresponds to customer1 with the columns id, name, emailid, projectname, and contactnumber with dummy content. The following sections show how to set up this table in each database engine.

Setting up the MySQL database

For a target MySQL foreign database, use the following script:

mysql -h foreigndbmysql.xxxx.us-west-2.rds.amazonaws.com -u foreigndbmysql -p

CREATE DATABASE foreigndbmysql;

USE foreigndbmysql;

CREATE TABLE customer1(id int, name varchar(40), emailid varchar(40), projectname varchar(40), contactnumber bigint);

INSERT INTO customer1 VALUES(1,'abc','abc@example.com','Customer1.migration',328909432);

INSERT INTO customer1 VALUES(2,'pqr','pqr@example.com','Customer1.etl',2328909432);

INSERT INTO customer1 VALUES(3,'xyz','xyz@example.com','Customer1.infra',328909432);

Setting up the Oracle database

For a target Oracle foreign database, use the following script:

//Following script is Oracle compatible and has been run onto Oracle DB server using SQL Developer client with the following credentials:

Endpoint: oracledb.xxx.ap-south-1.rds.amazonaws.com

DB Name: ADMIN

Username: Admin

Password: <password>


CREATE TABLE customer1 (id NUMBER(10), name VARCHAR2(30), emailid VARCHAR2(30), projectname VARCHAR2(30), contactnumber NUMBER(20));

INSERT INTO customer1 VALUES (1,'abc','abc@example.com','Customer1.migration',328909432);

INSERT INTO customer1 VALUES (2,'pqr','pqr@example.com','Customer1.etl',2328909432);

INSERT INTO customer1 VALUES (3,'xyz','xyz@example.com','Customer1.infra',328909432);

Setting up the SQL Server database

For a target SQL Server foreign database, use the following script:

//Following script is sql server compatible and has been run onto Sql Sever DB server using SQL Server Management Studio with the following credentials:

Endpoint: sqlservertarget.xxxx.us-west-2.rds.amazonaws.com

DB Name: fdwtarget

Username: sqlservertarget

Password: <password> 

CREATE TABLE customer1( id int, name varchar(40), emailid varchar(40), projectname varchar(40), contactnumber bigint);

INSERT INTO customer1 VALUES(1,'abc','abc@example.com','Customer1.migration',328909432);

INSERT INTO customer1 VALUES (2,'pqr','pqr@example.com','Customer1.etl',2328909432);

INSERT INTO customer1 VALUES (3,'xyz','xyz@example.com','Customer1.infra',328909432);

In all tables, the column emailid is common and can be used to derive insights into the salary corresponding to the customer1 employee information.

In the upcoming sections of this post, we set up a connection in between the source and target database and run federated queries to derive insights from the data present in MySQL compatible, SQL Server, and Oracle with PostgreSQL database engine.

Connect to the source instance

All the steps in the following sections are performed after logging in with the role pgfdwsource into the primary database instance pgfdwsource.

You connect to your source PostgreSQL instance with a primary user or via a normal user that has rds_superuser permissions. You can use client tools like psql or pgAdmin. For more information, see Connecting to a DB instance running the PostgreSQL database engine. Create the extension mysql_fdw (for connecting with a MySQL foreign database), oracle_fdw (for connecting with an Oracle foreign database), and tds_fdw (for connecting with a SQL Server foreign database) with CREATE EXTENSION:

psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w

\conninfo //to show current connection information

\du pgfdwsource // to list all users in the current database server

CREATE EXTENSION IF NOT EXISTS mysql_fdw; //for connecting to MySQL foreign DB

CREATE EXTENSION IF NOT EXISTS oracle_fdw; //for connecting to Oracle foreign DB

CREATE EXTENSION IF NOT EXISTS tds_fdw; //for connecting to SQL Server foreign DB

\dx // to list all installed extensions

Create a foreign server

We use CREATE SERVER to create our foreign (remote) server to encapsulate connection information that an FDW uses to access an external data resource.

For a target MySQL foreign server, see the following script:

CREATE SERVER mysqldb FOREIGN DATA WRAPPER mysql_fdw OPTIONS (host 'foreigndbmysql.xxxx.us-west-2.rds.amazonaws.com', port '3306');

For a target Oracle foreign server, see the following script:

CREATE SERVER oradb FOREIGN DATA WRAPPER oracle_fdw OPTIONS (dbserver '//foreigndboracle.xxxx.ap-south-1.rds.amazonaws.com:1521/ADMIN');

For a target SQL Server foreign server, see the following script:

CREATE SERVER sqlserverdb FOREIGN DATA WRAPPER tds_fdw OPTIONS (servername 'sqlservertarget.xxxx.us-west-2.rds.amazonaws.com', port '1433', database 'fdwtarget');

By default, PostgreSQL only lets a privileged account create and access a foreign server. In Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition, this is any user member of the rds_superuser role. To allow an unprivileged user to access data on a foreign server, you must explicitly grant the USAGE privilege on that foreign server to the user. The following command provides an example for how to grant usage on a foreign server to an unprivileged user:

GRANT USAGE ON FOREIGN SERVER my_fdw_target TO normal_user;

Create a user mapping

CREATE USER MAPPING defines a mapping of a user to a foreign server. A user mapping encapsulates connection information that a foreign-data wrapper uses with the information from a foreign server to connect to an external data resource.

The owner of a foreign server can create user mappings for that server for any user.

In the following script, when the user pgfdwsource is connecting to the foreign server (remote database), they use the login information provided in the user mapping.

For a target MySQL foreign server, see the following script:

CREATE USER MAPPING FOR pgfdwsource SERVER mysqldb OPTIONS (username 'foreigndbmysql', password '<password>');

For a target Oracle foreign server, see the following script:

CREATE USER MAPPING FOR pgfdwsource SERVER oradb OPTIONS (user 'admin', password '<password>');

For a target SQL Server foreign server, see the following script:

CREATE USER MAPPING FOR pgfdwsource SERVER sqlserverdb OPTIONS (username 'sqlservertarget', password '<password>');

Create a foreign table

CREATE FOREIGN TABLE creates a foreign table in the source database that is mapped to the table in the foreign server. For example, in the following script, the foreign table customer1_fdw points to the table customer1, which resides in the remote database. When you need to query data from table customer1 in the remote database, the user in the local database queries using the foreign table customer1_fdw.

For a target MySQL foreign server, see the following script:

CREATE FOREIGN TABLE mysqltab (id int, name varchar, pgfdwsource(emailid varchar, projectname varchar, contactnumber bigint) SERVER mysqldb OPTIONS (dbname 'foreigndbmysql', table_name 'customer1');

For a target Oracle foreign server, see the following script:

CREATE FOREIGN TABLE oracletab ( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint) SERVER oradb OPTIONS (table 'CUSTOMER1');

For a target SQL Server foreign server, see the following script:

CREATE FOREIGN TABLE sqlservertab ( id int, name varchar, emailid varchar, projectname varchar, contactnumber bigint) SERVER sqlserverdb OPTIONS (table 'customer1');

Furthermore, the name of the foreign table or table reference can be different from the name of the table in the remote database. This flexibility can be useful when you want to make the database name consistent for the application that’s accessing the foreign table from the PostgreSQL database.

Run federated queries with FDW

mysql_fdw and oracle_fdw allow you to run SELECT, INSERT, UPDATE, and DELETE statements in a remote MySQL or Oracle database. tds_fdw allows you to run only SELECT statements on a remote SQL Server database. Unlike other PostgreSQL foreign data wrappers, tds_fdw doesn’t support write mode, and can only read from the mapped foreign table within SQL Server. Alternatively, if you are working with Amazon Aurora PostgreSQL-Compatible Edition, you can work with Babelfish for Aurora PostgreSQL. With Babelfish, Amazon Aurora PostgreSQL understands T-SQL, Microsoft SQL Server’s proprietary SQL dialect, and supports the same communications protocol, so your application that were originally written for SQL Server can work with Aurora with fewer code changes. Refer to Get started with Babelfish for Aurora PostgreSQL to learn more.

With the preceding steps complete, you can now query the table customer1, which resides in the corresponding foreign server using the appropriate FDW, from within the instance pgfdwsource.xxxx.us-west-2.rds.amazonaws.com, without the need for replication or any kind of pipeline. See the following script:

For a target MySQL foreign server, see the following script:

psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w

-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance.

SELECT * FROM mysqltab;

id | name | emailid | projectname | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc | abc@example.com | Customer1.migration | 328909432

2 | pqr | pqr@example.com | Customer1.etl | 2328909432

3 | xyz | xyz@example.com | Customer1.infra | 328909432

You can also run join queries to perform aggregates (and more) to derive insights.

-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance while joining it with salary table to derive salary insights.

SELECT * FROM salary s INNER JOIN mysqltab c ON s.emailid=c.emailid;

emailid | salary | id | name | emailid | projectname | contactnumber
---------------------+---------+----+-------+---------------------+---------------------+---------------
pqr@example.com | 100000 | 2 | pqr | pqr@example.com | Customer1.etl | 2328909432

xyz@example.com | 1000000 | 3 | xyz | xyz@example.com | Customer1.infra | 328909432

abc@example.com | 10000 | 1 | abc | abc@example.com | Customer1.migration | 328909432

In the following script, only the data for a particular project is queried. The project name details are stored in the remote server. In the explain plan of the query, the WHERE clause has been pushed down to the remote server. This ensures that only the data required to realize the result of the query is retrieved from the remote server and not the entire table. This is one of the optimizations that FDW offers that essentially reduces the data transferred over the network as well.

SELECT * FROM salary s INNER JOIN mysqltab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';

emailid | salary | id | name | emailid | projectname | contactnumber
---------------------+--------+----+-------+---------------------+---------------+---------------
pqr@example.com | 100000 | 2 | pqr | pqr@example.com | Customer1.etl | 2328909432
EXPLAIN VERBOSE SELECT * FROM salary s INNER JOIN mysqltab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=1163.00..1263.25 rows=6350 width=144)
Output: s.emailid, s.salary, c.id, c.name, c.emailid, c.projectname, c.contactnumber
Merge Cond: ((c.emailid)::text = (s.emailid)::text)
-> Sort (cost=1074.83..1077.33 rows=1000 width=108)
Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber
Sort Key: c.emailid
-> Foreign Scan on public.mysqltab c (cost=25.00..1025.00 rows=1000 width=108)
Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber
Remote server startup cost: 25
Remote query: SELECT `id`, `name`, `emailid`, `projectname`, `contactnumber` FROM `foreigndbmysql`.`customer1` WHERE ((`projectname` LIKE BINARY 'Customer1.etl'))
-> Sort (cost=88.17..91.35 rows=1270 width=36)
Output: s.emailid, s.salary
Sort Key: s.emailid
-> Seq Scan on public.salary s (cost=0.00..22.70 rows=1270 width=36)
Output: s.emailid, s.salary

For a target Oracle foreign server, see the following script:

psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w

-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance.

SELECT * FROM oracletab;

id | name  |       emailid       |     projectname     | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc   | abc@example.com   | Customer1.migration |     328909432

2 | pqr | pqr@example.com | Customer1.etl       |    2328909432

3 | xyz  | xyz@example.com  | Customer1.infra     |     328909432

You can also run join queries to perform aggregates (and more) to derive insights.

-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance while joining it with salary table to derive salary insights.

SELECT * FROM salary s INNER JOIN oracletab c ON s.emailid=c.emailid;

emailid       | salary  | id | name  |       emailid       |     projectname     | contactnumber
---------------------+---------+----+-------+---------------------+---------------------+---------------
abc@example.com   |   10000 |  1 | abc   | abc@example.com   | Customer1.migration |     328909432

pqr@example.com |  100000 |  2 | pqr | pqr@example.com | Customer1.etl       |    2328909432

xyz@example.com  | 1000000 |  3 | xyz | xyz@example.com  | Customer1.infra     |     328909432

In the following script, only the data for a particular project is queried. The project name details are stored in the remote server. In the explain plan of the query, the WHERE clause has been pushed down to the remote server. This makes sure that only the data required to realize the result of the query is retrieved from the remote server and not the entire table. This is one of the optimizations that FDW offers that essentially reduces the data transferred over the network as well.

SELECT * FROM salary s INNER JOIN oracletab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';

emailid       | salary | id | name  |       emailid       |  projectname  | contactnumber
---------------------+--------+----+-------+---------------------+---------------+---------------
pqr@example.com | 100000 |  2 | pqr | pqr@example.com | Customer1.etl |    2328909432
EXPLAIN VERBOSE SELECT * FROM salary s INNER JOIN oracletab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Join  (cost=10038.58..20263.33 rows=6350 width=196)
Output: s.emailid, s.salary, c.id, c.name, c.emailid, c.projectname, c.contactnumber
Hash Cond: ((c.emailid)::text = (s.emailid)::text)
->  Foreign Scan on public.oracletab c  (cost=10000.00..20000.00 rows=1000 width=160)
Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber
Oracle query: SELECT /*2eb7c210945975c7fe1e83823a7c9e3f*/ r2."ID", r2."NAME", r2."EMAILID", r2."PROJECTNAME", r2."CONTACTNUMBER" FROM "ADMIN"."CUSTOMER1" r2 WHERE (r2."PROJECTNAME" LIKE 'Customer1.etl' ESCAPE '\')
         Oracle plan: SELECT STATEMENT
         Oracle plan:   TABLE ACCESS FULL CUSTOMER1  (filter "R2"."PROJECTNAME" LIKE 'Customer1.etl' ESCAPE )
->  Hash  (cost=22.70..22.70 rows=1270 width=36)
Output: s.emailid, s.salary
->  Seq Scan on public.salary s  (cost=0.00..22.70 rows=1270 width=36)
Output: s.emailid, s.salary

For a target SQL Server foreign server, see the following script:

psql -h pgfdwsource.xxxx.us-west-2.rds.amazonaws.com -d pgfdwsource -U pgfdwsource -w

-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance.

SELECT * FROM sqlservertab ;

NOTICE: tds_fdw: Query executed correctly

NOTICE: tds_fdw: Getting results

id | name | emailid | projectname | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc | abc@example.com | Customer1.migration | 328909432

2 | pqr | pqr@example.com | Customer1.etl | 2328909432

3 | xyz | xyz@example.com | Customer1.infra | 328909432
-- running select * on foreign table customer1 mapped to name customer1_fdw in source instance while joining it with salary table to derive salary insights.

You can also run join queries to perform aggregates (and more) to derive insights.

SELECT * FROM salary s INNER JOIN sqlservertab c ON s.emailid=c.emailid;
NOTICE: tds_fdw: Query executed correctly

NOTICE: tds_fdw: Getting results

emailid | salary | id | name | emailid | projectname | contactnumber
---------------------+---------+----+-------+---------------------+---------------------+---------------
abc@example.com | 10000 | 1 | abc | abc@example.com | Customer1.migration | 328909432

pqr@example.com | 100000 | 2 | pqr | pqr@example.com | Customer1.etl | 2328909432

xyz@example.com | 1000000 | 3 | xyz | xyz@example.com | Customer1.infra | 328909432

In the following script, only the data for a particular project is queried. The project name details are stored in the remote server. In the explain plan of the query, the WHERE clause has been pushed down to the remote server. This makes sure that only the data required to realize the result of the query is retrieved from the remote server and not the entire table. This is one of the optimizations that FDW offers that essentially reduces the data transferred over the network as well.

SELECT * FROM salary s INNER JOIN sqlservertab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';
NOTICE: tds_fdw: Query executed correctly

NOTICE: tds_fdw: Getting results

emailid | salary | id | name | emailid | projectname | contactnumber
---------------------+--------+----+-------+---------------------+---------------+---------------
pqr@example.com | 100000 | 2 | pqr | pqr@example.com | Customer1.etl | 2328909432
EXPLAIN VERBOSE SELECT * FROM salary s INNER JOIN sqlservertab c ON s.emailid=c.emailid WHERE c.projectname LIKE 'Customer1.etl';

NOTICE: tds_fdw: Query executed correctly

NOTICE: tds_fdw: Getting results

QUERY PLAN
-------------------------------------------------------------------------------------------
Hash Join (cost=200.02..227.54 rows=6 width=144)
Output: s.emailid, s.salary, c.id, c.name, c.emailid, c.projectname, c.contactnumber
Hash Cond: ((s.emailid)::text = (c.emailid)::text)
-> Seq Scan on public.salary s (cost=0.00..22.70 rows=1270 width=36)
Output: s.emailid, s.salary
-> Hash (cost=200.01..200.01 rows=1 width=100)
Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber
-> Foreign Scan on public.sqlservertab c (cost=200.00..200.01 rows=1 width=100)
Output: c.id, c.name, c.emailid, c.projectname, c.contactnumber

You can pull data from different database engines by using the appropriate FDW extension, as seen in the preceding examples. Because the data is pulled in real time from a foreign database, you can achieve this without needing to set up any kind of replication or reserving space in the source database.

Additional capabilities

In addition to SELECT, mysql_fdw and oracle_fdw allows you to run UPDATE, INSERT, and DELETE on a foreign table.

For a target MySQL foreign server, see the following script:

EXPLAIN UPDATE mysqltab SET name='def' WHERE name='xyz';

QUERY PLAN
---------------------------------------------------------------------------
Update on mysqltab (cost=25.00..1025.00 rows=1000 width=112)
-> Foreign Scan on mysqltab (cost=25.00..1025.00 rows=1000 width=112)
Remote server startup cost: 25
UPDATE mysqltab SET name='def' WHERE name='xyz';

SELECT * FROM mysqltab ;

id | name | emailid | projectname | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc | abc@example.com | Customer1.migration | 328909432

2 | pqr | pqr@example.com | Customer1.etl | 2328909432

3 | def | xyz@example.com | Customer1.infra | 328909432

Note that you receive an error if a column with the UNIQUE constraint isn’t defined at the remote MySQL server.

For a target Oracle foreign server, see the following script:

EXPLAIN UPDATE oracletab SET name='xyz' WHERE name='def';

ERROR: no primary key column specified for foreign Oracle table

DETAIL: For UPDATE or DELETE, at least one foreign table column must be marked as primary key column.

HINT: Set the option "key" on the columns that belong to the primary key.

//Oracle foreign tables need to be defined with foreign key in order for the update, inserts to work.

DROP FOREIGN TABLE oracletab ;

CREATE FOREIGN TABLE oracletab ( id int OPTIONS(key 'true') NOT NULL, name varchar, emailid varchar, projectname varchar, contactnumber bigint) SERVER oradb OPTIONS (table 'CUSTOMER1');

SELECT * FROM oracletab;

id | name | emailid | projectname | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc | abc@example.com | Customer1.migration | 328909432

2 | pqr | pqr@example.com | Customer1.etl | 2328909432

3 | xyz | xyz@example.com | Customer1.infra | 328909432
EXPLAIN UPDATE oracletab SET name='xyz' WHERE name='def';

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on oracletab (cost=10000.00..20000.00 rows=1000 width=112)
Oracle statement: UPDATE "CUSTOMER1" SET "NAME" = :p2 WHERE "ID" = :k1
-> Foreign Scan on oracletab (cost=10000.00..20000.00 rows=1000 width=112)
Oracle query: SELECT /*4ca441b6af055f008c188ebd629c49da*/ r1."ID", r1."NAME", r1."EMAILID", r1."PROJECTNAME", r1."CONTACTNUMBER" FROM "CUSTOMER1" r1 WHERE (r1."NAME" = 'def') FOR UPDATE
UPDATE oracletab SET name='xyz' WHERE name='def';

SELECT * FROM oracletab;

id | name | emailid | projectname | contactnumber
----+-------+---------------------+---------------------+---------------
1 | abc | abc@example.com | Customer1.migration | 328909432

2 | pqr | pqr@example.com | Customer1.etl | 2328909432

3 | xyz | xyz@example.com | Customer1.infra | 328909432

For a SQL Server foreign server or tds_fdw, you can only do SELECT operations on foreign server. If you attempt to run an insert or update on the foreign table, it fails with the following error:

INSERT INTO sqlservertab VALUES (4,'uvw','uvw@example.com','Customer1.finance',3929299);

ERROR: cannot insert into foreign table “sqlservertab”

UPDATE sqlservertab SET name='xyz' WHERE name='def';
ERROR: DB-Library error: DB #: 20018, DB Msg: General SQL Server error: Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 15

FDWs come with optimizations, such as maintaining the table schema locally. As a result, rather than doing a SELECT * to pull all the data from a table in remote server, the FDW sends query WHERE clauses to the remote server to run, and doesn’t retrieve table columns that aren’t needed for the current query. It can also push down joins, aggregates, sorts, and more.

Clean up

You can follow the steps in this section to clean up the resources created in previous steps:

For a target MySQL foreign server, see the following script:

  1. Drop the foreign table:
    DROP FOREIGN TABLE mysqltab;
  2. Drop the user mapping that maps the database user to the foreign server user:
    DROP USER MAPPING FOR pgfdwsource SERVER mysqldb;
  3. Drop the foreign server definition that provides the local PostgreSQL server with the foreign (remote) server connection information:
    DROP SERVER mysqldb;
  4. Drop the extension:
    DROP EXTENSION mysql_fdw;

For a target Oracle foreign server, see the following script:

  1. Drop the foreign table:
    DROP FOREIGN TABLE oracletab;
  2. Drop the user mapping that maps the database user to the foreign server user:
    DROP USER MAPPING FOR pgfdwsource SERVER oradb;
  3. Drop the foreign server definition that provides the local PostgreSQL server with the foreign (remote) server connection information:
    DROP SERVER oradb;
  4. Drop the extension:
    DROP EXTENSION oracle_fdw;

For a target SQL Server foreign server, see the following script:

  1. Drop the foreign table:
    DROP FOREIGN TABLE sqlservertab;
  2. Drop the user mapping that maps the database user to the foreign server user:
    DROP USER MAPPING FOR pgfdwsource SERVER sqlserverdb;
  3. Drop the foreign server definition that provides the local PostgreSQL server with the foreign (remote) server connection information:
    pgfdwsource=> DROP SERVER sqlserverdb;
  4. Drop the extension:
    DROP EXTENSION tds_fdw;

If you do not need to use the Amazon Aurora and Amazon RDS instances that you created for these examples, we recommend that you delete them.

Summary

The FDWs discussed in this post are quite powerful; this post covers just the basic functionality. For more information about these optimizations, see the following GitHub repos:

This two-part series showed what you can achieve with federated query capabilities and the foreign data wrappers postgres_fdw, mysql_fdw, oracle_fdw, and tds_fdw in Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-compatible edition to query data from a PostgreSQL-based, MySQL, Oracle, or SQL Server remote server, both managed and self-managed on Amazon EC2 or on premises. Query federation is a powerful functionality that can be used in real time for running analytics queries on external databases, migrating data from external database to PostgreSQL or for developing basic ETL pipelines.

If you have questions or suggestions on the content covered in this post, leave them in the comments section.


About the Authors


Vibhu Pareek
is a Solutions Architect at AWS. He joined AWS in 2016 and specializes in providing guidance on cloud adoption through the implementation of well architected, repeatable patterns and solutions that drive customer innovation. He has keen interest in open source databases like PostgreSQL. In his free time, you’d find him enjoying a game of football.


Asmita Varma
is a Technical Account Manager with Amazon Web Services. She helps customers navigate the challenges and complexities of the AWS Cloud to attain operational excellence. Her focus area has been Databases.


Chirag Tyagi
is a Senior Solutions Architect at amazon web services. He’s passionate about helping customers achieve their business objectives with focus on Databases and Analytics.