Front-End Web & Mobile

Integrating GraphQL and SQL with AWS AppSync, AWS Lambda and Amazon RDS Proxy

This article was written by Eduardo Rabelo, Solutions Architect, AWS

Organizations choose to build APIs because it helps them develop secure applications faster as well as abstract and protect their backend services and databases. GraphQL is an API technology that empowers developers to query multiple databases, microservices, and APIs with a single endpoint in a single call. AWS AppSync is a fully managed serverless service that makes it easy to develop GraphQL APIs in the cloud. It handles the heavy lifting of managing a complex GraphQL backend infrastructure, securely connecting to data sources, adding caches to improve performance, subscriptions to support real-time updates, and client-side data stores that keep offline clients in sync.

GraphQL provides a flexible typed data query language for APIs as well as a runtime to fulfill these queries. If you’re trying to access SQL backends from your application, it can be challenging to implement SQL queries on the client side. With GraphQL, developers can take advantage of flexible querying capabilities with no need to write a single SQL query directly in the front-end. Developers are free from hardcoded SQL statements and confusing string manipulation. GraphQL helps to increase flexibility and agility in your business and development lifecycle.

SQL databases are the best choice when there’s a requirement to access relational data. However many applications, including those built on modern and scalable serverless architectures using AWS Lambda, AWS Fargate, Amazon ECS, or Amazon EKS, can start a large number of connections to the SQL database, opening and closing database connections at a high rate, thus exhausting database memory and compute resources. The Amazon RDS Proxy is a fully managed and highly available database proxy for the Amazon Relational Database Service (RDS), allowing applications to pool and share connections established with the database, improving database efficiency and application scalability. The RDS Proxy helps making applications more scalable, secure, and resilient to database failures.

With AppSync and RDS Proxy we can build a flexible modern serverless application architecture backed by a SQL database and fully managed by AWS. Leveraging the GraphQL info object available in AppSync resolvers we can implement fine-tuned business logic for GraphQL queries. It allows to map the correct data sources to the requested fields with a single resolver by inspecting the detailed information about the query itself. In this article, we show you how to build a flexible and scalable GraphQL API backend leveraging AppSync, Lambda, and the RDS Proxy to connect to a PostgreSQL database in RDS:

 

 

 

Setting up a RDS PostgreSQL database

Before we start there are some requirements you need to have already in place in your AWS account:

  • a VPC and private subnets for the RDS Database and Proxy
  • a security group allowing  access to TCP port 5432 for PostgreSQL
  • a NAT Gateway in place in a public subnet
  • the RDS instance must be configured with the PostgreSQL 11.10-R1 engine in the RDS Console

You can find more information on how to create and connect to a PostgreSQL database on RDS following this tutorial.

After the RDS database is in place with the required VPC setup, we execute the following SQL commands to create tables and populate them with sample data using your SQL client of choice:

CREATE TABLE Parking (
  id varchar(255) primary key,
  name varchar(80),
  price int,
  car_id varchar(255)
);
CREATE TABLE Car (
  id varchar(255) primary key,
  color varchar(80),
  parking_id varchar(255)
);
ALTER TABLE Parking ADD CONSTRAINT fk_car_id
FOREIGN KEY (car_id) REFERENCES Car(id);
ALTER TABLE Car ADD CONSTRAINT fk_parking_id
FOREIGN KEY (parking_id) REFERENCES Parking(id);
INSERT INTO Car (id, color) VALUES('CAR00', 'Blue');
INSERT INTO Car (id, color) VALUES ('CAR01', 'Red');
INSERT INTO Car (id, color) VALUES ('CAR02', 'Orange');
INSERT INTO Parking (id, name, price) VALUES('PRK00', 'East', 100);
INSERT INTO Parking (id, name, price) VALUES ('PRK01', 'West', 200);
INSERT INTO Parking (id, name, price) VALUES ('PRK02', 'North', 300);
UPDATE Car SET parking_id = 'PRK00' WHERE id = 'CAR00';
UPDATE Car SET parking_id = 'PRK01' WHERE id = 'CAR01';
UPDATE Car SET parking_id = 'PRK02' WHERE id = 'CAR02';
UPDATE Parking SET car_id = 'CAR00' WHERE id = 'PRK00';
UPDATE Parking SET car_id = 'CAR01' WHERE id = 'PRK01';
UPDATE Parking SET car_id = 'CAR02' WHERE id = 'PRK02';

 

RDS Proxy pre-requisites

The RDS Proxy requires:

  • A secret in AWS Secrets Manager with the RDS instance credentials
  • An IAM role with permissions to read the secret
  • Access to TCP port 5432

In order to create a secret, go to the AWS Secrets Manager Console and click “Store a new secret“. Select “Credentials for RDS database“, providing the user name and password defined in the previous step in addition to the database details. Take note of the secret’s ARN, we need it for the IAM role we create next.

We need a new IAM role with permissions to read our secret. In the AWS IAM Console create a role with the following policy:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": "secretsmanager:GetSecretValue",
        "Resource": [
          "YOUR_SECRETS_ARN"
        ]
      }
    ]
  }

 

As far as networking is concerned, as mentioned before, a VPC security group needs to allow access to the appropriate port (TCP 5432) in the database’s subnet.

 

RDS Proxy creation

We are now ready to create our Amazon RDS Proxy! Back to the RDS Console, select the “Proxies” section then click “Create proxy“.

In the “Proxy Configuration” section, define a name for your Proxy, select PostgreSQL as engine, and enable “Require Transport Layer Security“. Select the database previously created as well as the Secrets Manager secret, the IAM role, and appropriate VPC subnets and security groups in the next sections.

For step by step instructions on how to configure the RDS Proxy, refer to the launch blog post or this detailed tutorial.

The new RDS Proxy will be ready a few minutes later. Take note of the RDS Proxy endpoint URL.

 

GraphQL API in AppSync

Next we create an AppSync API called graphql2sql. In the AppSync Console, click “Create API“. Next in the “Getting Started” page, select “Build from scratch” and click “Start“.

 

After the API is created, select “Schema” in the sidebar and paste the following GraphQL schema:

schema {
    query: Query
    mutation: Mutation
}

type Car {
    id: ID!
    color: String!
    parking: Parking
}

type Parking {
    id: ID!
    name: String!
    car: Car!
    price: Int!
}

input InputCreateCar {
    color: String!
}

type Query {
    getParking(id: ID!): Parking
}

type Mutation {
    createCar(input: InputCreateCar): Car
}

 

There’s a 1:1 relationship between the Parking and the Car types and a 1:1 relationship between the Car and the Parking types. It means that a car has a parking spot associated with it and vice versa. It reflects the relationships defined in our SQL tables.

By default API access is authorized with API Keys, you can customize authorization and access with different available authorization modes.

Now we have our serverless GraphQL API, the next step is to configure a data source. We leverage Direct Lambda Resolvers in AWS AppSync. Before we do so we need to create a Lambda function first.

 

AWS Lambda setup

In the Lambda Console, we create a blank Node.js 14.x function with basic permissions called getParking to connect to the RDS Proxy. The function must be able to access the same VPC subnets and security groups as the RDS instance. Before creating the function configure the appropriate VPC access accordingly under “Advanced Settings” in the “Create function” page . In order to connect to the RDS Proxy, the function will need to be in a security group with access to the TCP port 5432. Lambda also needs internet access to securely call the Secrets Manager regional endpoints, which means the VPC needs to be configured with a NAT Gateway.

Now we need to enable our lambda function to fetch the credentials to connect to the RDS proxy. In the function “Permissions” tab, under the “Execution role” section, click the “Role name” to be directed to the role settings in IAM. Click on “Add inline policy” on the right side and, in the “Crate policy” page, add a policy as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": [
        "YOUR_SECRETS_ARN"
      ]
    }
  ]
}

 

Make sure that the secret ARN is replaced accordingly. You should see the new inline policy attached to the IAM role:

 

Back to the Lambda console, we start with the following code snippet:

let AWS = require("aws-sdk");
let postgres = require("postgres");

let secretsManager = new AWS.SecretsManager();

let { SM_EXAMPLE_DATABASE_CREDENTIALS, URL_RDS_PROXY } = process.env;

async function handler(event) {
  let sm = await secretsManager.getSecretValue({ SecretId: SM_EXAMPLE_DATABASE_CREDENTIALS }).promise();
  let credentials = JSON.parse(sm.SecretString);

  let connectionConfig = {
    host: URL_RDS_PROXY,
    port: credentials.port,
    username: credentials.username,
    password: credentials.password,
    database: credentials.dbname,
    ssl: true,
  };

  let sql = postgres(connectionConfig);
  let payload = {};

  // ... more code

  await sql.end({ timeout: 0 });

  return payload;
}

module.exports = { handler };

 

Add two environment variables to the function:

  • SM_EXAMPLE_DATABASE_CREDENTIALS with the secret name/ID in Secrets Manager as value and,
  • URL_RDS_PROXY with the RDS Proxy address

We are using Node.js as the runtime for our Lamba function, fetching Amazon RDS credentials from AWS Secrets Manager and connecting to the Amazon RDS Proxy (e.g. URL_RDS_PROXY) using the npm package postgres over a TLS connection required by the proxy (e.g. ssl: true).

The payload object returns all fields and values required by our query.

In the following sections we walk through the Lambda function code, breaking it down into smaller code snippets to focus on specific business logic. The full code is provided at the end, you can follow the instructions in our documentation to package the NPM dependencies and deploy the function accordingly.

 

The anatomy of an AppSync GraphQL Query to Lambda

The following GraphQL query from AppSync:

query getParking {
  getParking(id: "PRK00") {
    id
    name
    car {
      id
      color
      parking {
        id
        name
        price
      }
    }
  }
}

 

Generates an event sent from AppSync to the Lambda resolver:

{
  "arguments": {
    "id": "PRK00"
  },
  "identity": null,
  "source": null,
  "request": {},
  "prev": null,
  "info": {
    "selectionSetList": [
      "id",
      "name",
      "car",
      "car/id",
      "car/color",
      "car/parking",
      "car/parking/id",
      "car/parking/name",
      "car/parking/price"
    ],
    "selectionSetGraphQL": "...",
    "parentTypeName": "Query",
    "fieldName": "getParking",
    "variables": {}
  },
  "stash": {}
}

 

With that information in hand, we can read the parking record using event.arguments.id:

// ...
async function handler(event) {
  // ...

  let sql = postgres(connectionConfig);
  let payload = {};

  let parkingColumns = event.info.selectionSetList.filter((item) => !item.startsWith("car"));
  let [parking] = await sql`SELECT ${sql(parkingColumns)} FROM Parking WHERE id = ${event.arguments.id}`;

  payload = { ...parking };

  // ... more code

  await sql.end({ timeout: 0 });

  return payload;
}

module.exports = { handler };

 

We know that the car field in the type Parking is related to a record of a different type. In parkingColumns we are excluding all car-related fields, building a performant SQL query without using a * wildcard. Now let’s do the same for the car record:

// ...
async function handler(event) {
  // ...

  let sql = postgres(connectionConfig);
  let payload = {};

  let parkingColumns = event.info.selectionSetList.filter((item) => !item.startsWith("car"));
  let [parking] = await sql`SELECT ${sql(parkingColumns)} FROM Parking WHERE id = ${event.arguments.id}`;

  payload = { ...parking };

  if (event.info.selectionSetList.some((item) => item === "car")) {
    let carColumns = event.info.selectionSetList.filter((item) => item.startsWith("car/") && !item.includes("parking")).map((item) => item.split("/")[1]);
    let [car] = await sql`SELECT ${sql(carColumns)} FROM Car WHERE parking_id = ${parking.id}`;

    payload.car = { ...car };
  }

  // ... more code

  await sql.end({ timeout: 0 });
  
  return payload;
}

module.exports = { handler };

 

First, we check if there’s a nested field car in the GraphQL selection set in the selectionSetList array. If it exists, we filter all car-related fields in carColumns, excluding the nested type parking within car.

With the primary records parking and car retrieved from the database, let’s tackle the recursive car/parking query.

// ...
async function handler(event) {
  // ...

  let sql = postgres(connectionConfig);
  let payload = {};

  let parkingColumns = event.info.selectionSetList.filter((item) => !item.startsWith("car"));
  let [parking] = await sql`SELECT ${sql(parkingColumns)} FROM Parking WHERE id = ${event.arguments.id}`;

  payload = { ...parking };

  if (event.info.selectionSetList.some((item) => item === "car")) {
    let carColumns = event.info.selectionSetList.filter((item) => item.startsWith("car/") && !item.includes("parking")).map((item) => item.split("/")[1]);
    let [car] = await sql`SELECT ${sql(carColumns)} FROM Car WHERE parking_id = ${parking.id}`;

    payload.car = { ...car };
  }

  if (event.info.selectionSetList.some((item) => item === "car/parking")) {
    let carParkingColumns = event.info.selectionSetList.filter((item) => item.startsWith("car/parking/")).map((item) => item.split("/")[2]);

    if (carParkingColumns.every((col) => parking[col])) {
      payload.car.parking = {};
      carParkingColumns.forEach((col) => {
        payload.car.parking[col] = parking[col];
      });
    } else {
      let parkingExtraColumns = carParkingColumns.filter((item) => !parkingColumns.includes(item));
      let [parkingExtraFields] = await sql`SELECT ${sql(parkingExtraColumns)} FROM Parking WHERE id = ${event.arguments.id}`;
      payload.car.parking = { ...parking, ...parkingExtraFields };
    }
  }

  await sql.end({ timeout: 0 });

  return payload;
}

module.exports = { handler };

 

Once again, we check if there’s a nested field car/parking in the selectionSetList array. If it exists, we filter all car-parking-related fields in carParkingColumns.

Before reading anything from the database, we check if all carParkingColumns fields exist in the retrieved parking record. If they are present in the variable, we copy their values to the nested object payload.car.parking, which saves a round trip to the database.

Otherwise, we filter which fields are not present in the parkingColumns variable in the parkingExtraColumns array, making a tidy SQL query with only the missing fields instead of the whole record. In the example above, parkingExtraColumns is equal to the car-parking nested field price.

 

Creating data with GraphQL Mutations

With the info object support in AppSync, we can inspect what kind of GraphQL operation clients are requesting and execute it accordingly.

In order to create data, let’s update our Lambda code to support GraphQL mutation requests:

// ...
async function handler(event) {
  // ...

  let sql = postgres(connectionConfig);

  if(event.info.parentTypeName === 'Mutation') {
    let newCar = event.arguments.input;
    newCar.id = `CAR${Date.now()}${Math.random().toString(16).slice(2)}`;
    let [car] = await sql`INSERT INTO Car ${sql(newCar)} RETURNING ${sql(Object.keys(newCar))}`;
    return car;
  }

  // ...
}

module.exports = { handler };

 

Our serverless function is ready, it can handle recursive read requests in addition to writing data to the backend SQL database.

You can check the full version of our AWS Lambda code below:

let AWS = require("aws-sdk");
let postgres = require("postgres");

let secretsManager = new AWS.SecretsManager();

let { SM_EXAMPLE_DATABASE_CREDENTIALS, URL_RDS_PROXY } = process.env;

async function handler(event) {
  let sm = await secretsManager.getSecretValue({ SecretId: SM_EXAMPLE_DATABASE_CREDENTIALS }).promise();
  let credentials = JSON.parse(sm.SecretString);

  let connectionConfig = {
    host: URL_RDS_PROXY,
    port: credentials.port,
    username: credentials.username,
    password: credentials.password,
    database: credentials.dbname,
    ssl: true,
  };

  let sql = postgres(connectionConfig);

  if (event.info.parentTypeName === "Mutation") {
    let newCar = event.arguments.input;
    newCar.id = `CAR${Date.now()}${Math.random().toString(16).slice(2)}`;
    let [car] = await sql`INSERT INTO Car ${sql(newCar)} RETURNING ${sql(Object.keys(newCar))}`;
    return car;
  }

  let payload = {};

  let parkingColumns = event.info.selectionSetList.filter((item) => !item.startsWith("car"));
  let [parking] = await sql`SELECT ${sql(parkingColumns)} FROM Parking WHERE id = ${event.arguments.id}`;

  payload = { ...parking };

  if (event.info.selectionSetList.some((item) => item === "car")) {
    let carColumns = event.info.selectionSetList.filter((item) => item.startsWith("car/") && !item.includes("parking")).map((item) => item.split("/")[1]);
    let [car] = await sql`SELECT ${sql(carColumns)} FROM Car WHERE parking_id = ${parking.id}`;

    payload.car = { ...car };
  }

  if (event.info.selectionSetList.some((item) => item === "car/parking")) {
    let carParkingColumns = event.info.selectionSetList.filter((item) => item.startsWith("car/parking/")).map((item) => item.split("/")[2]);

    if (carParkingColumns.every((col) => parking[col])) {
      payload.car.parking = {};
      carParkingColumns.forEach((col) => {
        payload.car.parking[col] = parking[col];
      });
    } else {
      let parkingExtraColumns = carParkingColumns.filter((item) => !parkingColumns.includes(item));
      let [parkingExtraFields] = await sql`SELECT ${sql(parkingExtraColumns)} FROM Parking WHERE id = ${event.arguments.id}`;
      payload.car.parking = { ...parking, ...parkingExtraFields };
    }
  }

  await sql.end({ timeout: 0 });

  return payload;
}

 

In order to deploy the function, create a folder called graphql2sql in your local computer and create a new file called index.js with the function code above. Install the postgres module with NPM, zip the contents of the folder and upload the zip file to replace the existing code in Lambda.

 

Adding the Lambda Data Source in AppSync

Back to the AppSync Console, select the graphql2sql API, and click on “Data Sources” in the sidebar. We link our Lambda function as a data source:

 

Next we need to link the Lambda data source to the related GraphQL operations. Back to the sidebar, click “Schema” and type “Query” in the search bar. Next to the getParking query, click on “Attach”:

 

In the “Create new Resolver” page, select the Lambda data source and confirm request and response mapping templates are disabled. It tells AppSync to bypass VTL templates and call the Lambda function directly. Click “Save Resolver” to apply the changes:

 

Repeat the steps to attach the Lambda function to the mutation createCar. Next we execute a mutation in the AWS AppSync query editor to test we can successfully write data to our SQL backend:

 

 

Clean up

RDS Proxy and the RDS instance are billed per hour, and Secrets Manager secrets are billed per month. GraphQL APIs in AppSync and Lambda functions are charged only when invoked. Terminate resources you won’t use in the future to avoid incurring unplanned or unexpected costs.

 

Conclusion

AWS AppSync and Direct Lambda Resolvers provide a powerful combination for any serverless GraphQL APIs. They allow to handle complex queries without attaching additional resolvers to nested field types, helping to isolate business logic with your programming language of choice as well as improving the GraphQL query’s performance by reducing the number of resolver invocations.

Amazon RDS Proxy allows applications to pool and share database connections to improve their ability to scale. It is fully compatible with MySQL and PostgreSQL. You can enable RDS Proxy for most applications with no code changes. It scales automatically to accommodate virtually any workload.

With all the services we showcased in this article you don’t need to ditch SQL to work with serverless architectures. You can bring all your expertise in SQL, spin up an AppSync GraphQL API, take advantage of a typed data query language with GraphQL in a backend infrastructure fully managed by AWS.