Creating serverless GraphQL APIs from RDS databases with AWS AppSync and PostGraphile

GraphQL is a query language for APIs that provides an understandable description of the data in your API, and that allows clients to ask for data in the shape that they need it. GraphQL helps developers implement applications faster, by giving developers the ability to query multiple databases, microservices, and APIs with a single GraphQL API endpoint. As a serverless service that scales on demand, AWS AppSync makes it easy for developers to create production-ready GraphQL APIs that can connect to many different types of datasources in their AWS account, such as PostgreSQL databases.

PostgreSQL is one of the most popular SQL databases and is used to power many applications. Developers often want to expose their data to their applications, without making their database publicly available. The Backend For Frontend (BFF) pattern is often used to deploy application-specific APIs that allow clients to access the data that they need. GraphQL APIs can be used to make the right data available to apps in the necessary format. In a previous post, we had shown how to create an AWS AppSync API that uses an existing SQL database as a datasource by leveraging AWS Lambda functions and Amazon RDS Proxy. However, generating a schema from an existing database is a complicated process that is hard to solve manually.

In this post, we present a solution that leverages PostGraphile to automatically generate an AppSync compliant schema from PostgreSQL tables, and uses Lambda functions to resolve GraphQL queries against a PostgreSQL database in Amazon Relational Database Service (Amazon RDS). The solution is serverless, and it can be deployed in a few clicks. It uses the AWS Cloud Development Kit (AWS CDK), doesn’t require writing any code, supports subscriptions, and works with any PostgreSQL database, such as Amazon RDS for PostgreSQL and Amazon Aurora PostgreSQL.

Solution overview

Visit the GitHub repo to deploy the solution.

Diagram provides an overview of the deployed solution.

The diagram above provides an overview of the deployed solution. It consists of one AWS AppSync API and two Lambda functions: a provider function, and a resolver function. Both functions use enhanced VPC networking to connect to the RDS Proxy. The provider function is responsible for generating the GraphQL schema, updating the AppSync API, and generating a Lambda layer with the cached schema that is attached to the resolver function. The resolver function is a generic function that resolves GraphQL operations from AppSync. The resolver function behavior depends on the cached schema in its attached Lambda layer.

Both functions leverage PostGraphile to interact with the database. PostGraphile is an open source tool that can automatically detect tables, columns, indexes, relationships, views, types, functions, comments, and more. It can generate a schema from a PostgreSQL database and resolve GraphQL queries.

Here is how the solution works:

Start by deploying the CDK-based solution. The solution creates an AWS AppSync API with a datasource that uses the resolver Lambda function, and an AppSync function that uses that datasource.
Once the solution is deployed, a user runs the provider function to analyze the RDS PostgreSQL database and generate the GraphQL schema.
The provider function retrieves schema information from RDS database using PostGraphile.
The provider updates the Layer function attached to the resolver Lambda function and updates the AWS AppSync API. It updates the schema, and properly sets up the queries, mutations, and subscriptions. Note that a user can repeat Step 1 at any time (e.g., after a database schema change) to update the AWS AppSync API definition.
The AWS AppSync API is now ready to process requests. A GraphQL request is made.
AWS AppSync authorizes the request using the configured Authorization Mode (API KEY, Cognito User Pool, etc.).
AWS AppSync resolves the request by calling the attached Direct Lambda Resolver. The identity of the user is included in the request to the resolver Lambda function.
The Lambda function resolves the query using the PostGraphile schema and RDS database.

Getting started

To get started, we need the following to enable connections to our database:

an RDS Postgres database
an RDS Proxy associated with our RDS Postgres database
- we use AWS Identity and Access Management (IAM) authentication for databases
- and we securely store credentials in AWS Secrets Manager.
at least one private subnet with an NAT that our Lambda function ENIs will be deployed into, and a VPC Security Group
- this security group must be an allowed source of traffic for our RDS Proxy security group

We also must know the following information about our Postgres database:

database to connect to
schema(s) of interest (containing our tables and functions)
username/role to use to execute queries. This role should have the scoped-down privileges required to access the schema(s). See this AWS post for more details on security best practices for Amazon RDS for PostgreSQL. The provider uses the postgres role for configuration. The resolver uses our provided username/role to run queries.

We can deploy the solution by visiting this GitHub repo. The repo also provides a helper CDK app that we can use to deploy a sample VPC configured with a sample RDS PostgreSQL database, RDS Proxy, and sample data.

To deploy the solution, we run the following commands after updating with our account values.

git clone https://github.com/aws-samples/appsync-with-postgraphile-rds.git
cd appsync-with-postgraphile-rds
npm install

cd ./appsync-with-postgraphile

REGION="<region>"
RDSPROXYNAME="<rdsproxyname>"
SECURITYGROUPID="<security-group-id>"
USERNAME="<username>"
DATABASE="<database>"
SCHEMAS="<schemas>"

npm run deploy -- --region $REGION --proxy $RDSPROXYNAME --sg $SECURITYGROUPID --username $USERNAME --database $DATABASE --schemas $SCHEMAS

Then, from the appsync-with-postgraphile directory, we run the update command to update our AWS AppSync API with the latest database schema definition.

npm run update

Interacting with the API

To showcase the functionality, let’s use a deployed solution generated from the sample RDS schema provided in the GitHub repo. The schema defines a person table and a post table (partially shown as follows). A person can own multiple posts.

create table forum_example.person (
  id               serial primary key,
  username         text unique,
  first_name       text not null check (char_length(first_name) < 80),
  last_name        text check (char_length(last_name) < 80),
  about            text,
  created_at       timestamp default now()
);

create table forum_example.post (
  id               serial primary key,
  author_username  text not null references forum_example.person(username),
  headline         text not null check (char_length(headline) < 280),
  body             text,
  topic            forum_example.post_topic,
  created_at       timestamp default now()
);

comment on constraint "post_author_username_fkey" on forum_example.post is E'@foreignFieldName posts\n@fieldName author';

Once deployed, we can interact with the API directly from the AWS AppSync Console. The generated API lets us interact with the different data types. With PostGraphile, relationships between tables are automatically discovered. For example, we can query for people, and we can get the list of posts associated with each person (through the reference on author_username) in a single query. The resolver Lambda functions uses the PostGraphile library to execute an optimized SQL query against our database to resolve the GraphQL query.

Querying for people and their posts using the Queries editor.

Leveraging row-level security

PostgreSQL has strong row-level security policies that we can utilize with PostGraphile and AWS AppSync. In the solution, when a GraphQL query is made, the identity object from AWS AppSync’s $context variable is passed to PostGraphile, which is then made available through current_setting(...) within PostgreSQL. Then, the identity values can be leveraged by row level policies.

We can leverage this in our solution by changing the default authorization mode of the API from API_KEY to AMAZON_COGNITO_USER_POOLS, and specifying the Cognito User Pool to use for authorization. We can do this by visiting the Settings page of our API in the AppSync console. Our demo person table is preloaded with a person with the username “johndoe”. We create a “johndoe” user in our Cognito User Pool and log in as that user on the AppSync console Queries page. Now, when we make a request, the following information is made available to PostgreSQL current session:

{
  "appsync.identity_claims_sub": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "appsync.identity_claims_email_verified": true,
  "appsync.identity_claims_iss": "https://cognito-idp.us-east-2.amazonaws.com/us-east-2_xxxxxxxxx",
  "appsync.identity_claims_cognito:username": "johndoe",
  "appsync.identity_claims_origin_jti": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "appsync.identity_claims_aud": "xxxxxxxxxxxxxxxxxxxxxxxxxx",
  "appsync.identity_claims_event_id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "appsync.identity_claims_token_use": "id",
  "appsync.identity_claims_auth_time": 1655168739,
  "appsync.identity_claims_exp": 1655172339,
  "appsync.identity_claims_iat": 1655168739,
  "appsync.identity_claims_jti": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "appsync.identity_claims_email": "xxxxxxxxxxxxxxxxxxxxx",
  "appsync.identity_defaultAuthStrategy": "ALLOW",
  "appsync.identity_issuer": "https://cognito-idp.us-east-2.amazonaws.com/us-east-2_xxxxxxxxx",
  "appsync.identity_sourceIp_0": "xx.xxx.xx.xxx",
  "appsync.identity_sub": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
  "appsync.identity_username": "johndoe"
}

In our schema, the following policy restricts updates to rows in the person table.

alter table forum_example.person enable row level security;

create policy update_person on forum_example.person for update to forum_example_person
  using (username = current_setting('appsync.identity_username')::text);
create policy select_person on forum_example.person for select using (true);

Now, when we try to update a person using the updatePerson mutation, the mutation is only allowed if the requester’s username (appsync.identity_username) matches the username value of the row that is being edited.

We can also leverage identity to implement custom functions. For example, the current_person function uses the appsync.identity_username setting to retrieve the information in the person table about the user making the request.

create function forum_example.current_person() returns forum_example.person as $$
  select *
  from forum_example.person
  where username = current_setting('appsync.identity_username')::text
$$ language sql stable;

The current_person function is available as a query in the schema as the currentPerson query field.

Querying for the current person using the Queries editor

Subscriptions and Pub/Sub features

One of the more challenging GraphQL features to implement is subscriptions. AWS AppSync provides support for subscriptions out of the box, so there’s no specific implementation required to get Pub/Sub features from the API. Each generated subscriptions supports a filter argument that clients can use to filter subscriptions using AppSync’s Enhanced Subscription Filtering.

The following is an example of filtering subscriptions where the value of the about field begins with “about”.

Receiving a notification via subscription in the Queries editor

With Enhanced Subscription Filtering, we can also add filtering on the server side. For example, we can filter subscriptions to restrict access based on the identity of the user making the request by updating the onUpdatePerson response mapping template.

$extensions.setSubscriptionFilter({
  "filterGroup": [
    {
      "filters": [
        {
          "fieldName": "username",
          "operator": "eq",
          "value": $ctx.identity.username,
        },
      ],
    },
  ],
})
$util.toJson(null)

Conclusion

AWS AppSync is a service that allows developers to build scalable, serverless GraphQL APIs. AWS AppSync offers automatic Pub/Sub functionality with subscriptions that we can use to implement real-time features in our applications. AWS AppSync’s strong authorization and security feature allows developers to easily authorize requests and implement rules to restrict data access to the right identities. AWS AppSync’s integrated Query editor in the AWS AppSync Console makes it simple to interact and test APIs. By combining AWS AppSync with PostGraphile, developers can unlock the data that lives in their RDS PostgreSQL databases and make it accessible to their application without having to write schemas manually or implement custom business logic. This solution removes the heavy-duty work that is often needed to create custom APIs for databases. Furthermore, it allows developers to get started with a serverless AWS AppSync GraphQL API in a manner of minutes. Get started today by visiting the solution in the GitHub repo.