AWS Open Source Blog

Using Apollo Server on AWS Lambda with Amazon EventBridge for real-time, event-driven streaming

GraphQL is an application-level query language that helps clients and servers communicate by establishing a common protocol for queries. It represents an alternative to the REST style: unlike REST, GraphQL gives the client, not the server, the power to define what kind of data will be included in the response to its query.

GraphQL allows clients to efficiently query multiple backend applications with a single endpoint. Additionally, GraphQL introduces the concept of subscriptions for real-time updates. With subscriptions, a client can tell a server which data it wants to get pushed, for example via a WebSocket connection as seen in this post.

The easiest way to get started with GraphQL is AWS AppSync, a fully managed service that handles the heavy lifting of securely connecting to data sources such as Amazon DynamoDB and AWS Lambda, making it easy to develop GraphQL APIs. Check out “Simplify out of band AWS AppSync real-time subscriptions with Amazon EventBridge” to learn more about event-driven real-time streaming with AWS AppSync.

While AppSync is a great way to get started, many of our customers also choose to run their own GraphQL API Layer. This post shows how to build a serverless event-driven streaming architecture for real-time client-server communication with Apollo Server, one of the most popular open-source GraphQL servers.

We use the AWS Cloud Development Kit (AWS CDK). This is an open-source software development framework to model and provision cloud application resources. Using the CDK can reduce the complexity and amount of code needed to automate the deployment of resources.

Overview of the solution

This solution uses Amazon API Gateway, AWS Lambda, Amazon DynamoDB, Amazon EventBridge, and Amazon Translate.

These services are part of the AWS serverless platform and help reduce undifferentiated work around managing servers, infrastructure, and the parts of the application that add less value to your customers. With serverless, the solution scales automatically and has built-in high availability, and you pay only for the resources you use.

The sample application of this blog post is a simple chat app that receives a text message from the client and responds with French and German translations of the message.

Flow chart showing how the Chat application communicates through the API Gateway via Websocket, over the Eventbus, and in and out of Amazon Translate.

Note: GraphQL is agnostic about the protocol that is used for the communication between client and server and does not require a REST API, but many GraphQL client libraries expect a RESTful API for mutation and query operations. The sample code thus deploys both types of APIs for convenience, but for the remainder of the post, we will focus on the WebSocket API Gateway and omit the REST API for brevity.

This diagram outlines the workflow implemented in this blog:

  1. Client C connects to the API Gateway via WebSocket connection.
  2. C creates a subscription with Apollo to receive new messages via server push. The details of the subscription are stored in Amazon DynamoDB as a (connectionId, chatId)-tuple.
  3. C sends a message to the API via GraphQL mutation.
  4. The RequestHandler Lambda function gets invoked by the respective API Gateway and publishes the message to the EventBridge event bus.
  5. EventBridge invokes two Lambda functions that use Amazon Translate to translate the user message into German and French. After translating, the Lambda functions publish a new event that contains the translated message.
  6. EventBridge invokes the ResponseHandler Lambda function with the responses. ResponseHandler looks up the connection details in DynamoDB and pushes the messages to every connection that subscribed.
  7. If the user disconnects, the ConnectionHandler Lambda function removes the (connectionId, chatId)-tuple from DynamoDB.

Walkthrough

The following walkthrough explains the components and how the provisioning can be automated via CDK. For this walkthrough, you need:

To deploy the sample stack:

  1. Clone the associated GitHub repository by running the following command in a local directory:
    git clone https://github.com/aws-samples/apollo-server-on-aws-lambda
  2. Open the repository in your preferred local editor and review the contents of the src and cdk folders.
  3. Follow the instructions in the README.md to deploy the stack. You will see an output like the one below. As mentioned earlier, the REST API is created for compatibility with common GraphQL clients, but will not be used in this blog post.
Outputs:

ApolloServerStack.RestApiEndpoint      = https://xxxxxxxxxx.execute-api.eu-west-1.amazonaws.com/dev/graphql
ApolloServerStack.WebsocketApiEndpoint = wss://yyyyyyyyyy.execute-api.eu-west-1.amazonaws.com/dev
  4. Establish a WebSocket connection with the WebsocketApiEndpoint.
    1. Send the following message to subscribe to the chat:
{"query":"subscription Subscription {
chat(chatId: \"chat\")
}","variables":{}}
    2. Send the following message over the WebSocket. The backend will respond with the German and French translations of “Good morning”:
{"query":"mutation postMessage {\n  putEvent(message: \"Good morning\", chatId: \"chat\") { Entries { EventId } }\n}","variables":{}}

A screenshot of the messages passing over EventBridge
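The two messages from the walkthrough can also be built programmatically. The following is a minimal TypeScript sketch: the helper names buildSubscriptionMessage and buildPostMessage are hypothetical and not part of the sample repository, and the sketch assumes that chat IDs and messages are simple strings for which JSON.stringify yields a valid GraphQL string literal.

```typescript
// Hypothetical helpers (not part of the sample repository) that build the
// JSON payloads used in the walkthrough.
interface GraphQLMessage {
    query: string;
    variables: Record<string, unknown>;
}

// Builds the message that subscribes a connection to a chat.
function buildSubscriptionMessage(chatId: string): string {
    const message: GraphQLMessage = {
        // JSON.stringify quotes the argument and escapes embedded quotes
        query: `subscription Subscription {\n  chat(chatId: ${JSON.stringify(chatId)})\n}`,
        variables: {},
    };
    return JSON.stringify(message);
}

// Builds the message that posts a text to a chat via the putEvent mutation.
function buildPostMessage(text: string, chatId: string): string {
    const message: GraphQLMessage = {
        query:
            `mutation postMessage {\n  putEvent(message: ${JSON.stringify(text)}, ` +
            `chatId: ${JSON.stringify(chatId)}) { Entries { EventId } }\n}`,
        variables: {},
    };
    return JSON.stringify(message);
}

// Example payloads for the chat used in this post
const subscribePayload = buildSubscriptionMessage("chat");
const postPayload = buildPostMessage("Good morning", "chat");
```

Each resulting string can be sent as one text frame over the open WebSocket connection, for example with the send method of a WebSocket client.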

The AWS CDK deploys a number of resources into your account that can be put into three logical groups: client-facing API, Apollo Server, and event-driven message processors.

Client-facing API

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the “front door” for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications.

The CDK deploys two different API Gateways that allow clients to communicate with the backend of the application via REST or WebSocket. As mentioned in the beginning, we focus on the WebSocket API to process all three GraphQL operation types (Query, Mutation, and Subscription).

In API Gateway, you can create a WebSocket API as a stateful frontend for an AWS service, in our case AWS Lambda. AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. With Lambda, you can run code for virtually any type of application or backend service—all with zero administration.

The WebSocket API invokes the backend based on the content of the messages it receives from client apps. There are three predefined routes that can be used: $connect, $disconnect, and $default.

$connect and $disconnect are used by clients to initiate or end a connection with the API Gateway. Each route has a backend integration that is invoked for the respective event. In this example, a Lambda function gets invoked with details of the event. To track each of the connected clients, the handler persists the connection identifiers in a DynamoDB table. Amazon DynamoDB is a fully managed, serverless, key-value NoSQL database designed to run high-performance applications at any scale.

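import type { APIGatewayEvent } from "aws-lambda";

export async function connectionHandler(event: APIGatewayEvent) {
    const { eventType, connectionId } = event.requestContext;

    if (eventType === "CONNECT") {
        // Handle the CONNECT event, e.g. store connectionId in DynamoDB
    }
    if (eventType === "DISCONNECT") {
        // Handle the DISCONNECT event, e.g. delete connectionId from DynamoDB
    }
    // Return a success response so that API Gateway accepts the connection
    return { statusCode: 200 };
}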
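To illustrate what the two placeholder branches might do, here is a minimal sketch with the storage hidden behind an interface. The ConnectionStore interface, the InMemoryConnectionStore class, and the handleConnectionEvent function are hypothetical names introduced for illustration; in the sample, DynamoDB plays the role of the store.

```typescript
// Only the fields of the WebSocket event that this sketch uses.
interface ConnectionEvent {
    requestContext: { eventType?: string; connectionId?: string };
}

// Hypothetical storage abstraction; in the sample this would be DynamoDB.
interface ConnectionStore {
    add(connectionId: string): Promise<void>;
    remove(connectionId: string): Promise<void>;
}

// In-memory stand-in for local experiments and tests.
class InMemoryConnectionStore implements ConnectionStore {
    readonly connections = new Set<string>();

    async add(connectionId: string): Promise<void> {
        this.connections.add(connectionId);
    }

    async remove(connectionId: string): Promise<void> {
        this.connections.delete(connectionId);
    }
}

// Tracks connections on CONNECT and forgets them on DISCONNECT.
async function handleConnectionEvent(
    event: ConnectionEvent,
    store: ConnectionStore,
): Promise<{ statusCode: number }> {
    const { eventType, connectionId } = event.requestContext;
    if (!connectionId) {
        return { statusCode: 400 };
    }
    if (eventType === "CONNECT") {
        await store.add(connectionId);
    } else if (eventType === "DISCONNECT") {
        await store.remove(connectionId);
    }
    return { statusCode: 200 };
}
```

Passing the store as a parameter keeps the handler logic testable without AWS credentials; a DynamoDB-backed implementation of the same interface could be swapped in for deployment.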

The $default route is used when the route selection expression produces a value that does not match any of the other route keys in your API routes. For this post, we use it as a default route for all messages sent to the API Gateway by a client. For each message, a Lambda function is invoked with an event of the following format.

{
    "requestContext": {
        "routeKey": "$default",
        "messageId": "GXLKJfX4FiACG1w=",
        "eventType": "MESSAGE",
        "messageDirection": "IN",
        "connectionId": "GXLKAfX1FiACG1w=",
        "apiId": "3m4dnp0wy4",
        "requestTimeEpoch": 1632812813588,
        // some fields omitted for brevity
    },
    "body": "{\"query\":\"subscription Subscription {\n  event\n}\n\",\"variables\":{}}",
    "isBase64Encoded": false
}

The invoked Lambda function then runs Apollo Server to process the GraphQL query in the body field. This sets off the asynchronous message processing flow that at the end pushes the translated text back to the client; we will explain later in detail how this works.
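Before the query reaches Apollo Server, the handler has to decode the body field of the event. The following sketch shows one defensive way to do that; the WebSocketMessageEvent and GraphQLOperation types and the decodeOperation function are hypothetical and model only the fields shown in the event above.

```typescript
// Only the fields of the $default route event that this sketch uses.
interface WebSocketMessageEvent {
    requestContext: { routeKey: string; connectionId: string };
    body: string | null;
}

// Shape of the JSON document that the client sends in the message body.
interface GraphQLOperation {
    query: string;
    variables: Record<string, unknown>;
}

// Returns the decoded operation, or undefined if the body is missing,
// is not valid JSON, or does not contain a query string.
function decodeOperation(event: WebSocketMessageEvent): GraphQLOperation | undefined {
    if (!event.body) {
        return undefined;
    }
    try {
        const parsed: unknown = JSON.parse(event.body);
        if (typeof parsed === "object" && parsed !== null) {
            const candidate = parsed as { query?: unknown; variables?: unknown };
            if (typeof candidate.query === "string") {
                return {
                    query: candidate.query,
                    variables: (candidate.variables ?? {}) as Record<string, unknown>,
                };
            }
        }
    } catch {
        // Malformed JSON falls through to the undefined result below
    }
    return undefined;
}
```

Returning undefined instead of throwing lets the caller decide how to answer a malformed message, for example with a 400 response.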

Apollo Server

The Apollo Server is deployed in a Lambda function via the apollo-server-lambda package. The package adds helper methods that allow you to create Lambda handlers from an Apollo Server as follows:

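import { ApolloServer } from "apollo-server-lambda";
import { parse, validate } from "graphql";

// schema, dynamoDbClient, and generateLambdaProxyResponse are defined elsewhere in the sample code
const server = new ApolloServer({
    schema,
});

const graphQlHandler = server.createHandler();

export async function handleMessage(event: any) {
    // Remove line breaks so that the body can be parsed as JSON
    const operation = JSON.parse(event.body.replace(/\n/g, ""));
    const graphqlDocument = parse(operation.query);
    const validationErrors = validate(schema, graphqlDocument);
    // event.resource is only set for requests that arrive via the REST API
    const isWsConnection: boolean = !event.resource;
    // The first definition of the document holds the requested operation
    const definition: any = graphqlDocument.definitions[0];

    if (definition.operation === "subscription") {
        if (!isWsConnection) {
            return generateLambdaProxyResponse(400, "Subscriptions are not supported via REST");
        }
        const connectionId = event.requestContext.connectionId;
        // Read the chatId argument of the first selected field, e.g. chat(chatId: "chat")
        const chatId = definition.selectionSet.selections[0].arguments[0].value.value;

        // DynamoDB removes the item after the TTL (epoch seconds) has passed
        const oneHourFromNow = Math.round(Date.now() / 1000 + 3600);
        await dynamoDbClient.put({
            TableName: process.env.TABLE_NAME!,
            Item: {
                chatId: chatId,
                connectionId: connectionId,
                ttl: oneHourFromNow,
            },
        }).promise();

        return generateLambdaProxyResponse(200, "Ok");
    }

    // Queries and mutations are processed by Apollo Server
    return graphQlHandler(event);
}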

The handler code first parses the body of the request into a GraphQL document, a structured representation of the query. REST and WebSocket APIs use the same Lambda function as request handler, so the code also needs to inspect where the request originated from. This can be done by checking if event.resource exists, which is only set for REST APIs. Next, the function inspects graphqlDocument.definitions[0].operation to determine the requested operation.
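The routing decision described above can be isolated into a small pure function. This is only a sketch of the decision table: the names isWebSocketEvent, decideAction, and the Action labels are hypothetical and do not appear in the sample code.

```typescript
type OperationType = "query" | "mutation" | "subscription";

// Hypothetical labels for the three outcomes of the handler.
type Action = "rejectSubscriptionOverRest" | "storeSubscription" | "forwardToApollo";

// event.resource is only set for REST API events, so its absence
// indicates that the request arrived over the WebSocket API.
function isWebSocketEvent(event: { resource?: string }): boolean {
    return !event.resource;
}

// Decides what the handler does with a request of the given operation type.
function decideAction(event: { resource?: string }, operation: OperationType): Action {
    if (operation === "subscription") {
        return isWebSocketEvent(event) ? "storeSubscription" : "rejectSubscriptionOverRest";
    }
    // Queries and mutations are handled by Apollo Server for both API types
    return "forwardToApollo";
}

// A subscription over WebSocket is stored; the same request via REST is rejected
const overWebSocket = decideAction({}, "subscription");
const overRest = decideAction({ resource: "/graphql" }, "subscription");
```

Keeping the decision separate from the side effects (DynamoDB writes, Apollo invocation) makes each branch easy to unit test.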

For Mutation and Query operations, the event is passed on to graphQlHandler. graphQlHandler internally forwards the event to an instance of Apollo Server, where the request is processed as defined in the GraphQL schema definition.

type EventDetails {
    EventId: String
    ErrorMessage: String
    ErrorCode: String
}

type Mutation {
    putEvent(message: String!, chatId: String!): Result
}

type Query {
    getEvent: String
}

type Result {
    Entries: [EventDetails]
    FailedEntries: Int
}

type Subscription {
    chat(chatId: String!): String
}

schema {
    query: Query
    mutation: Mutation
    subscription: Subscription
}

Query

Per the GraphQL specification, a Query operation needs to be defined in the schema. However, since messages are handled asynchronously, we only provide a dummy implementation for Query.

Query: {
    getEvent: () => "Hello from Apollo!",
}

Mutation

In GraphQL, mutations are used to modify server-side data. In most implementations, mutations trigger synchronous operations that change data and return a result. Here, mutating data means submitting the message to an Amazon EventBridge event bus for further asynchronous processing. Amazon EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale using events generated from your applications, integrated Software-as-a-Service (SaaS) applications, and AWS services. EventBridge delivers a stream of real-time data from event sources to targets like AWS Lambda. You can set up routing rules to determine where to send your data, which lets you build application architectures that react in real time to your data sources, with event publishers and consumers completely decoupled.

To achieve this, we define the following resolver for the putEvent Mutation operation.

Mutation: {
    // Publish the message to the event bus and return the result of the putEvents call
    putEvent: async (_: any, { message, chatId }: any) => eventBridge.putEvents({
        Entries: [
            {
                EventBusName: process.env.BUS_NAME,
                Source: "apollo",
                DetailType: "message.request",
                Detail: JSON.stringify({
                    message: message,
                    chatId: chatId,
                }),
            },
        ],
    }).promise(),
},

putEvent publishes the message to the event bus. The result of the putEvent operation is sent back to the client to acknowledge the message.
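The entry that the resolver publishes can be built by a small helper, which makes its shape easier to test in isolation. This is a sketch: buildRequestEntry and the PutEventsEntry interface are hypothetical names, and the bus name in the example is a placeholder. The field names follow the EventBridge PutEvents request, where Detail must be a JSON string.

```typescript
// The subset of an EventBridge PutEvents entry used in this post.
interface PutEventsEntry {
    EventBusName: string;
    Source: string;
    DetailType: string;
    Detail: string; // EventBridge expects the detail as a JSON string
}

// Builds the entry for a client message; busName is supplied by the caller
// (the resolver above reads it from an environment variable).
function buildRequestEntry(message: string, chatId: string, busName: string): PutEventsEntry {
    return {
        EventBusName: busName,
        Source: "apollo",
        DetailType: "message.request",
        Detail: JSON.stringify({ message, chatId }),
    };
}

// "example-bus" is a placeholder name for illustration
const exampleEntry = buildRequestEntry("Good morning", "chat", "example-bus");
```

The returned object can be placed in the Entries array of a putEvents call.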

Subscription

If a Subscription request is found, the function uses the AWS SDK for JavaScript to store connection details in DynamoDB. This allows the ResponseHandler function to later route responses received via EventBridge to the correct WebSocket connection.
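The item stored for a subscription pairs the chat with the connection and carries an expiry timestamp. The sketch below shows how such an item could be assembled; buildSubscriptionItem and its parameters are hypothetical, and it assumes the table has DynamoDB Time to Live enabled on the ttl attribute, which expects a timestamp in epoch seconds.

```typescript
// Item that links a WebSocket connection to the chat it subscribed to.
interface SubscriptionItem {
    chatId: string;
    connectionId: string;
    ttl: number; // expiry time in epoch seconds
}

// nowMs and ttlSeconds are parameters so that the function is deterministic
// in tests; the defaults give an expiry one hour from now.
function buildSubscriptionItem(
    chatId: string,
    connectionId: string,
    nowMs: number = Date.now(),
    ttlSeconds: number = 3600,
): SubscriptionItem {
    return {
        chatId,
        connectionId,
        ttl: Math.round(nowMs / 1000 + ttlSeconds),
    };
}

// Example with a fixed clock: 1,000,000 ms is 1,000 s, plus 3,600 s of TTL
const exampleItem = buildSubscriptionItem("chat", "GXLKAfX1FiACG1w=", 1000000);
```

The expiry acts as a safety net: if a disconnect event is missed, stale subscriptions still disappear from the table eventually.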

Event-driven message processors

The architecture uses EventBridge to decouple Apollo Server from the actual processors of the received messages. Received messages are published to an event bus where consumers can subscribe to receive messages. In this case, two Lambda functions are subscribed to messages with DetailType: "ClientMessageReceived". When such a message is received, EventBridge invokes the subscribed Lambda functions with it. Each Lambda function takes the message and calls Amazon Translate via the AWS SDK to translate it into German or French. Amazon Translate is a neural machine translation service that delivers fast, high-quality, affordable, and customizable language translation.

After the message is translated, the Lambda function writes the result into the event bus with DetailType: "ClientMessageTranslated".
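A translation processor can be sketched as follows. The function translateMessage and the Translate type are hypothetical; the actual call to Amazon Translate is injected as a function so that the sketch runs without AWS access. In a deployed function, that parameter would wrap the TranslateText operation of the AWS SDK.

```typescript
// Payload carried in the detail field of the events in this post.
interface MessageDetail {
    message: string;
    chatId: string;
}

// Injected translation function; in a deployment it would call Amazon Translate.
type Translate = (text: string, targetLanguageCode: string) => Promise<string>;

// Translates the message and builds the response event to publish.
async function translateMessage(
    detail: MessageDetail,
    targetLanguageCode: string,
    translate: Translate,
): Promise<{ DetailType: string; Detail: string }> {
    const translated = await translate(detail.message, targetLanguageCode);
    return {
        DetailType: "ClientMessageTranslated",
        // The chatId is passed through so the response can be routed later
        Detail: JSON.stringify({ message: translated, chatId: detail.chatId }),
    };
}

// Stand-in translator for local experiments; it only tags the text
const fakeTranslate: Translate = async (text, code) => `[${code}] ${text}`;
```

Calling translateMessage once with the language code "de" and once with "fr" corresponds to the two processors described above.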

The ResponseHandler Lambda function subscribes to every response event. When invoked with such an event, it queries DynamoDB to look up the connectionIds that subscribed to this conversation. Finally, it uses the postToConnection() call to trigger a server-side push of the response over the respective WebSocket connections.

async function getConnectionsSubscribedToTopic(topic: string) {
    // Look up all connections that subscribed to the given chat
    const { Items: connections } = await dynamoDbClient.query({
        TableName: process.env.TABLE_NAME!,
        KeyConditionExpression: 'chatId = :chatId',
        ExpressionAttributeValues: {
            ':chatId': topic,
        },
    }).promise();

    return connections;
}

export async function handler(event: EventBridgeEvent<"EventResponse", ResponseEventDetails>) {
    const connections = await getConnectionsSubscribedToTopic(event.detail.chatId);
    // Push the message to every subscribed connection
    const postToConnectionPromises = (connections ?? []).map((c: any) => gatewayClient.postToConnection({
        ConnectionId: c.connectionId,
        Data: JSON.stringify({ data: event.detail.message })
    }).promise());

    // allSettled waits for all pushes and does not reject if a single push fails
    await Promise.allSettled(postToConnectionPromises);

    return true;
}
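The handler above waits for all pushes but does not look at their outcome. As a possible extension, the following sketch collects which connections received the message and which did not. The names pushToConnections, PostToConnection, and PushSummary are hypothetical, and the push call is injected so that the sketch runs without API Gateway; it uses Promise.all with per-promise handlers, which behaves like Promise.allSettled for this purpose.

```typescript
// Injected push function; in a deployment it would wrap postToConnection.
type PostToConnection = (connectionId: string, data: string) => Promise<void>;

interface PushSummary {
    delivered: string[];
    failed: string[];
}

// Pushes the message to every connection and reports the outcome per connection.
async function pushToConnections(
    connectionIds: string[],
    message: string,
    post: PostToConnection,
): Promise<PushSummary> {
    const data = JSON.stringify({ data: message });
    const outcomes = await Promise.all(
        connectionIds.map((id) =>
            // Convert both outcomes into values so that one failure
            // does not reject the whole batch
            post(id, data).then(
                () => ({ id, ok: true }),
                () => ({ id, ok: false }),
            ),
        ),
    );
    return {
        delivered: outcomes.filter((o) => o.ok).map((o) => o.id),
        failed: outcomes.filter((o) => !o.ok).map((o) => o.id),
    };
}
```

The list of failed connections could be used, for example, to log delivery problems or to remove connections that no longer exist from the table.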

Cleaning up

Many services in this blog post are available in the AWS Free Tier. However, using this solution may incur costs, and you should tear down the stack if you don’t need it anymore. Cleanup steps are included in the README.md in the repository.

Conclusion

This blog post covered how you can use Apollo Server on AWS Lambda in an event-driven architecture. We showed how to run Apollo Server on AWS Lambda, integrate it with REST and WebSocket APIs, and communicate asynchronously via an event bus.

Visit our documentation on event-driven architectures on AWS to learn more about how this can help build resilient, scalable, and cost-efficient architectures.

Markus Ziller

Markus Ziller is a Senior Solutions Architect at AWS with a passion for lean architectures and helping our customers build great applications and services with cloud-native, serverless technologies. Prior to this role, Markus worked in different roles in the media and entertainment industry, where he helped build the leading German video streaming service and drive the digital transformation of one of the largest traditional European mass media companies.

David Boldt

David Boldt is a Solutions Architect at Amazon Web Services. He helps customers build secure and scalable solutions that meet their business needs. He specializes in IoT and Robotics to address industry-wide challenges, leveraging technologies to drive innovation and efficiency across various sectors.