Front-End Web & Mobile

Implement Multi-Region Serverless (and Functionless) WebSocket Pub/Sub APIs with AWS AppSync and Amazon EventBridge

AWS AppSync allows developers to easily implement engaging real-time application experiences by automatically publishing data updates to subscribed API clients via serverless WebSockets connections.

With built-in support for WebSockets, AppSync can seamlessly push data to clients that choose to listen to specific events from the backend. This means that you can make any supported data source in AppSync real-time, with connection management handled automatically between the clients and the API endpoint. Real-time data, connections, scalability, fan-out and broadcasting are all handled by AppSync, allowing you to focus on your application business use cases and requirements instead of dealing with the complex infrastructure needed to manage WebSocket connections at scale.

As businesses grow, they need to support a global economy and connect to users based in different countries, while making data accessible with low latency and maintaining application responsiveness to provide a good user experience. Real-time applications enhance user engagement by automatically pushing meaningful data or events to clients as soon as they happen. A common use case in collaborative applications is when users create, update or send data to a specific channel, topic or subject that other users are actively listening or subscribed to, with data automatically pushed to specific groups of clients when certain conditions are met. With a global application, users in one country might need to automatically broadcast data to multiple users located in a different region.

However, a globally distributed application usually requires complex infrastructure based on a highly available architecture with lots of moving parts, which is harder for architects and developers to design, build and maintain. What if you could implement a highly available global multi-region data API with built-in support for real-time WebSockets with only two AWS services that automatically scale according to demand? All you need are serverless services that address your application integration needs and take care of all the undifferentiated heavy lifting of managing infrastructure at scale: AppSync itself and Amazon EventBridge, a serverless event bus that makes it easier to build event-driven applications at scale using events generated from backend services.

Overview

In this article we showcase how to implement a global WebSocket API where messages or events are automatically pushed to end user clients, web or native mobile, listening to a given channel in a pure and simple publish-subscribe (pub/sub) pattern. Users send and broadcast data in real-time to subscribed clients both locally and in another region.

Our global application allows for a seamless bi-directional data flow where users from one side of the planet can push data to users on the other side of the planet, and vice-versa. The workflow can be better understood with the following steps:

  1. A user sends a data payload to a specific channel containing a message with a simple GraphQL mutation.
  2. AppSync converts the payload to an event and sends it to an EventBridge event bus with the channel details and the message.
  3. In the same region, EventBridge re-publishes the event to AppSync using API destinations, which converts it back to a second GraphQL mutation (pub) unrelated to the first one. Clients in the same region receive the data because they are subscribed (sub) to the specific mutation EventBridge sends to AppSync in this step. The mutation invoked in step 1 has no subscribers linked to it. To send data to the other region, EventBridge uses cross-region event routing, which forwards the event to another event bus that can be considered its cross-region counterpart with the same configuration.
  4. The source event bus assumes an IAM role with appropriate permissions. The event bus in the destination region receives the event.
  5. The event bus in the destination region executes a rule that sends the data received from the source event bus to the AppSync API destination in the same region.
  6. AppSync automatically pushes data to clients subscribed to specific channels. Clients in both the local and the remote region receive their messages in near real-time.
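
The steps above can be sketched in a few lines of TypeScript. This is only an illustrative model of the data flow, not the deployed configuration: the ChannelEventDetail shape, the helper names, and the argument names of the publishFromBus mutation are assumptions made for the example (the mutation name itself appears in the subscription directive later in this article).

```typescript
// Hypothetical shape of the event detail that carries a channel message.
interface ChannelEventDetail {
  name: string;    // channel name
  message: string; // message payload
}

// Hypothetical GraphQL request body posted back to AppSync (steps 3 and 5).
interface GraphQLRequest {
  query: string;
  variables: ChannelEventDetail;
}

// The second mutation, the one clients are actually subscribed to.
// Argument names are assumed to mirror the Channel fields.
const PUBLISH_FROM_BUS = `mutation PublishFromBus($name: String!, $message: String!) {
  publishFromBus(name: $name, message: $message) { name message }
}`;

// Convert an event taken from the bus into the second mutation.
function toPublishFromBusRequest(detail: ChannelEventDetail): GraphQLRequest {
  return {
    query: PUBLISH_FROM_BUS,
    variables: { name: detail.name, message: detail.message },
  };
}

// Fan the same event out to every regional API: the local one (step 3)
// and the cross-region counterpart (steps 4 and 5).
function fanOut(detail: ChannelEventDetail, regions: string[]): Map<string, GraphQLRequest> {
  const requests = new Map<string, GraphQLRequest>();
  for (const region of regions) {
    requests.set(region, toPublishFromBusRequest(detail));
  }
  return requests;
}

const regionalRequests = fanOut(
  { name: "tech", message: "Hello from Oregon!" },
  ["us-west-2", "ap-southeast-2"],
);
console.log(regionalRequests.size, "regional mutations prepared");
```

In the actual solution this conversion is done by EventBridge itself, without any function code, which is what makes the architecture functionless.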

In the following sections we show how to quickly deploy the multi-region serverless backend to your AWS account using the AWS CDK. After that we use a simple React web app to test the end-to-end cross-region real-time messaging flow.

Enhanced Filtering

The solution implements the recently released real-time Enhanced Filtering feature now supported in AppSync. With the enhanced subscriptions filtering capabilities in AppSync, developers can create a broader range of real-time experiences in their applications by leveraging extra logical operators, server-side filtering, and the ability to trigger subscription invalidations. The additional filter operators give developers more control over the data specific groups of clients need to receive, and invalidation makes it easier to close connections from the AppSync backend when certain conditions are met, providing more options when it comes to authorization use cases. In addition to adding enhanced control over which data is sent to subscribed clients, backend defined filters simplify application code and reduce the amount of data sent to clients.

We implement a single filter allowing users to send or receive messages on 5 specific channels. Users can still send messages to any channel they define; however, a backend filter in AppSync discards the message if it doesn’t match the filtering logic. Clients don’t receive data unless messages are sent to the allowed channels:

api.addSubscription('subscribe', new ResolvableField({
      returnType: channel.attribute(),
      args: { name: GraphqlType.string({ isRequired: true }) },
      directives: [Directive.subscribe('publishFromBus')],
      dataSource: pubsub,
      requestMappingTemplate: MappingTemplate.fromString(`
        {
          "version": "2017-02-28",
          "payload": {
              "name": "demo",
              "message": "AppSync enhanced filtering and invalidation"
          }
        }`
      ),
      // Setting up filters 
      responseMappingTemplate: MappingTemplate.fromString(`
        $extensions.setSubscriptionFilter({
          "filterGroup": [
            {
              "filters" : [
                {
                  "fieldName" : "name",
                  "operator" : "in",
                  "value" : ["cars","robots","tech","music","media"]
                }
              ]
            }
          ]
        })
        $extensions.setSubscriptionInvalidationFilter({
            "filterGroup": [
              {
                "filters" : [
                  {
                    "fieldName" : "name",
                    "operator" : "eq",
                    "value" : $context.args.name
                  }
                ]
              }
            ]
          })
        $util.toJson($context.result)
        `)
    }));
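
To make the effect of the subscription filter easier to reason about, here is a minimal TypeScript sketch that evaluates the same filter definition against a message. It is a simplified model written for illustration, not AppSync's implementation: the type names and the shouldDeliver helper are hypothetical, and only the two operators used above (in and eq) are covered.

```typescript
type Operator = "eq" | "in";

interface Filter {
  fieldName: string;
  operator: Operator;
  value: string | string[];
}

interface SubscriptionFilter {
  filterGroup: { filters: Filter[] }[];
}

// Same definition as the filter set in the response mapping template.
const allowedChannels: SubscriptionFilter = {
  filterGroup: [
    {
      filters: [
        {
          fieldName: "name",
          operator: "in",
          value: ["cars", "robots", "tech", "music", "media"],
        },
      ],
    },
  ],
};

// Evaluate one filter against the published payload.
function matchesFilter(filter: Filter, payload: Record<string, string>): boolean {
  const actual = payload[filter.fieldName];
  if (actual === undefined) return false;
  if (filter.operator === "in") {
    return Array.isArray(filter.value) && filter.value.indexOf(actual) !== -1;
  }
  return filter.value === actual;
}

// Filters inside one group are combined with AND; the groups are combined with OR.
function shouldDeliver(filter: SubscriptionFilter, payload: Record<string, string>): boolean {
  return filter.filterGroup.some((group) =>
    group.filters.every((f) => matchesFilter(f, payload)),
  );
}

console.log(shouldDeliver(allowedChannels, { name: "tech", message: "hi" }));
console.log(shouldDeliver(allowedChannels, { name: "sports", message: "hi" }));
```

Because the check runs in the backend, a message published to a channel outside the list is never pushed to any subscriber, regardless of what the client requested.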

AppSync can trigger an invalidation event when an unsubscribe mutation is invoked with the channel name. All subscribed clients are then automatically unsubscribed from the channel:

mutation Unsubscribe {
  unsubscribe(name: "tech") {
    name
  }
}

The mutation above triggers an invalidation filter and forcibly closes the WebSocket connection of any client subscribed to the channel tech. However, it’s important to make sure clients cannot unsubscribe other clients. Ideally only a backend process or service with specific permissions should be allowed to close clients’ WebSocket connections. Alternatively, the AppSync console itself can be used to perform this type of administrative task. This scenario can be easily implemented by leveraging AppSync’s built-in support for multiple authorization modes in a single API.

End user clients use API Keys to access the API, but only clients authorized with AWS IAM can execute the mutation to invalidate and unsubscribe other clients.

If a regular client tries to execute the unsubscribe mutation, it receives an authorization error.
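
The rule can be pictured with a small TypeScript model. This is only a sketch of the access decision, not AppSync code: the allowedModes table and the authorize helper are hypothetical, and apart from the IAM-only unsubscribe mutation, which comes from the description above, the listed modes per operation are assumptions. The error text is illustrative as well.

```typescript
type AuthMode = "API_KEY" | "AWS_IAM";

// Which authorization modes may invoke each operation (assumed table;
// only the IAM restriction on unsubscribe is taken from the article).
const allowedModes: Record<string, AuthMode[]> = {
  publish: ["API_KEY", "AWS_IAM"],
  subscribe: ["API_KEY", "AWS_IAM"],
  unsubscribe: ["AWS_IAM"],
};

interface AuthDecision {
  allowed: boolean;
  error?: string;
}

// Decide whether a caller using the given mode may run the operation.
function authorize(operation: string, mode: AuthMode): AuthDecision {
  const modes = allowedModes[operation] || [];
  if (modes.indexOf(mode) !== -1) {
    return { allowed: true };
  }
  return { allowed: false, error: `Not authorized to run ${operation} with ${mode}` };
}

console.log(authorize("unsubscribe", "API_KEY"));
console.log(authorize("unsubscribe", "AWS_IAM"));
```

Keeping the administrative mutation behind IAM means the API key distributed with the web app can never be used to disconnect other users.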

What more can we do? (Next Steps)

  • Integrate with existing workloads

In case you just need to implement a real-time feature in an existing globally distributed application, the architecture we share here can be easily integrated to any application or API technology. While there are advantages in using a single API endpoint to securely access, manipulate, and combine data from one or more data sources in a single network call with GraphQL, there’s no need to convert or rebuild an existing REST based application from scratch in order to take advantage of AppSync’s real-time capabilities. For instance, you could have an existing CRUD workload in a separate API endpoint with clients sending and receiving messages or events from the existing application to our global AppSync APIs for real-time and pub/sub purposes only. The APIs are based on a generic implementation that can be used in any scenario that requires data to be pushed to any number of WebSocket clients listening to a channel. It can be easily integrated, modified or enhanced accordingly.

The APIs are currently configured to send a simple string as the message payload. The CDK code can be modified so messages are sent as generic JSON data instead with no specific shape or strongly typed requirements:

const channel = new ObjectType('Channel', {
      directives: [Directive.iam(),Directive.apiKey()],
      definition: {
        name: GraphqlType.string({ isRequired: true }),
        message: GraphqlType.awsJson({ isRequired: true }),
      },
    });

    api.addType(channel);
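
Note that AWSJSON values travel as JSON-encoded strings, so a client has to serialize the message before sending it and parse it after receiving it. The following TypeScript sketch shows one way to do that; the helper names and the ChatMessage shape are hypothetical and only serve the example.

```typescript
// Hypothetical structured message a client may want to send.
interface ChatMessage {
  text: string;
  tags: string[];
}

// Serialize any value into the string form expected by an AWSJSON field.
function toAwsJson(value: unknown): string {
  return JSON.stringify(value);
}

// Parse the string received in an AWSJSON field back into an object.
function fromAwsJson<T>(raw: string): T {
  return JSON.parse(raw) as T;
}

// Variables for a publish mutation whose message argument is AWSJSON.
const variables = {
  name: "tech",
  message: toAwsJson({ text: "Hello from Oregon!", tags: ["demo"] }),
};

// On the receiving side, the subscription payload is parsed again.
const received = fromAwsJson<ChatMessage>(variables.message);
console.log(received.text, received.tags.length);
```

The trade-off is that the message field is no longer validated by the GraphQL type system, so clients need to agree on its shape among themselves.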

GraphQL provides a strongly typed system out of the box. We can further enhance our Channel definition in CDK to mix and match typed and generic JSON data by adding fields to identify user IDs, e-mails, likes, or adding a date when data is sent to the channel, then use generic JSON in the message field if required:

const channel = new ObjectType('Channel', {
    definition: {
        name: GraphqlType.string({ isRequired: true }),
        date: GraphqlType.awsDate({ isRequired: false }),
        likes: GraphqlType.int({ isRequired: false }),
        userId: GraphqlType.id({ isRequired: true }),
        userEmail: GraphqlType.awsEmail({ isRequired: false }),
        message: GraphqlType.awsJson({ isRequired: true })
    },
});

If the Channel type definition changes, be mindful that the new fields need to be reflected in the input transformers of the API destination targets on both EventBridge event buses, so subscribed clients receive the data accordingly.
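
As a rough illustration of what has to change, the sketch below derives an input transformer (an input paths map plus an input template, the two parts EventBridge uses) from a list of Channel fields. It is a simplified model under stated assumptions, not the repository's code: the buildInputTransformer helper is hypothetical, the fields are assumed to sit under $.detail in the event, and every field is quoted as a string, which would need adjusting for numeric fields such as likes.

```typescript
interface InputTransformer {
  inputPathsMap: Record<string, string>; // placeholder name -> JSON path in the event
  inputTemplate: string;                 // request body with <placeholder> markers
}

// Build a transformer that forwards the given fields as GraphQL variables.
// The query is passed in because its variable declarations depend on the schema.
function buildInputTransformer(query: string, fields: string[]): InputTransformer {
  const inputPathsMap: Record<string, string> = {};
  for (const field of fields) {
    // Assumption: each field is stored under the event's detail object.
    inputPathsMap[field] = "$.detail." + field;
  }
  const variables = fields
    .map((field) => `"${field}": "<${field}>"`)
    .join(", ");
  const inputTemplate = `{"query": ${JSON.stringify(query)}, "variables": {${variables}}}`;
  return { inputPathsMap, inputTemplate };
}

// Hypothetical mutation text after adding a userId field to Channel.
const query =
  "mutation PublishFromBus($name: String!, $message: String!, $userId: ID!) " +
  "{ publishFromBus(name: $name, message: $message, userId: $userId) { name message userId } }";

const transformer = buildInputTransformer(query, ["name", "message", "userId"]);
console.log(transformer.inputPathsMap);
console.log(transformer.inputTemplate);
```

Whatever form the transformer takes, the same change has to be applied in both regions, otherwise clients in one region would receive events without the new fields.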

  • Make it even more Event-driven

EventBridge provides scalable and flexible serverless event buses that deliver a stream of real-time data from event sources such as Zendesk, Shopify or other backend services to targets like AWS Lambda, other AWS services and other SaaS applications. You can set up routing rules to determine where to send your data and build application architectures that react in real-time to your data sources with event publishers and consumers completely decoupled. With AppSync serverless WebSockets capabilities, you can effectively add the “last mile” to your workload and seamlessly enable backend events from your business to be delivered in real-time all the way from existing event-driven application architectures to front-end clients, web or mobile.

  • Data persistence

While the solution we showcase in this article can be classified as “Functionless” since it doesn’t require AWS Lambda serverless functions, it also doesn’t persist messages, as they are not saved in a database. If there’s a requirement to persist data in a highly scalable database such as Amazon DynamoDB while supporting global WebSockets for real-time use cases, the architecture can be slightly modified by implementing three AWS serverless services as opposed to two, as discussed in a recently published article in this blog.

  • Active/Active and Active/Passive access

Disaster Recovery strategies can be categorized as Active/Active or Active/Passive and the solution in this article can be extended to implement either of these two strategies.

Active/Passive workloads operate from a single AWS Region, which handles all requests. The infrastructure is duplicated in a passive Region. If a disaster event occurs and the active Region cannot support the workload operation, the passive site becomes the recovery site. With AWS AppSync custom domains, an Active/Passive approach can be achieved out of the box simply by re-assigning the domain (either manually or programmatically) from the primary region to the passive endpoint in the healthy region in case of a disaster. You could modify the solution to use EventBridge global endpoints if there’s a requirement to fail over event ingestion automatically to the secondary region during service disruptions. Since both AppSync and EventBridge are serverless, you are not charged for services in the passive region if they are not used.

If an Active/Active configuration is desired with a single custom domain, this solution can be extended by placing an Amazon CloudFront distribution in front of AWS AppSync in each region. In this case, clients in both Sydney and Oregon can connect to a single DNS domain. The distribution would then have four origins, the GraphQL endpoint for each region (https://xxxxxxxxxxxx.appsync-api.<region>.amazonaws.com/graphql) and the real-time endpoint for each region (wss://xxxxxxxxxxxx.appsync-realtime-api.<region>.amazonaws.com/graphql), taking advantage of the built-in WebSockets support in CloudFront. A Lambda@Edge function can then be used to proxy and direct traffic from CloudFront to the endpoint with the lowest latency. You can find more information in the Multi-region GraphQL API with CloudFront reference architecture.
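
The four origins follow directly from the two URL patterns above. As a small sketch, the TypeScript below derives them from an API ID and a region; the helper names and the placeholder API IDs are hypothetical, and the routing logic of the Lambda@Edge function is intentionally left out.

```typescript
interface RegionalApi {
  apiId: string;  // AppSync API ID, shown as xxxxxxxxxxxx in the URLs above
  region: string; // for example us-west-2 or ap-southeast-2
}

interface RegionalEndpoints {
  graphql: string;
  realtime: string;
}

// Build both endpoint URLs of one regional AppSync API.
function endpointsFor(api: RegionalApi): RegionalEndpoints {
  return {
    graphql: `https://${api.apiId}.appsync-api.${api.region}.amazonaws.com/graphql`,
    realtime: `wss://${api.apiId}.appsync-realtime-api.${api.region}.amazonaws.com/graphql`,
  };
}

// Collect every origin the distribution would need: two per region.
function originsFor(apis: RegionalApi[]): string[] {
  const origins: string[] = [];
  for (const api of apis) {
    const endpoints = endpointsFor(api);
    origins.push(endpoints.graphql, endpoints.realtime);
  }
  return origins;
}

// Placeholder API IDs, one per region.
const origins = originsFor([
  { apiId: "aaaaaaaaaaaa", region: "us-west-2" },
  { apiId: "bbbbbbbbbbbb", region: "ap-southeast-2" },
]);
console.log(origins);
```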

  • Authorization

We set up both AppSync regional endpoints to use API Key Authorization for clients and IAM authorization for a specific operation to invalidate connections and unsubscribe users. Clients must present a valid API key with each request to be authenticated. The API key is a hard-coded value in your application that is generated by the AppSync service in each endpoint.

AppSync supports a number of different authorization methods to suit different use cases. AWS Lambda authorization allows you to implement your own custom authorization logic using a Lambda function. In an Active/Active configuration as detailed above, Lambda could be configured as the default authorization mode for all AppSync endpoints, which would allow the same authorization token to be used in both regions.
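
A minimal sketch of such an authorizer is shown below, assuming a hypothetical in-memory token table; a real function would validate the token against an identity provider or a secret store instead. The event and response fields used here (authorizationToken, isAuthorized, resolverContext, deniedFields, ttlOverride) follow the AppSync Lambda authorizer contract, while the token values and roles are invented for the example.

```typescript
interface AuthorizerEvent {
  authorizationToken: string;
  requestContext: {
    apiId: string;
    queryString: string;
    operationName?: string;
  };
}

interface AuthorizerResponse {
  isAuthorized: boolean;
  resolverContext?: Record<string, string>;
  deniedFields?: string[];
  ttlOverride?: number;
}

// Hypothetical token table shared by the functions deployed in both regions.
const TOKENS: Record<string, "client" | "admin"> = {
  "demo-client-token": "client",
  "demo-admin-token": "admin",
};

// Core decision logic, kept synchronous so it is easy to test.
function authorizeRequest(event: AuthorizerEvent): AuthorizerResponse {
  const role = TOKENS[event.authorizationToken];
  if (!role) {
    return { isAuthorized: false };
  }
  return {
    isAuthorized: true,
    resolverContext: { role },
    // Regular clients may not invalidate other clients' subscriptions.
    deniedFields: role === "admin" ? [] : ["Mutation.unsubscribe"],
    ttlOverride: 300, // cache the decision for 300 seconds
  };
}

// In a Lambda function this would be exposed as:
//   export const handler = async (event: AuthorizerEvent) => authorizeRequest(event);
console.log(
  authorizeRequest({
    authorizationToken: "demo-client-token",
    requestContext: { apiId: "xxxxxxxxxxxx", queryString: "subscription { ... }" },
  }),
);
```

Deploying the same logic next to each regional API is what lets a client keep using one token regardless of which endpoint serves the request.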

Other authorization modes supported in AppSync include OpenID Connect (OIDC), Amazon Cognito User Pools, and AWS IAM authorization. They can be mixed and matched in the same API for specific operations, types or fields. For more information, refer to the AppSync documentation.

Deploying the Global API backend with CDK

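In order to deploy the solution, we need some tools. The commands in this section rely on Git, npm (which ships with Node.js), and the AWS CDK command line interface, plus an AWS account to deploy to.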

Clone the GitHub repository to your local computer:

$ git clone https://github.com/aws-samples/aws-global-pubsub-api

Change the working directory to:

$ cd aws-global-pubsub-api/cdk

Install the project dependencies:

$ npm install

Finally, deploy four CDK stacks to your AWS account with a single command. By default the regions used are Oregon (us-west-2) and Sydney (ap-southeast-2). After deployment, the outputs of the first two stacks display the GraphQL API endpoints, API IDs, and API keys. Take note of all the details as they are needed to set up clients later:

$ cdk deploy --all

Setting up a sample React client

Now that our global WebSocket API backend is deployed, we’re ready to configure different WebSocket clients to receive data. We leverage the AWS Amplify libraries to connect clients to a backend API in a given region. While in this article we use React as our client of choice to create a simple web app, Amplify libraries also support iOS, Android, and Flutter clients, providing the same capabilities in these different runtimes. The supported Amplify clients provide simple abstractions to interact with AppSync GraphQL API backends with few lines of code, including built-in real-time WebSocket capabilities fully compatible with the AppSync WebSocket real-time protocol out of the box.

We start by accessing the client folder in the same cloned repository we used previously, then install the required dependencies:

$ cd ../client
$ npm install 

Next go to the file src/App.js and update the AppSync API configuration details based on the output of the previous cdk deploy command. You can connect the client to your API of choice (Oregon or Sydney). You could also duplicate the client folder and have a different instance of each client connecting to a different API in order to test multi-region subscriptions.

//AppSync endpoint settings
const myAppConfig = {
  aws_appsync_graphqlEndpoint:
    "https://xxxxxxx.appsync-api.us-west-2.amazonaws.com/graphql",
  aws_appsync_region: "us-west-2", //or ap-southeast-2
  aws_appsync_authenticationType: "API_KEY",
  aws_appsync_apiKey: "da2-xxxxxxxxx",
};

While the code repository already has the necessary GraphQL operations defined, developers can use Amplify to automatically generate GraphQL client code if necessary. Notice that with the simple CDK code-first approach we use in the backend, coupled with Amplify’s code generation capabilities for the front-end, there’s no need to be familiar with GraphQL concepts like schemas or operations, as they are automatically generated by the development tools themselves. You don’t need to be a GraphQL expert to use global WebSockets and real-time with AppSync.

We’re ready to test with different clients. Back to the client folder, start the React app with the following command:

$ npm start

Open http://localhost:3000 in a browser and initialize the WebSocket connection by selecting the channel tech and sending any message. The React application is now listening for published messages.

We use cURL in different terminal windows with the following commands to publish messages, replacing API endpoints and API key details for each region accordingly:

$ curl 'https://xxxxxxxxxxx.appsync-api.us-west-2.amazonaws.com/graphql' \
-H 'content-type: application/json' \
-H 'x-api-key: da2-xxxxxxxxxxxx' \
--data-raw $'{"query":"mutation publish($message: String\u0021, $name: String\u0021) {\\n publish(message: $message, name: $name) {\\n message\\n name\\n }\\n}\\n","variables":{"name":"tech","message":"Hello from Oregon!"}}'


$ curl 'https://xxxxxxxxxxxxx.appsync-api.ap-southeast-2.amazonaws.com/graphql' \
-H 'content-type: application/json' \
-H 'x-api-key: da2-xxxxxxxxxxxxx' \
--data-raw $'{"query":"mutation publish($message: String\u0021, $name: String\u0021) {\\n publish(message: $message, name: $name) {\\n message\\n name\\n }\\n}\\n","variables":{"name":"tech","message":"Hello from Sydney!"}}'

The terminal clients publish a message to their respective API endpoints and we can confirm the React client subscribed to the channel via WebSockets receives all messages as expected.
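
The same publish request can also be prepared from TypeScript instead of cURL. The sketch below only builds the request, using the mutation text and headers from the commands above; the buildPublishRequest helper and the PublishRequest type are hypothetical names, and the endpoint and API key are the same placeholders used elsewhere in this article.

```typescript
interface PublishRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Same mutation the cURL commands send.
const PUBLISH_MUTATION = `mutation publish($message: String!, $name: String!) {
  publish(message: $message, name: $name) {
    message
    name
  }
}`;

// Build the HTTP request for publishing one message to one channel.
function buildPublishRequest(
  endpoint: string,
  apiKey: string,
  name: string,
  message: string,
): PublishRequest {
  return {
    url: endpoint,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
      },
      body: JSON.stringify({
        query: PUBLISH_MUTATION,
        variables: { name, message },
      }),
    },
  };
}

const request = buildPublishRequest(
  "https://xxxxxxxxxxx.appsync-api.us-west-2.amazonaws.com/graphql",
  "da2-xxxxxxxxxxxx",
  "tech",
  "Hello from Oregon!",
);
// In a runtime with a global fetch, the request could be sent with:
//   fetch(request.url, request.init)
console.log(request.init.body);
```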

In the web app you can define a channel in the text input that’s not in the drop-down menu with the five enabled channels, then try to send a message. You can confirm the message is not displayed in the message board as the backend filter in AppSync blocks it automatically.

Our global serverless and functionless pub/sub APIs are deployed, ready, and working as expected! If you prefer to visualize both backend (CDK/TypeScript) and frontend (Amplify/React) code in a repository, it’s all available on GitHub.

Conclusion

In this article we saw how organizations can scale their real-time pub/sub data workloads globally in the cloud with low latency for geographically dispersed users with only two AWS services. End users connect to the GraphQL API endpoint closest to them and have access to the same data replicated between AWS regions automatically with AppSync and EventBridge. This particular implementation is both serverless and functionless, as there’s no need to use Lambda functions. Serverless technologies feature automatic scaling, built-in high availability, and a pay-for-use billing model to increase agility and optimize costs. Developer teams can focus on their solutions and business problems as opposed to spending time on infrastructure management tasks like capacity provisioning and patching. You can find more information about serverless real-time data via WebSockets in the AppSync documentation.