
Enhancing Amazon DynamoDB single-table design with AWS AppSync access and security features

AWS AppSync is a fully managed service that lets developers deploy scalable GraphQL backends on AWS. Developers use AppSync to connect applications to data sources such as Amazon DynamoDB tables. AppSync’s flexibility lets you use new or existing tables, with either a single-table design or a multi-table approach. In a single-table design, your different item types are stored in one DynamoDB table. One advantage is that you can retrieve different item types in a single request. This can be very efficient, but it requires a good understanding of your access patterns and careful planning before you begin building your tables and GraphQL schema. In a multi-table approach, you store different item types in separate tables. Because the tables are not tightly coupled to your overall access patterns, you can begin data modeling faster. In either case, GraphQL makes it easier for your clients to access your data.

This article will show how you can enhance and protect data access while leveraging single-table design on your DynamoDB table. We show you how to leverage AWS AppSync features to implement security, improve performance, and incorporate different access patterns.

Getting started with AWS AppSync and DynamoDB with a single-table design is simple, and the patterns on Serverless Land give you a quick starting point.

Below, we explore a specific single-table design application and show how you can enhance it with AppSync features.

Use case

Let’s use a scenario to explain the use case. We implement a system to support a school offering classes to thousands of students. The school provides a website where administrators and instructors can pull up the registration list for specific classes. Students and guests can also use the site to view available classes. Students can sign up for classes that are not yet full. Overall, the website must allow the following access patterns:

  • Administrators and instructors can fetch class roster information.
  • Administrators can see students’ majors (area of focus).
  • Anonymous users can view the list of classes for a semester.
  • Logged-in students can register for classes.
  • Logged-in students can see the list of classes they have enrolled in.
  • Logged-in students should see real-time updates of their registration status.

The DynamoDB table

The DynamoDB table for this use case stores data on the classes available. It has information about the registrations for each class. The primary index allows retrieval of course and registration information by querying via a class ID. There are two global secondary indexes on the table. The first, “bySemester”, allows querying for all classes and registrations in a semester. The other, “byStudentID”, allows the querying of information for a specific student ID.

Figure: The DynamoDB table described above.

This table was designed in the NoSQL Workbench for Amazon DynamoDB. Recreate it by using the NoSQL Workbench JSON model at the end of this article.
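To make the key layout concrete, here is a minimal Python sketch that models a few items in memory and answers the three access paths (primary index, bySemester, byStudentID). The sample IDs, names, and the exact key formats on each row are assumptions chosen for illustration; only the attribute names and the `course#`, `class#`, and `student#` prefixes come from the templates in this article. The `query` helper is a hypothetical stand-in for a DynamoDB Query, not an AWS API.

```python
# Illustrative items only: IDs, names, and exact key formats are assumptions.
ITEMS = [
    {"PK": "course#c1", "SK": "course#c1", "typeWithID": "class#c1",
     "semester": "SPRING2021", "className": "Algebra"},
    {"PK": "course#c1", "SK": "course#c1#s2", "typeWithID": "student#s2",
     "semester": "SPRING2021", "studentName": "Ben", "major": "Physics"},
    {"PK": "course#c1", "SK": "course#c1#s1", "typeWithID": "student#s1",
     "semester": "SPRING2021", "studentName": "Ana", "major": "Math"},
    {"PK": "course#c2", "SK": "course#c2", "typeWithID": "class#c2",
     "semester": "SPRING2021", "className": "Biology"},
    {"PK": "course#c2", "SK": "course#c2#s1", "typeWithID": "student#s1",
     "semester": "SPRING2021", "studentName": "Ana", "major": "Math"},
]


def query(items, pk_attr, pk_value, sk_attr, sk_prefix):
    """Mimic a DynamoDB Query: exact match on the partition key,
    begins_with on the sort key, results ordered by sort key."""
    matches = [
        item for item in items
        if item.get(pk_attr) == pk_value
        and item.get(sk_attr, "").startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item[sk_attr])


# Primary index: one class followed by its registrations.
roster = query(ITEMS, "PK", "course#c1", "SK", "course#c1")
# bySemester index: only the class items for a semester.
classes = query(ITEMS, "semester", "SPRING2021", "typeWithID", "class#")
# byStudentID index: every course a given student is registered in.
enrolled = query(ITEMS, "typeWithID", "student#s1", "PK", "course#")

print([item["SK"] for item in roster])
print([item["typeWithID"] for item in classes])
print([item["PK"] for item in enrolled])
```

Because all item types share one table, each access path is a single query against either the table or one of its two indexes.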

AppSync and the GraphQL Schema

To get started, we set up a new AWS AppSync GraphQL API by using an API key as the default authorization mode. We create a DynamoDB data source called registrations by using the existing Amazon DynamoDB table, and then create this schema.

type Class {
    id: ID!
    name: String!
    registrations: RegistrationConnection
}

type Registration {
    id: ID!
    name: String
    major: String
}

type RegistrationConnection {
    items: [Registration!]
    nextToken: String
}

type Query {
    getRegistrations: [Registration]
    allClasses(semester: String!): [Class]
    getClass(id: ID!, limit: Int): Class
    getMoreRegistrations(courseId: ID!, nextToken: String!, limit: Int): RegistrationConnection
}

Fetching classes for a semester

To fetch a list of classes for a specific semester, we use the allClasses query. In the request mapping template for allClasses, we define a query operation on the bySemester index. We use the semester argument as the partition key value, and narrow the query to items whose typeWithID (the sort key) starts with class#. This ensures that the query only returns classes in the response.

{
    "version": "2017-02-28",
    "operation": "Query",
    "index": "bySemester",
    "query" : {
        "expression" : "#k = :k and begins_with(#s, :s)",
        "expressionNames" : {
            "#k" : "semester",
            "#s" : "typeWithID"
        },
        "expressionValues" : {
            ":k" : $util.dynamodb.toDynamoDBJson($ctx.args.semester),
            ":s" : $util.dynamodb.toDynamoDBJson("class#")
        }
    }
}

The response mapping template simply passes the data back.

## Pass back the result from DynamoDB. **
$util.toJson($ctx.result.items)

Because the list of classes for a semester is fairly static, there is no need to always retrieve the list from the DynamoDB table. Instead, we can activate per-resolver caching on the AppSync GraphQL API, and then enable caching on the allClasses resolver.

In the console, we enable per-resolver caching on the API:

Figure: In the console, under API Cache, the caching behavior "Per-resolver caching" is selected.

We can then utilize the CLI to enable caching with a TTL of one hour (3600 seconds), and specify caching keys on the resolver. Given this configuration in a file resolver-update.json:

{
    "apiId": "<API_ID>",
    "typeName": "Query",
    "fieldName": "allClasses",
    "dataSourceName": "registrations",
    "requestMappingTemplate": "<TEMPLATE>",
    "responseMappingTemplate": "<TEMPLATE>",
    "kind": "UNIT",
    "cachingConfig": {
        "ttl": 3600,
        "cachingKeys": [ "$context.arguments.semester" ]
    }
}

We can update the resolver by using the AWS CLI and the update-resolver operation:

aws appsync update-resolver --cli-input-json file://resolver-update.json
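To illustrate what a TTL plus a caching key means in practice, the following Python sketch models a toy per-resolver cache. It is not how AppSync implements its cache; the `ResolverCache` class, the fake clock, and the sample return values are all hypothetical. It only shows the behavior described above: responses are stored per caching key (here, the semester argument) and the data source is only consulted again once the TTL has elapsed.

```python
import time


class ResolverCache:
    """Toy per-resolver cache: one entry per caching key, expiring after a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # caching key -> (stored_at, value)

    def resolve(self, caching_key, fetch):
        now = self.clock()
        entry = self._entries.get(caching_key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # cache hit: the data source is not called
        value = fetch()      # cache miss or expired entry
        self._entries[caching_key] = (now, value)
        return value


# Usage with a fake clock so the example does not need to sleep.
fake_now = [0.0]
calls = []


def fetch_spring():
    calls.append("SPRING2021")  # stands in for a DynamoDB query
    return ["class#c1", "class#c2"]


cache = ResolverCache(ttl_seconds=3600, clock=lambda: fake_now[0])
cache.resolve("SPRING2021", fetch_spring)  # miss: fetches from the data source
cache.resolve("SPRING2021", fetch_spring)  # hit: served from the cache
fake_now[0] = 3601.0
cache.resolve("SPRING2021", fetch_spring)  # entry expired: fetches again
print(len(calls))
```

Choosing the semester as the caching key means requests for different semesters never share a cached response.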

Then, run a quick test on the API to evaluate the performance. Use vegeta to query the API at 50 requests per second (vegeta's default rate) for 60 seconds.

API_URL="<API_URL>"
API_KEY="<API_KEY>"
echo "$API_URL"
jq -ncM '{method: "POST", url: "'$API_URL'", body: {query: "query vegeta { allClasses(semester: \"SPRING2021\") { id name } }"} | @base64, header: {"content-type": ["application/json"], "x-api-key": ["'$API_KEY'"]}}' | vegeta attack -format=json -duration=60s | tee report.bin | vegeta report
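The jq one-liner is dense, so here is a small Python sketch that builds an equivalent target line for vegeta's JSON format, in which the request body is base64-encoded and each header maps to a list of values. The `vegeta_target` helper, the URL, and the key value are placeholders introduced for illustration.

```python
import base64
import json


def vegeta_target(api_url, api_key, graphql_query):
    """Build one JSON target line for `vegeta attack -format=json`."""
    body = json.dumps({"query": graphql_query}).encode("utf-8")
    target = {
        "method": "POST",
        "url": api_url,
        # vegeta's JSON target format expects the body as base64 text.
        "body": base64.b64encode(body).decode("ascii"),
        "header": {
            "content-type": ["application/json"],
            "x-api-key": [api_key],
        },
    }
    return json.dumps(target)


# Placeholder URL and key; substitute your own API values.
line = vegeta_target(
    "https://example.invalid/graphql",
    "example-key",
    'query vegeta { allClasses(semester: "SPRING2021") { id name } }',
)
print(line)
```

The printed line can be piped into vegeta in place of the jq output.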

Running the test with and without caching shows a clear difference in latency.

Figure: Without caching, the median latency is about 18 milliseconds; with caching, it is about 4 milliseconds.

You can see in AWS X-Ray that the queries are served from the AppSync cache and do not hit the DynamoDB table:

Figure: X-Ray trace from the client to AWS:AppSyncGraphQLAPI (4 ms, 1 request) and on to the AppSync cache (3 ms, 1 request).

Fetching a class roster and managing pagination

Fetch information about a class along with the student roster by using the getClass query. The query accepts a class ID and an optional limit to restrict how many registrations are returned in a single request. This is useful for limiting the amount of data returned at once to the client, since there can be many registrations.

The single-table design lets us fetch a class and its roster with a single query on the partition key that identifies the class, course#<ID>. Because the sort keys have the formats course#<ID> (the class item) and course#<ID>#<studentID> (a registration), the items are always returned with the class first, followed by the list of registrations in the class.

Figure: The table items for a class and its registrations, as described above.

Because we always expect to get one class and zero or more registrations, we set the actual query limit sent to DynamoDB to one more than is specified in the query:

#set( $limit = $util.defaultIfNull($ctx.args.limit, 10) + 1)

{
    "version": "2017-02-28",
    "operation": "Query",
    "query": {
      "expression": "#PK = :PK and begins_with(#SK, :SK)",
      "expressionNames": {
        "#PK": "PK",
        "#SK": "SK"
      },
      "expressionValues": {
        ":PK": $util.dynamodb.toDynamoDBJson("course#${context.args.id}"),
        ":SK": $util.dynamodb.toDynamoDBJson("course#${context.args.id}")
      }
    },
    "limit" : $limit
}

Since we know the order in which the items are returned, the response template does this:

## The first item is the class; attach the remaining items as its registrations. **
#set( $class = $ctx.result.items.get(0) )
$util.qr($class.put("registrations", {
  "items": $util.list.copyAndRemoveAll($ctx.result.items, [$class]),
  "nextToken": $ctx.result.nextToken
}))
$util.toJson($class)
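For readers less familiar with VTL, the response template above can be mirrored in a few lines of Python. This is a sketch for illustration only: `shape_class_response` is a hypothetical helper, and the sample result is made-up data in the shape that a DynamoDB query result takes in the resolver context.

```python
def shape_class_response(result):
    """Mirror the response template: the first item is the class,
    every other item is a registration for that class."""
    items = result["items"]
    course = dict(items[0])
    course["registrations"] = {
        # Same idea as copyAndRemoveAll: drop the class item, keep the rest.
        "items": [item for item in items if item != items[0]],
        "nextToken": result.get("nextToken"),
    }
    return course


# A caller asked for limit=2, so the request template queried for 3 items.
sample_result = {
    "items": [
        {"id": "c1", "name": "Algebra"},
        {"id": "s1", "name": "Ana"},
        {"id": "s2", "name": "Ben"},
    ],
    "nextToken": "opaque-token",
}
shaped = shape_class_response(sample_result)
print(shaped["name"], len(shaped["registrations"]["items"]))
```

Requesting one extra item keeps the class item from using up one of the registration slots the client asked for.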

The utility $util.list.copyAndRemoveAll removes the class (the first item) from the list and returns the registrations. (You could also sort the list easily by using the $util.list.sortList utility.) The query also returns a nextToken, which we can pass to getMoreRegistrations to page through the remaining registrations. Its request template is the same as above, except that no class is expected to be returned, so the limit is not increased by one:

#set( $limit = $util.defaultIfNull($ctx.args.limit, 10))

{
    "version": "2017-02-28",
    "operation": "Query",
    "query": {
      "expression": "#PK = :PK and begins_with(#SK, :SK)",
      "expressionNames": {
        "#PK": "PK",
        "#SK": "SK"
      },
      "expressionValues": {
        ":PK": $util.dynamodb.toDynamoDBJson("course#${context.args.courseId}"),
        ":SK": $util.dynamodb.toDynamoDBJson("course#${context.args.courseId}")
      }
    },
    "limit" : $limit,
    "nextToken": $util.toJson($util.defaultIfNull($ctx.args.nextToken, null))
}

The response template returns the next set of items and the next token:

$util.toJson($ctx.result)
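From the client's point of view, the two queries combine into a simple paging loop: call getClass once, then call getMoreRegistrations with the returned token until no token comes back. The Python sketch below simulates that loop against an in-memory list. The `query_page` function is a hypothetical stand-in for the resolvers, and its token is just an offset, whereas real tokens are opaque strings that should be passed back unchanged.

```python
def query_page(sorted_items, limit, next_token=None):
    """Stand-in for one DynamoDB Query page. The token is a plain offset
    here; real tokens are opaque and must not be interpreted."""
    start = int(next_token) if next_token else 0
    page = sorted_items[start:start + limit]
    end = start + len(page)
    token = str(end) if end < len(sorted_items) else None
    return {"items": page, "nextToken": token}


def fetch_full_roster(sorted_items, limit):
    # getClass: ask for limit + 1 so the class item does not use up a slot.
    first = query_page(sorted_items, limit + 1)
    course, roster = first["items"][0], first["items"][1:]
    token = first["nextToken"]
    # getMoreRegistrations: pass the token back until none is returned.
    while token:
        page = query_page(sorted_items, limit, token)
        roster.extend(page["items"])
        token = page["nextToken"]
    return course, roster


# One class item followed by five registrations, already in sort-key order.
items = ["class", "reg1", "reg2", "reg3", "reg4", "reg5"]
course, roster = fetch_full_roster(items, limit=2)
print(course, roster)
```

In a real client you would usually fetch the next page on demand (for example, when the user scrolls) rather than draining every page up front.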

Addressing security concerns

The API currently uses the API_KEY authorization mode. Any site or client with the key can access all of the data through the API. While we want the class list to remain publicly available, the class roster should be available only to administrators and instructors. We can add AMAZON_COGNITO_USER_POOLS as an additional authorization mode to allow signed-in Cognito User Pool users to access that data, and then set specific authorization rules on the schema to limit access.

type Class @aws_api_key @aws_cognito_user_pools {
    id: ID!
    name: String!
    registrations: RegistrationConnection
}

type Registration @aws_cognito_user_pools {
    id: ID!
    name: String
    major: String
}

type RegistrationConnection @aws_cognito_user_pools {
    items: [Registration!]
    nextToken: String
}

type Query @aws_api_key @aws_cognito_user_pools {
    getRegistrations: [Registration]
    allClasses(semester: String!): [Class]
    getClass(id: ID!, limit: Int): Class
        @aws_cognito_user_pools(cognito_groups: ["admins","instructors"])
    getMoreRegistrations(courseId: ID!, nextToken: String!, limit: Int): RegistrationConnection
        @aws_cognito_user_pools(cognito_groups: ["admins","instructors"])
}

Now, while everybody can see the list of classes, only signed-in Cognito users who belong to the admins or instructors group can retrieve class information. We can keep adding granularity at the field level. One requirement is to only allow administrators to see students’ majors (area of focus). We ensure that only admins can do this by specifying the following:

type Registration @aws_cognito_user_pools {
  id: ID!
  name: String
  major: String @aws_cognito_user_pools(cognito_groups: ["admins"])
}
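As a rough mental model of the group rules above, the following Python sketch checks whether a caller's Cognito groups allow access to a field. It is a simplification for illustration, not how AppSync evaluates directives internally: `FIELD_RULES` and `can_read` are hypothetical names, rules are keyed by field name only, and the type-level @aws_api_key and @aws_cognito_user_pools modes are not modeled.

```python
# Group rules taken from the cognito_groups arguments in the schema above.
FIELD_RULES = {
    "getClass": {"admins", "instructors"},
    "getMoreRegistrations": {"admins", "instructors"},
    "major": {"admins"},
}


def can_read(field_name, caller_groups):
    """Return True when the field has no group rule, or when the caller
    belongs to at least one of the allowed groups."""
    allowed = FIELD_RULES.get(field_name)
    if allowed is None:
        return True
    return not allowed.isdisjoint(caller_groups)


print(can_read("getClass", ["instructors"]))  # instructors may fetch a roster
print(can_read("major", ["instructors"]))     # but may not read majors
print(can_read("major", ["admins"]))
print(can_read("allClasses", []))             # no group rule on this field
```

The point of the field-level rule is that an instructor can still load a roster while the major field alone is withheld.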

Allowing students to view their own registration

Students should be able to see their own registrations. This can easily be done by leveraging the byStudentID global secondary index. First, we update the schema to only allow users in the students group to call getRegistrations, and we add the students group to the major field so that students can read it on their own registrations:

type Class @aws_api_key @aws_cognito_user_pools {
    id: ID!
    name: String!
    registrations: RegistrationConnection
}

type Registration @aws_cognito_user_pools {
    id: ID!
    name: String
    major: String
        @aws_cognito_user_pools(cognito_groups: ["admins","students"])
}

type RegistrationConnection @aws_cognito_user_pools {
    items: [Registration!]
    nextToken: String
}

type Query @aws_api_key @aws_cognito_user_pools {
    getRegistrations: [Registration]
        @aws_cognito_user_pools(cognito_groups: ["students"])
    allClasses(semester: String!): [Class]
    getClass(id: ID!, limit: Int): Class
        @aws_cognito_user_pools(cognito_groups: ["admins","instructors"])
    getMoreRegistrations(courseId: ID!, nextToken: String!, limit: Int): RegistrationConnection
        @aws_cognito_user_pools(cognito_groups: ["admins","instructors"])
}

To ensure that a student only retrieves their own registrations, we use the identity information ($ctx.identity.sub) that AppSync provides from the verified JWT the student used to authorize the request. The student ID is never taken from client-supplied arguments.

{
    "version": "2017-02-28",
    "operation": "Query",
    "index": "byStudentID",
    "query" : {
        "expression" : "#PK = :PK and begins_with(#SK, :SK)",
        "expressionNames" : {
            "#PK" : "typeWithID",
            "#SK" : "PK"
        },
        "expressionValues" : {
            ":PK" : $util.dynamodb.toDynamoDBJson("student#$ctx.identity.sub"),
            ":SK" : $util.dynamodb.toDynamoDBJson("course#")
        }
    }
}

We return the list of results directly in the response mapping template:

$util.toJson($ctx.result.items)

Furthermore, we can activate caching on this resolver with a $ctx.identity.sub caching key, and choose an appropriate TTL (e.g., 10 minutes instead of one hour).

Note that in this example we utilize AMAZON_COGNITO_USER_POOLS. If you use a different OIDC identity provider, you can utilize the OPENID_CONNECT authorization mode and the @aws_oidc directive to configure access. With OPENID_CONNECT, while you cannot directly specify which group to allow in the directive, you can access the token’s claims in your mapping templates and deny access as needed.

Update information and signing up for classes

We can implement a mutation on the DynamoDB data source to update information in the table. We can choose between UpdateItem, PutItem, or even TransactWriteItems if we must make multiple writes in a single transaction. See the reference for DynamoDB in the AppSync documentation for available options. However, this type of system may need to complete complex business logic with external systems before committing a write to the DynamoDB table. AppSync makes it easy to connect to microservices by integrating directly with serverless services on AWS. Get started with these implementations by visiting the serverless patterns for Direct Lambda Resolver with AWS Lambda, Amazon EventBridge, AWS Step Functions, Amazon SNS, and Amazon SQS.

A student can register by using the register mutation, and an external system can provide a registration update by using updateRegistration. Updating a registration is limited to internal services that utilize IAM permissions by using the @aws_iam directive.

enum RegistrationEnum {
    IN_PROGRESS
    DONE
}

type RegistrationStatus {
    status: RegistrationEnum
}

type Mutation {
  register(classId: ID!): RegistrationStatus @aws_cognito_user_pools(cognito_groups: ["students"])
  updateRegistration(classID: ID!, studentID: ID!): Registration @aws_iam
}

Providing real-time updates

Finally, students registering for classes will want to see updates in real time on the website. To do this, we implement a subscription that subscribes to data changes from the updateRegistration mutation:

type Subscription {
  onUpdate(classID: ID!, studentID: ID!): Registration @aws_subscribe(mutations: ["updateRegistration"])
}

We add a check to the resolver mapping template that only allows a student to listen for registration changes on their own student ID. If a client provides a student ID that is not equal to the requester’s identity, then the subscription request is denied.

#if( $ctx.identity.sub != $ctx.args.studentID )
  $util.unauthorized()
#end
$util.toJson({"version":"2018-05-29","payload":{}})

The response mapping template is simply:

$util.toJson(null)

This real-time solution means that the website does not have to poll for changes, which would put additional load on the DynamoDB table.

Conclusion

AWS AppSync is a flexible data API that lets you connect to new or existing data sources in the cloud. It provides a fully compliant GraphQL API that makes it easier for developers to build applications and interact comfortably with the data they need. With AppSync, you are not limited to interacting with data in a single way. This article showed that you can leverage DynamoDB single-table design, and even enhance it to scale your application and improve your security posture. To discover more about the ways that AppSync can connect to other services on AWS, visit the AppSync Patterns on Serverless Land.


NoSQL Workbench JSON model

{
  "ModelName": "SchoolRegistrations-Import",
  "ModelMetadata": {
    "Description": "school registrations example",
    "AWSService": "Amazon DynamoDB",
    "Version": "3.0"
  },
  "DataModel": [
    {
      "TableName": "Registrations",
      "KeyAttributes": {
        "PartitionKey": { "AttributeName": "PK", "AttributeType": "S" },
        "SortKey": { "AttributeName": "SK", "AttributeType": "S" }
      },
      "NonKeyAttributes": [
        { "AttributeName": "semester", "AttributeType": "S" },
        { "AttributeName": "className", "AttributeType": "S" },
        { "AttributeName": "id", "AttributeType": "S" },
        { "AttributeName": "typeWithID", "AttributeType": "S" },
        { "AttributeName": "studentID", "AttributeType": "S" },
        { "AttributeName": "studentName", "AttributeType": "S" },
        { "AttributeName": "major", "AttributeType": "S" }
      ],
      "GlobalSecondaryIndexes": [
        {
          "IndexName": "bySemester",
          "KeyAttributes": {
            "PartitionKey": { "AttributeName": "semester", "AttributeType": "S" },
            "SortKey": { "AttributeName": "typeWithID", "AttributeType": "S" }
          },
          "Projection": { "ProjectionType": "ALL" }
        },
        {
          "IndexName": "byStudentID",
          "KeyAttributes": {
            "PartitionKey": { "AttributeName": "typeWithID", "AttributeType": "S" },
            "SortKey": { "AttributeName": "PK", "AttributeType": "S" }
          },
          "Projection": { "ProjectionType": "ALL" }
        }
      ],
      "TableData": [],
      "BillingMode": "PROVISIONED",
      "ProvisionedCapacitySettings": {
        "ProvisionedThroughput": { "ReadCapacityUnits": 5, "WriteCapacityUnits": 5 }
      }
    }
  ]
}