
Building a Synchronization Endpoint in AWS Mobile Hub

This is the fifth part in a six-part series on synchronizing data within an Android mobile app to the AWS Cloud.

So far in this series, we have been building a notes app. We have built a local store (called a content provider) for the data and integrated Amazon Cognito into the standard Android Accounts page. Now it’s time to consider synchronization of data. Synchronization is considered in two parts: a backend component and a frontend component.

In the first post, we showed this diagram:

For the synchronization endpoint, we use Amazon DynamoDB for the data storage, and then provide two endpoints that the mobile app can consume:

  • POST /notes
  • GET /notes?updated=<number>

The POST /notes endpoint allows us to inject data into the DynamoDB table.  For this example, there are no sequencing issues.  I can update each record independently of all the others.  Sometimes, you need to code sequences and handle conflict resolution accordingly.  For example, if you produce a CRM application, you may want to add customers before adding the opportunities for those customers.  The GET /notes is the incremental sync endpoint.  It returns any record that was updated since the last synchronization.  Again, you may want to ensure proper sequencing in a more complex app – sync the customers before the opportunities.
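The notes app does not need sequencing, but for an app that does, the idea of syncing customers before opportunities can be sketched as a small dependency ordering. This is only an illustration: the entity names, the dependencies map, and the syncOrder() function are hypothetical and are not part of the notes app or of Mobile Hub.

```javascript
// Hypothetical entity types for a CRM-style app. Each type lists the
// types that must be synchronized before it.
var dependencies = {
    customers: [],
    opportunities: ['customers']
};

// Return the entity types in an order where every type appears after
// the types it depends on (a depth-first topological sort).
function syncOrder(deps) {
    var ordered = [];
    var visiting = new Set();

    function visit(type) {
        if (ordered.indexOf(type) !== -1) {
            return; // already placed in the order
        }
        if (visiting.has(type)) {
            throw new Error('Circular dependency at ' + type);
        }
        visiting.add(type);
        (deps[type] || []).forEach(function (dependency) {
            visit(dependency);
        });
        visiting.delete(type);
        ordered.push(type);
    }

    Object.keys(deps).forEach(visit);
    return ordered;
}

var order = syncOrder(dependencies); // ['customers', 'opportunities']
```

A client would then run its POST and GET calls for each entity type in that order.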

Create the database

As we mentioned when developing the content provider, there is a client-side model and a separate server-side model. The two are generally not the same.

It is straightforward to create an appropriate DynamoDB table using AWS Mobile Hub. You created an AWS Mobile Hub project when you set up Amazon Cognito in the last post. You can extend this project to add the table:

  • Sign in to the AWS Mobile Hub console and select your project.
  • Click NoSQL Database.
  • Click Enable NoSQL.
  • Click Add Table.
  • Click Example to start with an example schema.
  • Click Notes as the example schema to use.

The example Notes schema needs a couple of modifications because it lacks two fields: isDeleted and updated.  You can add these using the following procedure:

  • Click Add attribute.
  • Enter isDeleted as the Attribute name.
  • Select boolean as the Type.
  • Click Add attribute again.
  • Enter updated as the Attribute name.
  • Select number as the Type.

When you synchronize, you perform two searches:

  • On insert/update, you search for the noteId + userId combination.
  • On retrieval, you search for the updated + userId combination.

The noteId and userId combination is already covered by the table's primary key (userId is the partition key and noteId is the sort key).  To support the search on the updated field, add an index to the table:

  • Click Add index.
  • Enter DateUpdated as the index name.
  • Select userId as the Partition key.
  • Select updated as the Sort key.

You are now ready to create the table.  Click Create table at the bottom of the page, and then confirm the action.  It takes some time to create a table, but you can see the progress on the screen.
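For readers who want to see what the console is configuring, the schema corresponds roughly to the following CreateTable parameters for the AWS SDK for JavaScript. Treat this as a sketch under assumptions: the string types for userId and noteId, the choice of a global secondary index, the ALL projection, and the throughput numbers are guesses for illustration and are not values taken from Mobile Hub. The notesTableParams() name is also hypothetical.

```javascript
// Build CreateTable parameters that mirror the schema set up in the console.
// The throughput numbers are placeholders, not values taken from Mobile Hub.
function notesTableParams(tableName) {
    var throughput = { ReadCapacityUnits: 5, WriteCapacityUnits: 5 };
    return {
        TableName: tableName,
        // Only attributes that appear in a key schema are declared here.
        // isDeleted, title, and content are ordinary item attributes.
        AttributeDefinitions: [
            { AttributeName: 'userId', AttributeType: 'S' },  // assumed string
            { AttributeName: 'noteId', AttributeType: 'S' },  // assumed string
            { AttributeName: 'updated', AttributeType: 'N' }
        ],
        // Primary key: used for the noteId + userId lookup on insert/update
        KeySchema: [
            { AttributeName: 'userId', KeyType: 'HASH' },
            { AttributeName: 'noteId', KeyType: 'RANGE' }
        ],
        // Index: used for the updated + userId lookup on retrieval
        GlobalSecondaryIndexes: [{
            IndexName: 'DateUpdated',
            KeySchema: [
                { AttributeName: 'userId', KeyType: 'HASH' },
                { AttributeName: 'updated', KeyType: 'RANGE' }
            ],
            Projection: { ProjectionType: 'ALL' },
            ProvisionedThroughput: throughput
        }],
        ProvisionedThroughput: throughput
    };
}
```

You do not need to run this yourself, because Mobile Hub creates the table for you. It is shown only to make the two access patterns visible in one place.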

Create the external API

As a best practice, you should not directly expose your database to the internet.  Instead, you should think about the operations your mobile app performs on the backend, and then write secure APIs to perform those functions alone.  In this case, you want to ensure that the userId is the authenticated user before insertion.  This is done within an AWS Lambda function that is exposed to the internet by Amazon API Gateway.  API Gateway ensures that the request is signed (so it is less likely to be from a rogue mobile app).  All requests are required to be authenticated, which further reduces the risk.

To build the external API, return to your AWS Mobile Hub project.  Click Configure if you are not on the main feature selection page.  Then do the following:

  1. Click Cloud Logic.
  2. Click Create new API.
  3. Fill in the form:
    • Enter sync as the API name.
    • Select Restrict API access to signed-in users.
    • Replace /items with /notes (click the pencil icon next to /items, and then click the tick icon).
    • Enter notesSyncHandler as the Lambda Function Name.
  4. Click Create API.

It takes a couple of minutes to create the API.  After the API is created, the infrastructure part is done.  You now need to introduce some code to do the work in the backend.  To do that, click the Actions drop-down and select notesSyncHandler under EDIT BACKEND CODE.  This opens the AWS Lambda console so you can edit the code.

Cloud Logic installs an “echo” service by default that takes the event that is passed in and returns it as the content of the response.  You can go immediately to Test API and test the GET and POST methods to see what happens.  Note that even though you restricted the API to authenticated users, you can still test the API.  The requestContext.identity.cognitoIdentityId (which is populated with the Amazon Cognito identity ID of the authenticated user) is null when the API is tested from the console, but it is filled in with the identity ID of the user when a mobile app accesses the endpoint.  As an example, here is the response from doing a GET /notes/foo:

{
  "requestBody": null,
  "pathParams": "/notes/foo",
  "queryStringParams": null,
  "headerParams": null,
  "stage": "test-invoke-stage",
  "stageVariables": {
    "stage": "Development"
  },
  "cognitoIdentityId": null,
  "httpMethod": "GET",
  "sourceIp": "test-invoke-source-ip",
  "userAgent": "AWS Console Mobile Hub, aws-internal/3",
  "requestId": "test-invoke-request",
  "resourcePath": "/notes"
}
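Because cognitoIdentityId is null in console tests, the handlers later in this post fall back to a 'test-user' identity. A minimal sketch of that lookup as a standalone helper is shown here. The callerIdentity() name is hypothetical, and the extra guards against a missing requestContext are an addition for illustration.

```javascript
// Return the Cognito identity ID of the caller, or 'test-user' when the
// request comes from the console tester and no identity is present.
function callerIdentity(event) {
    var requestContext = (event && event.requestContext) || {};
    var identity = requestContext.identity || {};
    return identity.cognitoIdentityId || 'test-user';
}
```

In a production service you would probably reject requests that carry no identity instead of substituting a shared test user.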

You can (and should) handle each method separately, which means you need a router to handle the calling of different functions based on the HTTP method.  To assist with this, we moved the code for the handler into a function called baseHandler and wrote a new method that is invoked as the Lambda function:

'use strict';

exports.handler = function(event, context, callback) {
    console.log('event = ' + JSON.stringify(event));

    // Only deal with GET and POST
    var httpMethod = event.httpMethod.toUpperCase();
    if (httpMethod === 'GET') {
        return baseHandler(event, context, callback);
    } else if (httpMethod === 'POST') {
        return baseHandler(event, context, callback);
    } else {
        // Everything else is an error
        var response = {
            statusCode: 405,
            body: JSON.stringify({ "error": "Invalid HTTP Method" })
        };
        context.succeed(response);
    }
}

function baseHandler(event, context, callback) {
    console.log('Handling event in baseHandler');
    var responseCode = 200;
    
    // Rest of Lambda code here
}

After you edit the code, click Save, and then click Actions > Publish new version.  This last step makes your code active on the API.  It’s a good idea to familiarize yourself with some of the monitoring and debugging tools available.  When you test the API from within Cloud Logic, you are doing an HTTP call via the Amazon API Gateway control plane. This allows you to bypass a lot of the logic that controls the gateway for the API, such as authentication.  You can see the results in the window directly below the test button:

You can also see the headers and logs for the transaction.  However, you do not see the console log from AWS Lambda.  Click Edit function in Lambda console to enter the Lambda console.   Click Monitoring to see basic statistics for the Lambda execution.  Then click View logs in CloudWatch to see traces for individual executions.  You can see the console log within CloudWatch Logs.  This enables you to debug executions even when the execution is coming from a mobile phone that you do not have access to.

Let’s move on to the main problem – writing the API code to handle synchronization events. You can replace the calls to baseHandler() in your code with calls to specific handlers – postHandler() and getHandler(), for instance. Each handler can do the requisite work, returning an appropriate message. DynamoDB provides an asynchronous API you can use.
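One possible shape for that routing is a lookup table keyed by HTTP method. The sketch below uses placeholder handlers so that it runs on its own. The routes table and the route() function are hypothetical names, and the placeholder bodies stand in for the real DynamoDB work.

```javascript
// Placeholder handlers. The real versions do the DynamoDB work.
function getHandler(event, context, callback) {
    callback(null, { statusCode: 200, body: JSON.stringify({ handledBy: 'getHandler' }) });
}

function postHandler(event, context, callback) {
    callback(null, { statusCode: 200, body: JSON.stringify({ handledBy: 'postHandler' }) });
}

// Map each supported HTTP method to its handler
var routes = { GET: getHandler, POST: postHandler };

// Dispatch to the matching handler, or answer 405 for anything else.
// In the Lambda function you would assign this to exports.handler.
function route(event, context, callback) {
    var method = String(event.httpMethod || '').toUpperCase();
    if (!Object.prototype.hasOwnProperty.call(routes, method)) {
        return callback(null, {
            statusCode: 405,
            body: JSON.stringify({ error: 'Invalid HTTP Method' })
        });
    }
    return routes[method](event, context, callback);
}
```

Adding support for another method then only requires a new entry in the table.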

After the work is done (whatever it is), you set an appropriate response in the context and then call the callback method that is passed into the Lambda function. The code should work regardless of the Mobile Hub project it is put in. For this, you can take advantage of the standard environment variables provided by Mobile Hub:

  • MOBILE_HUB_PROJECT_REGION is the region where resources are deployed.
  • MOBILE_HUB_DYNAMIC_PREFIX is the prefix for generated resources, such as the DynamoDB table.
  • MOBILE_HUB_PROJECT_ID is the GUID for the Mobile Hub project.
  • MOBILE_HUB_PROJECT_NAME is the (non-unique) project name you provided.
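The handlers in this post build the table name by appending -Notes to the dynamic prefix. A small helper can keep that logic in one place and fail early when the variable is missing. This is a sketch: the resourceName() name and the optional env parameter (which makes the function easy to test) are hypothetical.

```javascript
// Build the name of a Mobile Hub generated resource, such as the Notes table.
// env defaults to process.env. Passing it in explicitly is only for testing.
function resourceName(suffix, env) {
    env = env || process.env;
    var prefix = env.MOBILE_HUB_DYNAMIC_PREFIX;
    if (!prefix) {
        throw new Error('MOBILE_HUB_DYNAMIC_PREFIX is not set');
    }
    return prefix + '-' + suffix;
}
```

Inside the Lambda function, resourceName('Notes') would produce the same value as the template string used in the handlers.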

First, create a DynamoDB database connection using the AWS SDK for JavaScript.  Add the following code at the top of your Lambda function:

'use strict';
var AWS = require('aws-sdk');

// Update the region based on the environment
AWS.config.update({ region: process.env.MOBILE_HUB_PROJECT_REGION });

// Construct the DynamoDB document Client
var dbClient = new AWS.DynamoDB.DocumentClient();

The postHandler() function becomes the following:

function postHandler(event, context, callback) {
    try {
        var identity = event.requestContext.identity.cognitoIdentityId || 'test-user';
        
        if (!event.hasOwnProperty("body") || event.body === null || event.body === '') {
            return errorResponse(context, 400, "Body Required");
        }
        var item = JSON.parse(event.body); // throws SyntaxError
        if (item === null || typeof item !== 'object') {
            return errorResponse(context, 400, "Object expected");
        }
        
        // A noteId is required; return here so that the put below never runs without one
        if (!item.hasOwnProperty('noteId')) {
            return errorResponse(context, 400, "Object must include noteId");
        }
        
        // Construct the new item based on the provided information
        var newItem = {
            noteId: item.noteId,
            updated: (new Date()).getTime(),
            title: item.title || '',
            content: item.content || '',
            isDeleted: item.isDeleted || false,
            userId: identity
        };
        
        dbClient.put({
            TableName: `${process.env.MOBILE_HUB_DYNAMIC_PREFIX}-Notes`,
            Item: newItem
        }, function (err, data) {
            if (err) {
                console.log(err);
                context.succeed({
                    statusCode: 400,
                    body: JSON.stringify({ message: err })
                });
                // Our callback always succeeds, even on errors
                callback(null, { message: err });
            } else {
                dbClient.get({
                    TableName: `${process.env.MOBILE_HUB_DYNAMIC_PREFIX}-Notes`,
                    Key: { noteId: newItem.noteId, userId: newItem.userId }
                }, function (getErr, getData) {
                    if (getErr) {
                        console.log(getErr);
                        context.succeed({
                            statusCode: 400,
                            body: JSON.stringify({ message: getErr })
                        });
                        callback(null, { message: getErr });
                    } else {
                        context.succeed({
                            statusCode: 200,
                            body: JSON.stringify(getData.Item)
                        });
                        callback(null, getData);
                    }
                });
            }
        });
    } catch (err) {
        // JSON.parse throws a SyntaxError when the body is malformed
        console.log(err);
        return errorResponse(context, 400, "Invalid Request");
    }
}

The main functionality here is the dbClient.put() call near the end.  This actually injects the new item (or updated item) into DynamoDB.  After you update the data, you fetch the newly updated record with a get operation and return that data.  You can read more about the DynamoDB DocumentClient in the online API documentation. Save and publish the Lambda function now, and then run the POST test from within Cloud Logic:

You should also:

  • Go to Resources and click the Notes table link.  Then click the Items tab.  Ensure your record is inserted into the database.
  • Go to the CloudWatch Logs console to see the logs that are generated as a result.
  • Play around with the Cloud Logic API tester and note what happens when there is an error in the data that you provide.  Usually, it generates a 400 response.
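The handlers call an errorResponse() helper that is not listed in this post. A minimal sketch of what such a helper might look like follows. The body shape, a JSON object with a message field, is an assumption based on the other error responses in the handlers.

```javascript
// Finish the invocation with an error status and a JSON message body.
// Assumed shape: the same { statusCode, body } object the handlers use.
function errorResponse(context, statusCode, message) {
    var response = {
        statusCode: statusCode,
        body: JSON.stringify({ message: message })
    };
    context.succeed(response);
    return response;
}
```

Returning the response object lets callers write return errorResponse(...) to stop processing at the point of failure.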

Note that the combination of noteId + userId must be unique.  This means that one user cannot overwrite the work of another user.  Also, because you restricted the API to users that are logged in, the API cannot be used by rogue actors who are trying to circumvent the security you put in place.  There may still be other security corner cases that can result in a malicious actor misusing the API after the user is authenticated.  However, the exposure is reduced by using a combination of API Gateway, Lambda, and DynamoDB for the server-side data handling.

The getHandler() method uses the updated query parameter to limit the data that is provided back to the client.  The query parameters are provided as an object on the Lambda event and you can use this to limit the search:

function getHandler(event, context, callback) {
    try {
        var identity = event.requestContext.identity.cognitoIdentityId || 'test-user';
        var tableName = `${process.env.MOBILE_HUB_DYNAMIC_PREFIX}-Notes`;

        // Decode the query params
        var queryParams = event.queryStringParameters || {};
        var lastUpdated = parseInt(queryParams.updated || '0', 10);
        if (isNaN(lastUpdated)) {
            return errorResponse(context, 400, "updated must be a number");
        }

        // Set up the search parameters
        var srch = {
            TableName: tableName,
            IndexName: 'DateUpdated',
            KeyConditionExpression: 'userId = :userId and updated > :updated',
            ExpressionAttributeValues: {
                ':userId': identity,
                ':updated': lastUpdated
            }
        };
        
        // Execute the search
        dbClient.query(srch, function (err, data) {
            if (err) {
                context.succeed({
                    statusCode: 400,
                    body: JSON.stringify({ message: err })
                });
                callback(null, { message: err });
            } else {
                context.succeed({
                    statusCode: 200,
                    body: JSON.stringify(data.Items)
                });
                callback(null, data);
            }
        });
    } catch (err) {
        console.log(`trapped error - err = ${err.stack || err}`);
        return errorResponse(context, 400, "Invalid Request");
    }
}

As before, save and publish the Lambda function, and then return to the API tester within the Mobile Hub Cloud Logic feature to test the API:

Insert several notes using POST, and then search for all of them by doing a GET with no query string.  Pick a time somewhere in the middle of the set, and then add the ?updated=value query string and verify that only some of the notes are returned.
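A single DynamoDB query call returns at most 1 MB of data and sets LastEvaluatedKey in the result when more items remain. With only a handful of test notes you will not hit that limit, but a user with many notes could. The following sketch shows one way to collect every page. The queryAll() name is hypothetical, and the client is passed in as a parameter so that the function works with the DocumentClient or with a test double.

```javascript
// Run a query and keep following LastEvaluatedKey until every page is read.
// client is any object with a DocumentClient-style query(params, callback).
function queryAll(client, params, done, collected) {
    collected = collected || [];
    client.query(params, function (err, data) {
        if (err) {
            return done(err);
        }
        collected = collected.concat(data.Items || []);
        if (!data.LastEvaluatedKey) {
            return done(null, collected); // no more pages
        }
        // Ask for the next page, starting where this one stopped
        var next = Object.assign({}, params, {
            ExclusiveStartKey: data.LastEvaluatedKey
        });
        queryAll(client, next, done, collected);
    });
}
```

In getHandler(), a call such as queryAll(dbClient, srch, ...) could stand in for dbClient.query(srch, ...), with the callback receiving the combined item list instead of a single page.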

Wrap up

The code we present here has a “last write wins” policy, which does not allow for conflict resolution.  More code would be necessary to write something that can detect conflicts and report those back to the client for handling within the UI. This is only one half of the synchronization code.  In the next post, we walk through the mobile client side of the synchronization code.
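As a starting point for that extra code, conflict detection can be reduced to a comparison of timestamps. The sketch below assumes something the current API does not do: that the client sends a hypothetical baseUpdated field containing the updated value of the version it last synchronized. The detectConflict() name is also hypothetical.

```javascript
// Decide whether an incoming write conflicts with the stored record.
// stored:   the item currently in the table, or null for a new note
// incoming: the item sent by the client, with a hypothetical baseUpdated field
function detectConflict(stored, incoming) {
    if (!stored) {
        return false; // nothing on the server to conflict with
    }
    var base = Number(incoming.baseUpdated) || 0;
    // The server copy changed after the client last synchronized it
    return stored.updated > base;
}
```

When a conflict is detected, the handler could return a 409 response with the stored item so that the mobile app can let the user decide which version to keep.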