An Introduction to the Sync Framework for Android

This is the first part of a six-part series on synchronizing data within an Android mobile app to the AWS Cloud.

There are common themes for mobile developers to be aware of.  Your app should be well designed, for instance.  Even enterprise apps should have an intuitive design so that the user doesn’t need a manual to use the app.  There are two other important themes.  First, your app must be responsive, which means users should not have to wait for data, even when they are not on the network.  Second, your app should optimize for the “3 B’s” – battery, bandwidth, and bytes.  Your app should not significantly drain the battery, it should conserve bandwidth, and it should occupy as little space on the device as possible.

Android has a built-in solution for working with cloud-based data under these constraints: content providers combined with the sync framework. In this post, we introduce the concepts. Later posts cover the component design and how to implement the components to use AWS resources.

Content providers

We use a small notes app as an example. The app should be personalized (i.e., only I can see my notes) and the data should be stored in the cloud. You could create a service within your app and access the data across the wire from an Amazon DynamoDB database. However, this has several problems.

  1. The user may be out of network coverage, or in a coverage area where they need to turn off data (for example, when roaming internationally).
  2. The link may be slow or have significant packet loss that causes disconnections.
  3. You may want to provide the data to other packages.  For example, you may want to provide a Notes widget on the home screen, or be able to copy and paste the data to another application.
  4. You want to use the sync framework to provide offline synchronization through the standard Sync Manager on Android.

Three of these four items are relevant to almost all data-driven applications.  In the Android SDK, a ContentProvider provides a useful asynchronous interface between your data and your application. It also provides an interface between your data and other subsystems, such as widgets, other applications, and the sync framework.

Implementing a ContentProvider requires three classes within your application:

  1. A model of the data
  2. A contract between your application and the content provider
  3. The ContentProvider implementation

After you implement the ContentProvider, you can register it with your application using the manifest. You then use the loaders framework to asynchronously load data from the underlying data source.

A typical application looks like this:

The ContentResolver is part of the Android framework, and the DatabaseHelper is typically a thin subclass of the framework's SQLiteOpenHelper, so most of this code is written for you, although there is some work to wire them into your application. Android includes a SQLite implementation in the core OS, so you don’t need to include that. You do not have to use SQLite, however. The ContentProvider specifies which data storage is used, so you can use a NoSQL store like Couchbase Lite instead of the SQLite database. The ContentProvider can also access files (such as photos, audio recordings, or video clips) on the internal storage of the device and return those files. Android has built-in providers for several common scenarios, including calendar, contacts, media, and telephony (SMS and MMS) messages.

Finding your data

Given that one of the advantages of content providers is that other applications can use the data, you need to provide a mechanism by which the data can be found.  Your application queries the content resolver for a specific URI.  This will be something of the form:

content://your.provider/table-name

You can also access a specific entity within your table by appending the entity ID to the end:

content://your.provider/table-name/id

The content resolver exposes methods (query, insert, update, and delete) that correspond to the equivalent methods in the content provider for accessing and mutating the underlying data. When SQLite is the backing store, the content provider is responsible for converting those calls into SQL queries to be executed against the SQLite database.
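To make the two URI shapes concrete, here is a minimal sketch in plain Java (no Android dependencies) that splits a content URI into its authority, table name, and optional row ID. On a device you would normally rely on the framework's UriMatcher and ContentUris classes instead. The class name ContentUriParts, and the authority and table used in the usage example, are hypothetical names chosen for illustration.

```java
import java.net.URI;

// Hypothetical helper (not an Android class): breaks a content:// URI into parts.
final class ContentUriParts {
    final String authority;
    final String table;
    final Long id; // null when the URI addresses the whole table

    private ContentUriParts(String authority, String table, Long id) {
        this.authority = authority;
        this.table = table;
        this.id = id;
    }

    static ContentUriParts parse(String uri) {
        URI parsed = URI.create(uri);
        if (!"content".equals(parsed.getScheme())) {
            throw new IllegalArgumentException("Not a content URI: " + uri);
        }
        String path = parsed.getPath() == null ? "" : parsed.getPath();
        // Drop leading and trailing slashes before splitting into segments.
        String trimmed = path.replaceAll("^/+|/+$", "");
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("Missing table name: " + uri);
        }
        String[] segments = trimmed.split("/");
        if (segments.length == 1) {
            return new ContentUriParts(parsed.getAuthority(), segments[0], null);
        }
        if (segments.length == 2) {
            // Assumes numeric row IDs, as in content://your.provider/table-name/id.
            return new ContentUriParts(parsed.getAuthority(), segments[0],
                    Long.parseLong(segments[1]));
        }
        throw new IllegalArgumentException("Unexpected path: " + uri);
    }
}
```

For example, ContentUriParts.parse("content://com.example.notes.provider/notes/42") yields the authority com.example.notes.provider, the table notes, and the ID 42, while the same URI without the trailing segment yields a null ID.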

Synchronizing your data

Now that you have produced a ContentProvider, you can synchronize the data to the cloud. This involves getting to know the Android sync framework. From your Android device, select Settings > Cloud and Accounts > Accounts (the exact menu path varies by device and Android version). Note that some accounts have sync settings. You can turn off the sync settings or adjust them.

As with the content provider, you could produce your own sync service that only runs when your application is running.  However, integrating with the Android sync framework has some benefits:

  • The system schedules synchronization to run when other sync operations run or when the network is active already, which makes the process more battery efficient.
  • The user is in charge of synchronization because your app follows the same pattern as other apps, which means the user doesn’t need to learn a separate pattern.
  • There is an automatic mechanism for retries, which includes timeouts and exponential backoffs.

You need to write the sync logic, which isn’t straightforward.  Even for basic synchronization tasks, your data model must be “sync ready”.  You must have:

  • A globally unique ID (auto-incrementing integers are not suitable in sync scenarios)
  • A “Last Updated” time stamp (so you can implement incremental synchronization)
  • A checksum (so you can implement conflict resolution)
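As an illustration of these three requirements, the following plain-Java sketch shows one possible shape for a sync-ready note. The class name SyncReadyNote, the title and content fields, and the choice of SHA-256 for the checksum are assumptions made for this example; the requirements above do not prescribe a particular ID scheme or hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical model class: one possible "sync ready" record shape.
final class SyncReadyNote {
    final String id;        // globally unique, generated on the client
    final String title;
    final String content;
    final long lastUpdated; // epoch milliseconds of the last change
    final String checksum;  // hex SHA-256 over the user-visible fields

    SyncReadyNote(String id, String title, String content, long lastUpdated) {
        this.id = id;
        this.title = title;
        this.content = content;
        this.lastUpdated = lastUpdated;
        this.checksum = checksumOf(title, content);
    }

    // A random UUID avoids the ID collisions that auto-incrementing integers
    // would cause when two devices create records while offline.
    static SyncReadyNote create(String title, String content, long now) {
        return new SyncReadyNote(UUID.randomUUID().toString(), title, content, now);
    }

    static String checksumOf(String title, String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            digest.update(title.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) 0); // separator so ("ab","c") differs from ("a","bc")
            digest.update(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is unavailable", e);
        }
    }
}
```

Two copies of a record with the same checksum hold the same title and content, so comparing checksums gives the sync logic a cheap way to tell a real conflict from a harmless duplicate update.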

Because you are writing the synchronization logic, you have significant flexibility when you implement the process. The model on the server and the client does not need to be the same. You can decide to store data locally that is not synchronized to the cloud.  You can similarly ignore certain data that is in the cloud model so that it is not stored locally.  You can also implement read-only data (data that is always loaded from the cloud, overwriting what is in the client).

There are some things to think about:

  1. How are you going to recognize new and updated records that need to be uploaded to your cloud data source?  This can be a modifiedOnClient Boolean field that is set to true when you create or update records, or it can be an operations queue that stores updated records.
  2. How are you going to manage records that are deleted on the server?  This can be as simple as a deleted Boolean field that is set on deletion.  You can then ignore deleted records in your queries from the server and delete the records automatically when downloading data from the server.
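The two flags can be sketched with a small in-memory store, shown here in plain Java rather than as a real ContentProvider. The class and method names (TrackedNote, LocalNoteStore, saveLocally, applyServerRecord) are hypothetical; only the modifiedOnClient and deleted fields come from the discussion above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical local row carrying the two bookkeeping flags.
final class TrackedNote {
    final String id;
    final String content;
    final boolean modifiedOnClient; // true until the change has been uploaded
    final boolean deleted;          // set by the server when the record is removed

    TrackedNote(String id, String content, boolean modifiedOnClient, boolean deleted) {
        this.id = id;
        this.content = content;
        this.modifiedOnClient = modifiedOnClient;
        this.deleted = deleted;
    }
}

// In-memory stand-in for the local database behind the content provider.
final class LocalNoteStore {
    private final Map<String, TrackedNote> rows = new LinkedHashMap<>();

    // Every local create or update marks the row as needing upload.
    void saveLocally(String id, String content) {
        rows.put(id, new TrackedNote(id, content, true, false));
    }

    // The rows the next sync has to send to the cloud.
    List<TrackedNote> pendingUploads() {
        List<TrackedNote> pending = new ArrayList<>();
        for (TrackedNote row : rows.values()) {
            if (row.modifiedOnClient) {
                pending.add(row);
            }
        }
        return pending;
    }

    // A record from the server either replaces the local copy (clearing the
    // upload flag) or, if flagged deleted, removes the local copy.
    void applyServerRecord(TrackedNote fromServer) {
        if (fromServer.deleted) {
            rows.remove(fromServer.id);
        } else {
            rows.put(fromServer.id,
                    new TrackedNote(fromServer.id, fromServer.content, false, false));
        }
    }

    boolean contains(String id) {
        return rows.containsKey(id);
    }
}
```

An operations queue would replace pendingUploads with an explicit log of changes, which preserves the order of edits at the cost of more bookkeeping.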

When you add synchronization, the architecture of the app is significantly changed:

When a synchronization happens, the sync adapter communicates with the content provider and the internet service that contains the copy of the data. The sync adapter updates the local database appropriately.  It uses the authenticator service to get access to those resources.  If, for instance, this is the LinkedIn sync adapter (which is installed when you install the LinkedIn app), then the authenticator service authenticates against LinkedIn and provides the appropriate credentials to the backend service.

The synchronization process

Synchronization is generally a multi-step process.  Assuming you have set up the model and content provider as suggested, here is a recommended process:

  1. Send all new and updated records to the server.
    1. Search the ContentProvider for all records with modifiedOnClient = true.
    2. Send those records as a JSON array to the mutation endpoint of the service.
    3. The mutation endpoint processes each record and responds with either the new record (with lastUpdated = latest server-side time stamp and potentially other fields updated) or an error (either a constraint violation or a conflict).
    4. Store the updated records back to the ContentProvider.
    5. Deal with the conflicts in the UI.
  2. Download the records that changed since the stored lastUpdated time stamp from the query endpoint of the service and store them in the ContentProvider.  Delete any local records that the server returns with deleted = true.
  3. Store the lastUpdated time stamp.
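The three steps above can be sketched in plain Java as follows. This is a simplified illustration, not a sync adapter: the backend is an in-memory fake, a counter stands in for server time, and conflict and error handling (steps 1.3 and 1.5) are left out. All class and method names (SyncRecord, NotesBackend, InMemoryBackend, SyncRunner) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record shape; field names follow the steps above.
final class SyncRecord {
    final String id;
    final String content;
    final long lastUpdated; // assigned by the server
    final boolean deleted;
    final boolean modifiedOnClient;

    SyncRecord(String id, String content, long lastUpdated,
               boolean deleted, boolean modifiedOnClient) {
        this.id = id;
        this.content = content;
        this.lastUpdated = lastUpdated;
        this.deleted = deleted;
        this.modifiedOnClient = modifiedOnClient;
    }
}

// Hypothetical stand-in for the service's mutation and query endpoints.
interface NotesBackend {
    List<SyncRecord> pushChanges(List<SyncRecord> changed);
    List<SyncRecord> changedSince(long lastUpdated);
}

// In-memory fake so the sketch runs without a network.
final class InMemoryBackend implements NotesBackend {
    private final Map<String, SyncRecord> rows = new HashMap<>();
    private long clock = 0L; // simple counter standing in for server time

    public List<SyncRecord> pushChanges(List<SyncRecord> changed) {
        List<SyncRecord> accepted = new ArrayList<>();
        for (SyncRecord r : changed) {
            SyncRecord stored = new SyncRecord(r.id, r.content, ++clock, r.deleted, false);
            rows.put(stored.id, stored);
            accepted.add(stored);
        }
        return accepted;
    }

    public List<SyncRecord> changedSince(long lastUpdated) {
        List<SyncRecord> result = new ArrayList<>();
        for (SyncRecord r : rows.values()) {
            if (r.lastUpdated > lastUpdated) result.add(r);
        }
        return result;
    }

    void markDeleted(String id) {
        rows.put(id, new SyncRecord(id, "", ++clock, true, false));
    }
}

// Stands in for the sync logic plus the local store it writes to.
final class SyncRunner {
    final Map<String, SyncRecord> local = new HashMap<>();
    long lastSync = 0L;

    void saveLocally(String id, String content) {
        local.put(id, new SyncRecord(id, content, 0L, false, true));
    }

    void sync(NotesBackend backend) {
        // Step 1: upload records flagged modifiedOnClient, keep the server's copies.
        List<SyncRecord> pending = new ArrayList<>();
        for (SyncRecord r : local.values()) {
            if (r.modifiedOnClient) pending.add(r);
        }
        for (SyncRecord accepted : backend.pushChanges(pending)) {
            local.put(accepted.id, accepted);
        }
        // Step 2: download changes since the last sync and apply deletions.
        long newest = lastSync;
        for (SyncRecord r : backend.changedSince(lastSync)) {
            if (r.deleted) local.remove(r.id); else local.put(r.id, r);
            newest = Math.max(newest, r.lastUpdated);
        }
        // Step 3: remember where this sync ended.
        lastSync = newest;
    }
}
```

With two SyncRunner instances sharing one InMemoryBackend, a note saved and synced on the first appears on the second after its next sync, and a call to markDeleted on the backend removes the note from each device the next time it syncs.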

If you have a good network link, you can use bulk uploads and compression to aid your bandwidth utilization (and speed of synchronization). If you do not have a good network link, then you should transfer the data in small batches. As you download new records in batches, store the lastUpdated time stamp for the batch so you can pick up where you left off.
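The batch-and-checkpoint idea can be sketched as follows in plain Java. To keep the example short, each record is represented only by its lastUpdated time stamp, and the sketch assumes the server returns records in ascending time stamp order with no two records sharing a time stamp. The names BatchedDownload, Source, and fetchAfter are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical downloader that persists a checkpoint after every batch.
final class BatchedDownload {

    // Hypothetical query endpoint: returns up to `limit` time stamps that are
    // strictly greater than `since`, in ascending order.
    interface Source {
        List<Long> fetchAfter(long since, int limit);
    }

    long checkpoint = 0L;                       // lastUpdated of the newest stored record
    final List<Long> stored = new ArrayList<>(); // stands in for the local database

    // Downloads at most maxBatches batches. Returns true once the server has
    // nothing newer than the checkpoint, false if more data remains.
    boolean run(Source source, int batchSize, int maxBatches) {
        for (int i = 0; i < maxBatches; i++) {
            List<Long> batch = source.fetchAfter(checkpoint, batchSize);
            if (batch.isEmpty()) {
                return true;
            }
            stored.addAll(batch);
            // Advance the checkpoint only after the batch is stored, so an
            // interrupted sync resumes from the last completed batch.
            checkpoint = batch.get(batch.size() - 1);
        }
        return false;
    }
}
```

If the connection drops after the first batch, the next call to run starts from the saved checkpoint rather than from the beginning, so earlier batches are not downloaded again.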

The backend service

The backend service can be any RESTful set of endpoints. In general, you need a single endpoint that supports GET and POST. For example, in the notes app, you may want to use GET /myendpoint/notes?lastUpdated=timestamp to retrieve the records (step 2 in the synchronization process) and POST /myendpoint/notes (with a JSON payload) to update the notes (step 1 of the synchronization process). To implement this backend in AWS, you can use a combination of Amazon Cognito, Amazon API Gateway, and AWS Lambda to implement the logic. You can use Amazon RDS or Amazon DynamoDB to implement the database.
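As a rough illustration of the two calls, the sketch below builds (but does not send) the GET and POST requests using the JDK's java.net.http API, which is available on desktop and server Java 11 or later. That package is not part of the Android SDK, so on a device you would use HttpURLConnection or an HTTP client library instead. The host api.example.com and the class name NotesRequests are hypothetical, and authentication headers (for example, credentials obtained through Amazon Cognito) are omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical request builders for the notes endpoint described above.
final class NotesRequests {
    // Assumed base URL; replace with the invoke URL of your own API.
    static final String BASE = "https://api.example.com/myendpoint/notes";

    // Step 2 of the synchronization process: fetch records changed since a time stamp.
    static HttpRequest download(long lastUpdated) {
        return HttpRequest.newBuilder(URI.create(BASE + "?lastUpdated=" + lastUpdated))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Step 1 of the synchronization process: send new and updated records as a JSON array.
    static HttpRequest upload(String jsonArray) {
        return HttpRequest.newBuilder(URI.create(BASE))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonArray))
                .build();
    }
}
```

Keeping both operations on a single resource path means the API Gateway configuration needs only one resource with two methods, each routed to its own Lambda function or to one function that branches on the HTTP method.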

In addition to Amazon RDS and Amazon DynamoDB, you can use any data source that AWS Lambda can access. For example, you can use on-premises databases and other data sources (such as Salesforce) through the same mechanism. You could implement the entire sync interface without Amazon API Gateway and AWS Lambda by directly accessing the database resources. Using Amazon API Gateway and AWS Lambda provides a few advantages:

  1. You can expose only the data that your mobile application needs and lock down the underlying database.
  2. You can implement controls at the API Gateway and Lambda interfaces, including authentication and rate limiting.
  3. You can manage different versions of the database model depending on the version of your mobile client.

Next steps

Data synchronization is complex.  Android has made the mechanics of setting up a sync service relatively straightforward, leaving you to concentrate on the actual code.

In the next post, we will describe the process of building a content provider and registering it with your application. We will also show how to build a content provider suitable for data synchronization.