Data modeling is the process of designing how an application stores data in a given database. With a NoSQL database such as DynamoDB, data modeling is different from modeling with a relational database. A relational database is built for flexibility and can be a great fit for analytical applications. In relational data modeling, you start with your entities. Once you have a normalized relational model, you can satisfy any query pattern your application needs.

NoSQL (nonrelational) databases are designed for speed and scale, not flexibility. Though the performance of your relational database may degrade as you scale up, a horizontally scaling database such as DynamoDB provides consistent performance at any scale. Some DynamoDB users have tables that are larger than 100 TB, and the read and write performance of their tables is the same as when the tables were smaller than 1 GB in size.

Achieving best results with a NoSQL database such as DynamoDB requires a shift in thinking from the typical relational database. Use the following best practices when modeling data with DynamoDB.

1. Focus on access patterns

When doing any kind of data modeling, you will start with an entity-relationship diagram that describes the different objects (or entities) in your application and how they are connected (or the relationships between your entities).

In relational databases, you will put your entities directly into tables and specify relationships using foreign keys. After you have implemented your data tables, a relational database provides a flexible query language to return data in the shape you need.

In DynamoDB, you think about access patterns before modeling your table. NoSQL databases are focused on speed, not flexibility. You first ask how you will access your data, and then model your data in the shape it will be accessed.

Before designing your DynamoDB table, document every need you have for reading and writing data in your application. Be thorough and think about all the flows in your application because you are going to optimize your table for your access patterns.

2. Optimize for number of requests to DynamoDB

After you have documented your application’s access patterns, you are ready to design your table. Design the table to minimize the number of requests to DynamoDB for each access pattern. Ideally, each access pattern should require only a single request to DynamoDB, because network requests are slow and every additional round trip adds latency to your application.
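As a rough illustration of how table layout drives request count, the sketch below simulates two layouts in plain Python and makes no AWS calls. `FakeTable`, its methods, and the key formats are hypothetical stand-ins invented for this example, not the DynamoDB API.

```python
from collections import defaultdict


class FakeTable:
    """In-memory stand-in for a table that counts read requests (not the DynamoDB API)."""

    def __init__(self):
        self.partitions = defaultdict(dict)  # partition key -> {sort key: item}
        self.requests = 0

    def put(self, pk, sk, item):
        self.partitions[pk][sk] = item

    def get(self, pk, sk):
        """Fetch a single item by its full key: one request."""
        self.requests += 1
        return self.partitions[pk].get(sk)

    def query(self, pk):
        """Fetch all items sharing a partition key, ordered by sort key: one request."""
        self.requests += 1
        return [self.partitions[pk][sk] for sk in sorted(self.partitions[pk])]


def view_photo_normalized():
    """Photo and reactions stored under unrelated keys: one request per item."""
    table = FakeTable()
    table.put("PHOTO#p1", "META", {"type": "Photo", "reaction_ids": ["r1", "r2"]})
    table.put("REACTION#r1", "META", {"type": "Reaction", "kind": "heart"})
    table.put("REACTION#r2", "META", {"type": "Reaction", "kind": "smiley"})
    photo = table.get("PHOTO#p1", "META")
    reactions = [table.get(f"REACTION#{rid}", "META") for rid in photo["reaction_ids"]]
    return photo, reactions, table.requests


def view_photo_colocated():
    """Photo and reactions share a partition key: a single query returns all of them."""
    table = FakeTable()
    table.put("PHOTO#p1", "#PHOTO", {"type": "Photo"})
    table.put("PHOTO#p1", "REACTION#r1", {"type": "Reaction", "kind": "heart"})
    table.put("PHOTO#p1", "REACTION#r2", {"type": "Reaction", "kind": "smiley"})
    items = table.query("PHOTO#p1")
    photo = next(i for i in items if i["type"] == "Photo")
    reactions = [i for i in items if i["type"] == "Reaction"]
    return photo, reactions, table.requests


normalized_requests = view_photo_normalized()[2]
colocated_requests = view_photo_colocated()[2]
```

In the first layout the request count grows with the number of reactions. In the second it stays at one, which is the property this step aims for.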

To optimize for the number of requests to DynamoDB, you need to understand some core DynamoDB concepts, including primary keys and secondary indexes.

3. Don’t fake a relational model

People new to DynamoDB often try to implement a relational model on top of nonrelational DynamoDB. If you try to do this, you will lose most of the benefits of DynamoDB.

The most common anti-patterns that people try with DynamoDB are:

  • Normalization: In a relational database, you normalize your data to reduce redundancy and storage space, and then use joins to combine data from multiple tables. However, joins at scale are slow and expensive. DynamoDB does not support joins because they slow down as your table grows.
  • One data type per table: Your DynamoDB table will often include different types of data in a single table. In our example, we’ll have User, Friendship, Photo, and Reaction entities in a single table. In a relational database, this would be modeled as four separate tables.
  • Too many secondary indexes: People often try to create a secondary index for each additional access pattern they need. DynamoDB is schemaless, and this applies to your indexes too. Use the flexibility in your attributes to reuse a single secondary index across multiple data types in your table. This is called index overloading.
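To make index overloading more concrete, here is a minimal in-memory sketch. The attribute names (`PK`, `SK`, `GSI1PK`, `GSI1SK`), the key formats, and the `query_index` helper are all assumptions made up for illustration. They are not taken from this tutorial's final table design, and the helper only mimics how a secondary index groups items.

```python
# Several entity types live in one list (standing in for one table).
# Each item uses the same generic key attribute names, whatever its type.
ITEMS = [
    # A user profile: no GSI1 attributes, so it never appears in the index.
    {"PK": "USER#alice", "SK": "#METADATA", "Type": "User"},
    # A photo uploaded by alice.
    {"PK": "USER#alice", "SK": "PHOTO#p1", "Type": "Photo",
     "GSI1PK": "PHOTO#p1", "GSI1SK": "#PHOTO"},
    # bob's reaction to that photo.
    {"PK": "USER#bob", "SK": "REACTION#p1#heart", "Type": "Reaction",
     "GSI1PK": "PHOTO#p1", "GSI1SK": "REACTION#bob#heart"},
    # bob follows alice.
    {"PK": "USER#alice", "SK": "FOLLOWER#bob", "Type": "Friendship",
     "GSI1PK": "USER#bob", "GSI1SK": "FOLLOWED#alice"},
]


def query_index(items, gsi1pk):
    """Mimic a query on the one shared index: match on GSI1PK, order by GSI1SK."""
    matches = [i for i in items if i.get("GSI1PK") == gsi1pk]
    return sorted(matches, key=lambda i: i["GSI1SK"])


# The same index answers two unrelated access patterns:
photo_with_reactions = query_index(ITEMS, "PHOTO#p1")  # a photo and its reactions
followed_by_bob = query_index(ITEMS, "USER#bob")       # users that bob follows
```

Because the index attributes carry no fixed meaning, each entity type can fill them with whatever value its own access pattern needs, so one index is reused instead of creating one index per pattern.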

In the steps below, we will build our entity-relationship diagram and map out our access patterns up front. These should always be the first steps when using DynamoDB. Then, in the modules that follow, we’ll implement these access patterns in our table design.

Time to Complete Module: 20 Minutes


  • Step 1: Build your entity-relationship diagram

    The first step of any data modeling exercise is to build a diagram to show the entities in your application and how they relate to each other.

    In our application, we have the following entities:

    • User
    • Photo
    • Reaction
    • Friendship

    A User can have many Photos, and a Photo can have many Reactions. Finally, the Friendship entity represents a many-to-many relationship between Users, as a User can follow multiple Users and be followed by multiple other Users.

    With these entities and relationships in mind, our entity-relationship diagram is shown below.

    [Figure: entity-relationship diagram of the User, Photo, Reaction, and Friendship entities]
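The entities and relationships in this step can also be written down as plain types. The sketch below is one possible reading of them. The field names are hypothetical, and the `reacting_user` field on `Reaction` is an assumption, since the text only says that a Photo can have many Reactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str


@dataclass(frozen=True)
class Photo:
    photo_id: str
    owner: str  # username of the uploading User (one User, many Photos)


@dataclass(frozen=True)
class Reaction:
    photo_id: str       # the Photo being reacted to (one Photo, many Reactions)
    reacting_user: str  # assumption: a reaction records who made it


@dataclass(frozen=True)
class Friendship:
    follower: str  # the User who follows
    followed: str  # the User being followed


# Many-to-many: a user can follow several users and be followed by several users.
friendships = [
    Friendship(follower="alice", followed="bob"),
    Friendship(follower="alice", followed="carol"),
    Friendship(follower="carol", followed="bob"),
]

followed_by_alice = sorted(f.followed for f in friendships if f.follower == "alice")
followers_of_bob = sorted(f.follower for f in friendships if f.followed == "bob")
```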
  • Step 2: Consider user profile access patterns

    Now that we have our entity-relationship diagram, consider the access patterns around our entities. Let’s start with users.

    The users of our mobile application will need to create user profiles. These profiles will include information such as a username, profile picture, location, current status, and interests for a given user.

    Users will be able to browse the profile of other users. A user may want to browse the profile of another user to see if the user is interesting to follow or simply to read some background on an existing friend.

    Over time, a user will want to update their profile to display a new status or to update their interests as they change.

    Based on this information, we have three access patterns:

    • Create user profile (Write)
    • Update user profile (Write)
    • Get user profile (Read)
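One way to picture these three access patterns is as three single-item requests. The sketch below only builds parameter dictionaries in the shape accepted by the boto3 DynamoDB client methods `put_item`, `get_item`, and `update_item`, and does not call AWS. The table name, the key format, and the attribute names are assumptions chosen for illustration, not this tutorial's final design.

```python
TABLE_NAME = "mobile-app-table"  # hypothetical table name


def user_key(username):
    """Primary key of a user profile item (assumed key format)."""
    return {"PK": {"S": f"USER#{username}"}, "SK": {"S": "#METADATA"}}


def create_user_params(username, location, status, interests):
    """Create user profile (Write): the condition rejects an existing profile."""
    item = user_key(username)
    item.update({
        "username": {"S": username},
        "location": {"S": location},
        "status": {"S": status},
        "interests": {"L": [{"S": interest} for interest in interests]},
    })
    return {
        "TableName": TABLE_NAME,
        "Item": item,
        "ConditionExpression": "attribute_not_exists(PK)",
    }


def get_user_params(username):
    """Get user profile (Read): a lookup by full primary key."""
    return {"TableName": TABLE_NAME, "Key": user_key(username)}


def update_status_params(username, new_status):
    """Update user profile (Write): change one attribute of an existing profile."""
    return {
        "TableName": TABLE_NAME,
        "Key": user_key(username),
        # A name placeholder keeps the attribute name out of the expression text.
        "UpdateExpression": "SET #status = :status",
        "ExpressionAttributeNames": {"#status": "status"},
        "ExpressionAttributeValues": {":status": {"S": new_status}},
        "ConditionExpression": "attribute_exists(PK)",
    }


create_params = create_user_params("alice", "Seattle", "Hello!", ["hiking", "photos"])
```

With a configured client, each dictionary could be passed as keyword arguments, for example `client.put_item(**create_params)`. Each pattern maps to exactly one request.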
       
  • Step 3: Consider photo access patterns

    Now, let’s look at the access patterns around photos.

    Our mobile application allows users to upload and share photos with their friends, similar to Instagram or Snapchat. When users upload a photo, you will need to store information such as the time the photo was uploaded and the location of the file on your content delivery network (CDN).

    When users aren’t uploading photos, they will want to browse their friends’ photos. If they visit a friend’s profile, they should see that user’s photos, with the most recent photos shown first. If they really like a photo, they can “react” to it using one of four predefined reactions: a heart, a smiley face, a thumbs up, or a pair of sunglasses. Viewing a photo should display the current reactions for the photo.

    In this section, we have the following access patterns:

    • Upload photo for user (Write)
    • View recent photos for user (Read)
    • React to a photo (Write)
    • View photo and reactions (Read)
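The "most recent photos first" requirement is often met by placing the upload time in the sort key. The in-memory sketch below shows why that works, under the assumption of a sort key shaped like `PHOTO#<username>#<UTC timestamp>`. The helper names and the key format are hypothetical. Against a real table, a descending read like this corresponds to a Query with `ScanIndexForward` set to false.

```python
from datetime import datetime, timezone


def photo_sort_key(username, uploaded_at):
    """Build a sort key whose string order matches upload order.

    Fixed-width UTC ISO-8601 timestamps sort lexicographically in time order.
    """
    ts = uploaded_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"PHOTO#{username}#{ts}"


def recent_photos(items, username, limit=10):
    """Return a user's photo items, newest first, by sorting on the sort key."""
    prefix = f"PHOTO#{username}#"
    matches = [item for item in items if item["SK"].startswith(prefix)]
    return sorted(matches, key=lambda item: item["SK"], reverse=True)[:limit]


# Photos inserted out of order, plus a non-photo item in the same partition.
uploads = [
    datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc),
]
partition = [{"SK": photo_sort_key("alice", t)} for t in uploads]
partition.append({"SK": "#METADATA"})

newest_two = recent_photos(partition, "alice", limit=2)
```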
       
  • Step 4: Consider friendship access patterns

    Finally, let’s consider the access patterns around friendship.

    Many popular mobile applications have a social network aspect. You can follow friends, view updates on your friends’ activities, and receive recommendations on other friends you may want to follow.

    In our application, a friendship is a one-way relationship, like following on Twitter. One user can choose to follow another user, and that user may or may not choose to follow back. For our application, we will call the users that follow a user “followers”, and we will call the users that a user is following the “followed”.

    Based on this information, we have the following access patterns:

    • Follow user (Write)
    • View followers for user (Read)
    • View followed for user (Read)
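Because a friendship is one-way, the two read patterns look at the same records from opposite directions. The sketch below models that with two in-memory lookups. `FriendshipStore` and its method names are hypothetical. In DynamoDB the second direction is commonly served by a secondary index whose keys are swapped (an inverted index), which is stated here as a general pattern rather than this tutorial's confirmed design.

```python
from collections import defaultdict


class FriendshipStore:
    """In-memory model of one-way friendships with lookups in both directions."""

    def __init__(self):
        self._followers = defaultdict(set)  # followed user -> users following them
        self._followed = defaultdict(set)   # following user -> users they follow

    def follow(self, follower, followed):
        """Follow user (Write): record the relationship in both views."""
        self._followers[followed].add(follower)
        self._followed[follower].add(followed)

    def followers_of(self, username):
        """View followers for user (Read)."""
        return sorted(self._followers.get(username, ()))

    def followed_by(self, username):
        """View followed for user (Read)."""
        return sorted(self._followed.get(username, ()))


store = FriendshipStore()
store.follow("alice", "bob")
store.follow("carol", "bob")
store.follow("bob", "carol")
```

In this data, `alice` follows `bob` but `bob` does not follow `alice` back, so `alice` has no followers even though she appears in `bob`'s follower list.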

    We have now mapped out all of our access patterns for our mobile application. In the following modules, we’ll implement these access patterns by using DynamoDB.

    Note that the planning phase may take a few iterations. Start with a general idea of the access patterns your application needs. Map out the primary key, secondary indexes, and attributes in your table. Go back to the beginning and make sure all of your access patterns are satisfied. Once you are confident the planning phase is complete, then move forward with implementation.
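The final check described in the note above, confirming that every access pattern is satisfied, can be kept as a small checklist. The sketch below lists the ten access patterns from this module and reports any that a draft design does not yet cover. The `draft_design` notes are placeholders invented for illustration, not the tutorial's actual design.

```python
# The access patterns mapped out in Steps 2 through 4.
ACCESS_PATTERNS = [
    ("Create user profile", "Write"),
    ("Update user profile", "Write"),
    ("Get user profile", "Read"),
    ("Upload photo for user", "Write"),
    ("View recent photos for user", "Read"),
    ("React to a photo", "Write"),
    ("View photo and reactions", "Read"),
    ("Follow user", "Write"),
    ("View followers for user", "Read"),
    ("View followed for user", "Read"),
]


def unsatisfied(patterns, design):
    """Return the names of access patterns that the design has no plan for."""
    return [name for name, _kind in patterns if not design.get(name)]


# A draft design maps each pattern to a note on how the table will serve it.
# These notes are hypothetical placeholders.
draft_design = {
    "Create user profile": "single write on the user item",
    "Update user profile": "single update on the user item",
    "Get user profile": "single read on the user item",
    "Upload photo for user": "single write on a photo item",
    "View recent photos for user": "one query, newest first",
}

missing = unsatisfied(ACCESS_PATTERNS, draft_design)
```

An empty result from `unsatisfied` is one possible signal that the planning phase is complete and implementation can begin.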