Data modeling is the process of designing how an application stores data in a given database. With a NoSQL database such as DynamoDB, data modeling is different from modeling with a relational database. A relational database is built for flexibility and can be a great fit for analytical applications. In relational data modeling, you start with your entities. With a normalized relational model, you can satisfy any query pattern your application needs.

NoSQL (nonrelational) databases are designed for speed and scale, not flexibility. Though the performance of a relational database may degrade as you scale up, horizontally scaling databases such as DynamoDB provide consistent performance at any scale. Some DynamoDB users have tables that are larger than 100 TB, and the read and write performance of their tables is the same as when the tables were smaller than 1 GB in size.

Achieving the best results with a NoSQL database such as DynamoDB requires a shift in thinking from the typical relational approach. Use the following best practices when modeling data with DynamoDB.

1. Focus on access patterns
When doing any kind of data modeling, you will start with an entity-relationship diagram that describes the different objects (or entities) in your application and how they are connected (or the relationships between your entities).

In relational databases, you will put your entities directly into tables and specify relationships using foreign keys. After you have defined your data tables, a relational database provides a flexible query language to return data in the shape you need.

In DynamoDB, you think about access patterns before modeling your table. NoSQL databases are focused on speed, not flexibility. You first ask how you will access your data, and then model your data in the shape it will be accessed.

Before designing your DynamoDB table, document every need you have for reading and writing data in your application. Be thorough and think about all the flows in your application because you are going to optimize your table for your access patterns.

2. Optimize for the number of requests to DynamoDB
After you have documented your application’s access pattern needs, you are ready to design your table. You should design your table to minimize the number of requests to DynamoDB for each access pattern. Ideally, each access pattern should require only a single request to DynamoDB. Network requests are slow, so keeping each access pattern to a single request limits the number of network round trips your application makes.

To optimize for the number of requests to DynamoDB, you need to understand some core concepts:

Primary keys
Secondary indexes
Transactions

3. Don’t fake a relational model
People new to DynamoDB often try to implement a relational model on top of nonrelational DynamoDB. If you try to do this, you will lose most of the benefits of DynamoDB.

The most common anti-patterns (ineffective responses to recurring problems) that people try with DynamoDB are:

  • Normalization: In a relational database, you normalize your data to reduce data redundancy and storage space, and then use joins to combine data from multiple tables. However, joins at scale are slow and expensive. DynamoDB does not support joins because they slow down as your table grows.
  • One data type per table: Your DynamoDB table will often include different types of data in a single table. In our example, we have User, Game, and UserGameMapping entities in a single table. In a relational database, this would be modeled as three different tables.
  • Too many secondary indexes: People often try to create a secondary index for each additional access pattern they need. DynamoDB is schemaless, and this applies to your indexes, too. Use the flexibility in your attributes to reuse a single secondary index across multiple data types in your table. This is called index overloading.
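As a rough illustration of storing several entity types in one table, the sketch below models a table as an in-memory Python list. The `PK` and `SK` attribute names, the `USER#` and `GAME#` key formats, the sample values, and the `query` helper are all assumptions made for this example only; the actual table design is worked out in the modules that follow, and no DynamoDB API is called here. The helper loosely mimics a query that selects items sharing a partition key, optionally narrowed by a sort-key prefix.

```python
from typing import Any

# Hypothetical single-table layout: every item carries generic PK/SK attributes.
# Key formats and sample values are illustrative assumptions, not the tutorial's design.
TABLE: list[dict[str, Any]] = [
    {"PK": "USER#alice", "SK": "#METADATA", "type": "User"},
    {"PK": "GAME#g1", "SK": "#METADATA", "type": "Game", "map": "example-map"},
    {"PK": "GAME#g1", "SK": "USER#alice", "type": "UserGameMapping"},
    {"PK": "GAME#g1", "SK": "USER#bob", "type": "UserGameMapping"},
]


def query(pk: str, sk_prefix: str = "") -> list[dict[str, Any]]:
    """Return the items in one partition whose sort key starts with sk_prefix.

    Results are ordered by sort key, as a stand-in for a single table query.
    """
    matches = [
        item for item in TABLE
        if item["PK"] == pk and item["SK"].startswith(sk_prefix)
    ]
    return sorted(matches, key=lambda item: item["SK"])


# One lookup returns the game and every player mapped to it.
game_and_players = query("GAME#g1")

# Narrowing by sort-key prefix returns only the player mappings.
players_only = query("GAME#g1", "USER#")
```

Under these assumed keys, a game and its players come back from one lookup rather than from a join across three tables.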

In the steps below, we will build our entity-relationship diagram and map out our access patterns up front. These should always be your first steps when using DynamoDB. Then, in the modules that follow, we implement these access patterns in the table design.

Time to Complete Module: 20 Minutes


  • Step 1: Build your entity-relationship diagram

    The first step of any data modeling exercise is to build a diagram to show the entities in your application and how they relate to each other.

    In our application, we have the following entities:
    • User
    • Game
    • UserGameMapping

    A User entity represents a user in our application. A User can create multiple Game entities, so there is a one-to-many relationship between Users and Games. The creator of a game determines which map is played and when the game starts.

    Finally, a Game contains multiple Users and a User can play in multiple different Games over time. Thus, there is a many-to-many relationship between Users and Games. We can represent this relationship with the UserGameMapping entity.

    With these entities and relationships in mind, our entity-relationship diagram is shown below.

    [Figure: entity-relationship diagram of the User, Game, and UserGameMapping entities (Module2-step1)]
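    The entities and relationships described in this step can also be written down as plain Python types. This is only a sketch: the field names (`username`, `game_id`, `creator`, `map_name`, `start_time`) and the two helper functions are hypothetical and do not come from the tutorial's table design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    username: str


@dataclass(frozen=True)
class Game:
    game_id: str
    creator: str      # username of the User who created the game (one-to-many)
    map_name: str     # chosen by the creator
    start_time: str   # chosen by the creator; the string format is an assumption


@dataclass(frozen=True)
class UserGameMapping:
    # One record per (User, Game) pair resolves the many-to-many relationship.
    username: str
    game_id: str


def games_for_user(username: str, mappings: list[UserGameMapping]) -> list[str]:
    """Return the IDs of all games a user has taken part in."""
    return [m.game_id for m in mappings if m.username == username]


def users_in_game(game_id: str, mappings: list[UserGameMapping]) -> list[str]:
    """Return the usernames of all players mapped to a game."""
    return [m.username for m in mappings if m.game_id == game_id]


mappings = [
    UserGameMapping("alice", "g1"),
    UserGameMapping("bob", "g1"),
    UserGameMapping("alice", "g2"),
]
```

    Here `games_for_user("alice", mappings)` walks the mapping records in one direction and `users_in_game("g1", mappings)` walks them in the other, which is the role the UserGameMapping entity plays in the diagram.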
  • Step 2: Consider user profile access patterns

    The users of our gaming application need to create user profiles. These profiles include data such as a user name, avatar, game statistics, and other information about each user. The game displays these user profiles when a user logs in. Other users can view the profile of a user to review their game statistics and other details.

    As a user plays games, the game statistics are updated to reflect the number of games the user has played, the number of games the user has won, and the number of kills by the user.

    Based on this information, we have three access patterns:

    • Create user profile (Write)
    • Update user profile (Write)
    • Get user profile (Read)
  • Step 3: Consider pregame access patterns

    Our game is an online multiplayer game similar to Fortnite or PUBG. Players can create a game on a particular map, and other players can join the game. When 50 players have joined the game, the game starts and no additional players can join.

    When searching for games to join, players may want to play a particular map. Other players won’t care about the map and will want to browse open games across all maps.

    Based on this information, we have the following seven access patterns:

    • Create game (Write)
    • Find open games (Read)
    • Find open games by map (Read)
    • View game (Read)
    • View users in game (Read)
    • Join game for a user (Write)
    • Start game (Write)
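    The pregame rules above can be sketched in memory to make the expected behavior concrete. Everything here is an assumption for illustration: the dictionary layout, the `new_game`, `join_game`, and `find_open_games` helpers, and the choice to fold "Start game" into the join that fills the last seat. It does not call DynamoDB, where a capacity check like this would typically be enforced as a condition on the write.

```python
from typing import Optional

MAX_PLAYERS = 50  # from the text: the game starts once 50 players have joined


class GameFullError(Exception):
    """Raised when a player tries to join a game that has already started."""


def new_game(game_id: str, map_name: str) -> dict:
    """Create game: an open game with no players yet."""
    return {"game_id": game_id, "map": map_name, "players": [], "started": False}


def join_game(game: dict, username: str) -> None:
    """Join game for a user; the join that fills the game also starts it."""
    if game["started"]:
        raise GameFullError(f"game {game['game_id']} has already started")
    if username in game["players"]:
        return  # joining twice is treated as a no-op in this sketch
    game["players"].append(username)
    if len(game["players"]) == MAX_PLAYERS:
        game["started"] = True


def find_open_games(games: list[dict], map_name: Optional[str] = None) -> list[dict]:
    """Find open games, optionally restricted to a single map."""
    return [
        g for g in games
        if not g["started"] and (map_name is None or g["map"] == map_name)
    ]
```

    Calling `find_open_games(games)` covers the "browse all maps" case, and passing `map_name` covers players who want a particular map.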
  • Step 4: Consider in-game and post-game access patterns

    Finally, let’s consider the access patterns during and after a game.

    During the game, players try to defeat other players with the goal of being the last player standing. Our application tracks how many kills each player has during a game as well as the amount of time a player survives in a game. If a player is one of the last three surviving in a game, the player receives a gold, silver, or bronze medal for the game.

    Later, players may want to review past games they’ve played or that other players have played.

    Based on this information, we have three access patterns:

    • Update game for user (Write)
    • Update game (Write)
    • Find all past games for a user (Read)

    We have now mapped all access patterns for the gaming application. In the following modules, we implement these access patterns by using DynamoDB.

    Note that the planning phase might take a few iterations. Start with a general idea of the access patterns your application needs. Map the primary key, secondary indexes, and attributes in your table. Go back to the beginning and make sure all of your access patterns are satisfied. When you are confident the planning phase is complete, move forward with implementation.
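    One lightweight way to keep the access patterns from steps 2 through 4 in front of you while iterating is to record them as data. The sketch below lists the patterns exactly as named in this module; the `AccessPattern` type and the stage labels are hypothetical conveniences, not part of the tutorial.

```python
from collections import Counter
from typing import NamedTuple


class AccessPattern(NamedTuple):
    stage: str  # which step of this module the pattern came from (label is an assumption)
    name: str
    kind: str   # "Read" or "Write"


ACCESS_PATTERNS = [
    AccessPattern("User profile", "Create user profile", "Write"),
    AccessPattern("User profile", "Update user profile", "Write"),
    AccessPattern("User profile", "Get user profile", "Read"),
    AccessPattern("Pregame", "Create game", "Write"),
    AccessPattern("Pregame", "Find open games", "Read"),
    AccessPattern("Pregame", "Find open games by map", "Read"),
    AccessPattern("Pregame", "View game", "Read"),
    AccessPattern("Pregame", "View users in game", "Read"),
    AccessPattern("Pregame", "Join game for a user", "Write"),
    AccessPattern("Pregame", "Start game", "Write"),
    AccessPattern("In-game and post-game", "Update game for user", "Write"),
    AccessPattern("In-game and post-game", "Update game", "Write"),
    AccessPattern("In-game and post-game", "Find all past games for a user", "Read"),
]

# Tally reads and writes, and patterns per stage, as a quick completeness check.
counts_by_kind = Counter(p.kind for p in ACCESS_PATTERNS)
counts_by_stage = Counter(p.stage for p in ACCESS_PATTERNS)
```

    A list like this can be checked off against the table design on each planning iteration, so that no pattern is left without a primary key or secondary index that serves it.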