In the previous module, we defined the game application’s access patterns. In this module, we design the primary key for the DynamoDB table and enable the core access patterns.

Time to Complete Module: 20 Minutes


When designing the primary key for a DynamoDB table, keep the following best practices in mind:

  • Start with the different entities in your table. If you are storing multiple different types of data in a single table—such as employees, departments, customers, and orders—be sure your primary key has a way to distinctly identify each entity and enable core actions on individual items.
  • Use prefixes to distinguish between entity types. Using prefixes to distinguish between entity types can prevent collisions and assist in querying. For example, if you have both customers and employees in the same table, the primary key for a customer could be CUSTOMER#<CUSTOMERID>, and the primary key for an employee could be EMPLOYEE#<EMPLOYEEID>.
  • Focus on single-item actions first, and then add multiple-item actions if possible. For a primary key, it’s important that you can satisfy the read and write options on a single item by using the single-item APIs: GetItem, PutItem, UpdateItem, and DeleteItem. You may also be able to satisfy your multiple-item read patterns with the primary key by using Query. If not, you can add a secondary index to handle the Query use cases.
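
As a minimal sketch of the prefix practice, the snippet below builds prefixed keys for two entity types that happen to share the same raw identifier. The entity_key helper and the plain dictionary standing in for a table are illustrative assumptions, not part of the lab code.

```python
def entity_key(entity_type, entity_id):
    """Build a prefixed key such as CUSTOMER#42 (hypothetical helper)."""
    return "{}#{}".format(entity_type.upper(), entity_id)


# A customer and an employee may share the same raw identifier.
customer_pk = entity_key("customer", "42")
employee_pk = entity_key("employee", "42")

# A plain dict stands in for a table keyed by primary key (illustration only).
table = {}
table[customer_pk] = {"name": "example customer"}
table[employee_pk] = {"name": "example employee"}

# The prefixes keep the two items distinct, so neither overwrites the other.
print(sorted(table))
```

Without the prefixes, both items would be written under the key 42 and the second write would replace the first.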

With these best practices in mind, let’s design the primary key for the game application’s table and perform some basic actions.


  • Step 1: Design the primary key

    Let’s consider the different entities, as suggested in the preceding introduction. In the gaming application, we have the following entities:

    • User
    • Game
    • UserGameMapping

    A UserGameMapping is a record that indicates a user joined a game. There is a many-to-many relationship between User and Game.

    Having a many-to-many mapping is usually an indication that you want to satisfy two Query patterns, and this gaming application is no exception. We have an access pattern that needs to find all users that have joined a game as well as another pattern to find all games that a user has played.

    If your data model has multiple entities with relationships among them, you generally use a composite primary key with both HASH and RANGE values. The composite primary key lets us run a Query on the HASH key to satisfy one of the query patterns we need. The DynamoDB API calls the partition key HASH and the sort key RANGE. This guide uses the two sets of terms interchangeably and favors the API terms when discussing code or the DynamoDB JSON wire protocol format.

    The other two data entities—User and Game—don’t have a natural property for the RANGE value because the access patterns on a User or Game are a key-value lookup. Because a RANGE value is required, we can provide a filler value for the RANGE key.

    With this in mind, let’s use the following pattern for HASH and RANGE values for each entity type.

    Entity            HASH              RANGE
    User              USER#<USERNAME>   #METADATA#<USERNAME>
    Game              GAME#<GAME_ID>    #METADATA#<GAME_ID>
    UserGameMapping   GAME#<GAME_ID>    USER#<USERNAME>

    Let’s walk through the preceding table.

    For the User entity, the HASH value is USER#<USERNAME>. Notice that we’re using a prefix to identify the entity and prevent any possible collisions across entity types.

    For the RANGE value on the User entity, we’re using a static prefix of #METADATA# followed by the USERNAME value. For the RANGE value, it’s important that we have a value that is known, such as the USERNAME. This allows for single-item actions such as GetItem, PutItem, and DeleteItem.

    However, we also want a RANGE value with different values across different User entities to enable even partitioning if we use this column as a HASH key for an index. For that reason, we append the USERNAME.

    The Game entity has a primary key design that is similar to the User entity’s design. It uses a different prefix (GAME#) and a GAME_ID instead of a USERNAME, but the principles are the same.

    Finally, the UserGameMapping uses the same HASH key as the Game entity. This allows us to fetch not only the metadata for a Game but also all the users in a Game in a single query. We then use the User entity for the RANGE key on the UserGameMapping to identify which user has joined a specific game.
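
    The key patterns described above can be sketched as small helper functions that return a primary key in DynamoDB JSON format. The function names are hypothetical and do not appear in the lab code; only the PK and SK patterns come from the table above.

```python
def user_key(username):
    """Primary key for a User item: the SK is a filler built from the username."""
    return {
        "PK": {"S": "USER#{}".format(username)},
        "SK": {"S": "#METADATA#{}".format(username)},
    }


def game_key(game_id):
    """Primary key for a Game item: same shape as a User, with a GAME# prefix."""
    return {
        "PK": {"S": "GAME#{}".format(game_id)},
        "SK": {"S": "#METADATA#{}".format(game_id)},
    }


def user_game_mapping_key(game_id, username):
    """Primary key for a UserGameMapping: shares the Game's PK, SK names the user."""
    return {
        "PK": {"S": "GAME#{}".format(game_id)},
        "SK": {"S": "USER#{}".format(username)},
    }


print(user_key("alice"))
print(user_game_mapping_key("1234", "alice"))
```

    Because every part of the key is known in advance, a dictionary like this could be passed as the Key argument of a single-item call such as GetItem or DeleteItem on the low-level Boto 3 client.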

    In the next step, we create a table with this primary key design. 

  • Step 2: Create a table

    Now that we have designed the primary key, let’s create a table.

    The code you downloaded in Step 3 of Module 1 includes a Python script in the scripts/ directory named create_table.py. The Python script’s contents follow.

    import boto3
    
    dynamodb = boto3.client('dynamodb')
    
    try:
        dynamodb.create_table(
            TableName='battle-royale',
            AttributeDefinitions=[
                {
                    "AttributeName": "PK",
                    "AttributeType": "S"
                },
                {
                    "AttributeName": "SK",
                    "AttributeType": "S"
                }
            ],
            KeySchema=[
                {
                    "AttributeName": "PK",
                    "KeyType": "HASH"
                },
                {
                    "AttributeName": "SK",
                    "KeyType": "RANGE"
                }
            ],
            ProvisionedThroughput={
                "ReadCapacityUnits": 1,
                "WriteCapacityUnits": 1
            }
        )
        print("Table created successfully.")
    except Exception as e:
        print("Could not create table. Error:")
        print(e)
    

    The preceding script calls the CreateTable operation through Boto 3, the AWS SDK for Python. The operation declares two attribute definitions, which are the typed attributes used in the primary key. Though DynamoDB is schemaless, you must declare the names and types of the attributes that make up the primary key. These attributes must be present on every item written to the table, so you specify them when you create the table.

    Because we’re storing different entities in a single table, we can’t use entity-specific primary key attribute names such as UserId. The attribute means something different depending on the type of entity being stored. For example, the primary key for a user might be its USERNAME, and the primary key for a game might be its GAME_ID. Accordingly, we use generic names for the attributes, such as PK (for partition key) and SK (for sort key).

    After configuring the attributes in the key schema, we specify the provisioned throughput for the table. DynamoDB has two capacity modes: provisioned and on-demand. In provisioned capacity mode, you specify exactly the amount of read and write throughput you want. You pay for this capacity whether you use it or not.

    In DynamoDB on-demand capacity mode, you can pay per request. The cost per request is slightly higher than if you were to use provisioned throughput fully, but you don’t have to spend time doing capacity planning or worrying about getting throttled. On-demand mode works great for spiky or unpredictable workloads. We’re using provisioned capacity mode in this lab because it fits within the DynamoDB free tier.
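
    For comparison, here is a hedged sketch of how the same table definition might look in on-demand mode. It swaps the ProvisionedThroughput block for BillingMode set to PAY_PER_REQUEST. The helper names and the table name battle-royale-on-demand are assumptions made for illustration; the lab itself uses the provisioned script shown earlier.

```python
def on_demand_table_params(table_name):
    """Return CreateTable keyword arguments for on-demand capacity mode."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "PK", "AttributeType": "S"},
            {"AttributeName": "SK", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "PK", "KeyType": "HASH"},
            {"AttributeName": "SK", "KeyType": "RANGE"},
        ],
        # PAY_PER_REQUEST selects on-demand mode, so ProvisionedThroughput is omitted.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_on_demand_table(table_name):
    """Create the table. Requires boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the parameter builder runs without boto3

    dynamodb = boto3.client("dynamodb")
    return dynamodb.create_table(**on_demand_table_params(table_name))


# Hypothetical table name, used only to show the resulting parameters.
params = on_demand_table_params("battle-royale-on-demand")
print(params["BillingMode"])
```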

    To create the table, run the Python script with the following command.

    python scripts/create_table.py

    The script should return this message: “Table created successfully.”

    In the next step, we bulk-load some example data into the table. 

  • Step 3: Bulk-load data into the table

    In this step, we bulk-load some data into the DynamoDB table we created in the preceding step. This gives us sample data to use in the steps that follow.

    In the scripts/ directory, you will find a file called items.json. This file contains 835 example items that were randomly generated for this lab. These items include User, Game, and UserGameMapping entities. Open the file if you want to see some of the example items.

    The scripts/ directory also has a file called bulk_load_table.py that reads the items in the items.json file and bulk-writes them to the DynamoDB table. That file’s contents follow.

    import json
    
    import boto3
    
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('battle-royale')
    
    items = []
    
    with open('scripts/items.json', 'r') as f:
        for row in f:
            items.append(json.loads(row))
    
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    

    In this script, rather than using the low-level client in Boto 3, we use a higher-level Resource object. Resource objects provide an easier interface for using the AWS APIs. The Resource object is useful in this situation because it batches our requests. The BatchWriteItem operation accepts as many as 25 items in a single request. The Resource object handles that batching for us rather than making us divide our data into requests of 25 or fewer items.
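
    To make the batching concrete, the sketch below shows the kind of chunking the batch writer saves us from writing by hand. It is an illustration of the 25-item limit only, not the Boto 3 implementation; the chunk helper and the generated placeholder items are assumptions.

```python
def chunk(items, size=25):
    """Yield successive slices of at most `size` items (the BatchWriteItem limit)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder items standing in for the 835 rows of items.json.
items = [
    {"PK": "USER#user{}".format(i), "SK": "#METADATA#user{}".format(i)}
    for i in range(835)
]

batches = list(chunk(items))

# 835 items split into 33 full batches of 25 plus one final batch of 10.
print(len(batches), len(batches[-1]))
```

    The batch writer also resends any items that DynamoDB reports as unprocessed, which a hand-rolled loop like this one would need to handle separately.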

    Run the bulk_load_table.py script and load your table with data by running the following command in the terminal.

    python scripts/bulk_load_table.py

    You can verify that all your data was loaded into the table by running a Scan operation and getting the count.

    aws dynamodb scan \
     --table-name battle-royale \
     --select COUNT

    This should display the following results.

    {
        "Count": 835, 
        "ScannedCount": 835, 
        "ConsumedCapacity": null
    }
    

    You should see a Count of 835, indicating that all of your items were loaded successfully.
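
    The same check can be sketched in Python. A single Scan call reads at most 1 MB of data, so a larger table returns its count across several pages; the function below adds up the Count of each page by following LastEvaluatedKey. The count_items function and the FakeClient stand-in (with its invented 500/335 page split) are assumptions made so the sketch runs without AWS access. In practice you would pass boto3.client('dynamodb') instead.

```python
def count_items(client, table_name):
    """Sum the Count of every Scan page until no LastEvaluatedKey is returned."""
    total = 0
    kwargs = {"TableName": table_name, "Select": "COUNT"}
    while True:
        resp = client.scan(**kwargs)
        total += resp["Count"]
        last_key = resp.get("LastEvaluatedKey")
        if not last_key:
            return total
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


class FakeClient:
    """Stand-in for the DynamoDB client that returns two made-up pages."""

    def __init__(self):
        self.pages = [
            {
                "Count": 500,
                "ScannedCount": 500,
                "LastEvaluatedKey": {"PK": {"S": "GAME#x"}, "SK": {"S": "USER#y"}},
            },
            {"Count": 335, "ScannedCount": 335},
        ]

    def scan(self, **kwargs):
        return self.pages.pop(0)


total = count_items(FakeClient(), "battle-royale")
print(total)
```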

    In the next step, we show how to retrieve multiple entity types in a single request, which can reduce the total network requests you make in your application and enhance application performance.

  • Step 4: Retrieve multiple entity types in a single request

    As we said in the previous module, you should optimize DynamoDB tables for the number of requests they receive. We also mentioned that DynamoDB does not support joins the way a relational database does. Instead, you design your table to allow for join-like behavior in your requests.

    In this step, we retrieve multiple entity types in a single request. In the gaming application, we may want to fetch details about a game. These details include information about the game itself (such as when it started, when it ended, and who placed in it) and details about the users who played in the game.

    This request spans two entity types: the Game entity and the UserGameMapping entity. However, this doesn’t mean we need to make multiple requests.

    In the code you downloaded, a fetch_game_and_players.py script is in the application/ directory. This script shows how you can structure your code to retrieve both the Game entity and the UserGameMapping entity for the game in a single request.

    The contents of the fetch_game_and_players.py script follow.

    import boto3
    
    from entities import Game, UserGameMapping
    
    dynamodb = boto3.client('dynamodb')
    
    GAME_ID = "3d4285f0-e52b-401a-a59b-112b38c4a26b"
    
    
    def fetch_game_and_users(game_id):
        resp = dynamodb.query(
            TableName='battle-royale',
            KeyConditionExpression="PK = :pk AND SK BETWEEN :metadata AND :users",
            ExpressionAttributeValues={
                ":pk": { "S": "GAME#{}".format(game_id) },
                ":metadata": { "S": "#METADATA#{}".format(game_id) },
                ":users": { "S": "USER$" },
            },
            ScanIndexForward=True
        )
    
        game = Game(resp['Items'][0])
        game.users = [UserGameMapping(item) for item in resp['Items'][1:]]
    
        return game
    
    
    game = fetch_game_and_users(GAME_ID)
    
    print(game)
    for user in game.users:
        print(user)
    

    At the beginning of this script, we import the Boto 3 library and some simple classes to represent the objects in our application code. You can see the definitions for those entities in the application/entities.py file.

    The real work of the script happens in the fetch_game_and_users function that’s defined in the module. This is similar to a function you would define in your application to be used by any endpoints that need this data.

    The fetch_game_and_users function does a few things. First, it makes a Query request to DynamoDB. This Query uses a PK of GAME#<GameId>. Then, it requests any entities where the sort key is between #METADATA#<GameId> and USER$. This grabs the Game entity, whose sort key is #METADATA#<GameId>, and all UserGameMapping entities, whose sort keys start with USER#. Sort keys of the string type are ordered by the bytes of their UTF-8 encoding, which matches ASCII order for the characters used here. The dollar sign ($) comes directly after the pound sign (#) in ASCII, so every sort key that starts with USER# sorts before USER$, and the query returns all UserGameMapping entities for the game.
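
    The ordering argument can be checked locally with a short sketch. It compares a few sort keys as UTF-8 bytes, which is how DynamoDB orders string sort keys, and tests them against the same bounds used in the query. The in_range helper is a hypothetical name, and the usernames are taken from the sample output of this lab.

```python
game_id = "3d4285f0-e52b-401a-a59b-112b38c4a26b"

# Bounds used by the BETWEEN condition in the query.
lower = "#METADATA#{}".format(game_id)
upper = "USER$"

sort_keys = [
    "USER#emccoy",
    "#METADATA#{}".format(game_id),
    "USER#waltervargas",
    "USER#branchmichael",
]


def in_range(sort_key):
    """Mimic SK BETWEEN :metadata AND :users using byte-wise comparison."""
    return lower.encode("utf-8") <= sort_key.encode("utf-8") <= upper.encode("utf-8")


# Order the keys the way a forward Query would return them.
ordered = sorted(sort_keys, key=lambda key: key.encode("utf-8"))

print(ord("#"), ord("$"))  # 35 36
print(ordered[0])          # the Game metadata key sorts first
print(all(in_range(key) for key in sort_keys))
```

    The Game item sorts first because the pound sign has a lower byte value than the letter U, which is why the script can treat the first returned item as the Game entity.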

    When we receive a response, we assemble our data entities into objects known by our application. We know that the first entity returned is the Game entity, so we create a Game object from the entity. For the remaining entities, we create a UserGameMapping object for each entity and then attach the array of users to the Game object.
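
    The script relies on item position to tell the Game apart from the mappings. As an alternative sketch, not what the lab script does, the items could be separated by sort key prefix, which also copes with an empty result. The split_game_items function and the shortened sample items are assumptions for illustration.

```python
def split_game_items(items):
    """Separate raw Query items into the Game item and the UserGameMapping items."""
    game_item = None
    mapping_items = []
    for item in items:
        sort_key = item["SK"]["S"]
        if sort_key.startswith("#METADATA#"):
            game_item = item
        elif sort_key.startswith("USER#"):
            mapping_items.append(item)
    return game_item, mapping_items


# Shortened sample items in DynamoDB JSON format (made-up game ID "g1").
sample_items = [
    {"PK": {"S": "GAME#g1"}, "SK": {"S": "#METADATA#g1"}},
    {"PK": {"S": "GAME#g1"}, "SK": {"S": "USER#emccoy"}},
    {"PK": {"S": "GAME#g1"}, "SK": {"S": "USER#pboyd"}},
]

game_item, mapping_items = split_game_items(sample_items)
print(game_item["SK"]["S"], len(mapping_items))
```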

    The end of the script shows the usage of the function and prints out the resulting objects. You can run the script in your terminal with the following command.

    python application/fetch_game_and_players.py

    The script should print the Game object and all UserGameMapping objects to the console.

    Game<3d4285f0-e52b-401a-a59b-112b38c4a26b -- Green Grasslands>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- branchmichael>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- deanmcclure>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- emccoy>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- emma83>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- iherrera>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- jeremyjohnson>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- lisabaker>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- maryharris>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- mayrebecca>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- meghanhernandez>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- nruiz>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- pboyd>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- richardbowman>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- roberthill>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- robertwood>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- victoriapatrick>
    UserGameMapping<3d4285f0-e52b-401a-a59b-112b38c4a26b -- waltervargas>
    

    This script shows how you can model your table and write your queries to retrieve multiple entity types in a single DynamoDB request. In a relational database, you use joins to retrieve multiple entity types from different tables in a single request. With DynamoDB, you model your data so that entities you access together are stored next to each other in a single table. This approach replaces the need for joins in a typical relational database and keeps your application high-performing as you scale up.


    In this module, we designed a primary key and created a table. Then, we bulk-loaded data into the table and saw how to query for multiple entity types in a single request.

    With the current primary key design, we can satisfy the following access patterns:

    • Create user profile (Write)
    • Update user profile (Write)
    • Get user profile (Read)
    • Create game (Write)
    • View game (Read)
    • Join game for a user (Write)
    • Start game (Write)
    • Update game for a user (Write)
    • Update game (Write)

    In the next module, we add a secondary index and learn about the sparse index technique. Secondary indexes allow you to support additional access patterns on your DynamoDB table.