AWS for Games Blog

AWS for Games Cohort Modeler: Graph Data Model

This is the second post in the series introducing the AWS for Games Cohort Modeler. You can read the introductory blog here.

In the first part of this blog series, we covered what the AWS for Games Cohort Modeler is, what challenges it solves, and provided both a high-level architecture and a code example to experiment with. In this blog post, we’ll dive deeper into the data model and demonstrate how customers can adapt the example schema to the specific needs of their studio.

Understanding the specifics of the data model is critical to the process of adapting and extending the AWS for Games Cohort Modeler for your specific use case. Whether you’re uncovering toxic community members, discovering the buying patterns in your player community, or analyzing how players consume your content, building the right data model to aggregate and highlight the right data is key.

Why a graph database?

To model player behaviors, look at the relationships between players and the activities they engage in. In a graph database, a relationship between two entities is defined by a traversal from one vertex, along an edge, to a destination vertex. Both vertexes and edges store attributes. Relationships that would have to be inferred or represented with keys in a traditional relational database schema are instead built into the graph data model. Therefore, it makes sense to track a relationship-based data model using a relationship-based database.

There are also a number of simple data science techniques you can employ with a graph data model to identify new behavioral insights without having to use machine learning, which requires significantly more data to become effective.

What’s a vertex?

A vertex represents a data object in the database. For example, a vertex can represent a player, an in-game activity like chatting, griefing, or healing other players, or even a marketing campaign that a player can interact with. You can attach properties to vertexes, so if you want to quickly refer to how many times a player has been banned, how much they’ve spent on in-app purchases (IAP) in total, or how much total time they’ve spent in game, you can load that data as a property on the vertex and retrieve it with a simple query.

What’s an edge?

An edge is the data structure that defines a relationship between two vertexes. If you have two vertexes representing players with an edge between them, that edge describes how those two players have interacted. Just like vertexes, edges can also contain additional properties, so you can aggregate the details of each relationship on the edge between them – whether you’re counting the number of times player X has messaged player Y, grouped with player Z, or griefed player A.

You can imagine edges as either unidirectional or bidirectional. In practice, a bidirectional edge is actually two unidirectional edges, each starting from the opposite vertex. To understand not only how often player X messages player Y, but also how often player Y messages player X, you’ll need to evaluate the properties on both edges between them.
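The vertex and edge concepts above can be sketched in a few lines of plain Python. This is a minimal in-memory illustration, not the Cohort Modeler's implementation: the `Vertex`, `Edge`, and `Graph` classes, the `messaged` edge label, and the `timesBanned` and `messagesSent` property names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A data object (player, activity, campaign) with its own properties."""
    vertex_id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    """A one-way relationship from `source` to `target` with its own properties."""
    source: str
    target: str
    label: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.vertexes = {}
        self.edges = {}  # keyed by (source, target, label)

    def add_vertex(self, vertex_id, label, **properties):
        self.vertexes[vertex_id] = Vertex(vertex_id, label, dict(properties))
        return self.vertexes[vertex_id]

    def edge(self, source, target, label):
        """Return the directed edge, creating it on first use."""
        key = (source, target, label)
        if key not in self.edges:
            self.edges[key] = Edge(source, target, label)
        return self.edges[key]


graph = Graph()
graph.add_vertex("playerX", "player", timesBanned=0)
graph.add_vertex("playerY", "player", timesBanned=2)

# A "bidirectional" relationship is stored as two directed edges.
x_to_y = graph.edge("playerX", "playerY", "messaged")
y_to_x = graph.edge("playerY", "playerX", "messaged")
x_to_y.properties["messagesSent"] = 7
y_to_x.properties["messagesSent"] = 1

# Reading a vertex property is a single lookup.
print(graph.vertexes["playerY"].properties["timesBanned"])
# Comparing both directions requires reading both edges.
print(x_to_y.properties["messagesSent"], y_to_x.properties["messagesSent"])
```

Keying edges by `(source, target, label)` keeps the two directions separate, which is why comparing how X treats Y with how Y treats X means reading two edges.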

The AWS for Games Cohort Modeler – Example Data Model:

The AWS for Games Cohort Modeler uses these graph primitive data types to build relationships with the following vertexes:

  • Player – a player that has played one or more games from your studio
  • Campaign – a marketing campaign that you have conducted to engage and attract new players
  • Activity – a behavior that a studio wants to track

Between these vertexes, the Cohort Modeler defines specific relationships with the following edges:

  • Player Interactions – Interactions from a single player vertex to another player vertex
  • Marketing Interactions – Interactions from a marketing campaign to a player vertex
  • Engaged In – Interactions from a player vertex to an activity vertex
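One way to keep a model like this consistent is to write the allowed edge types down as data and check new edges against them. The sketch below follows the vertex types and edge directions exactly as listed above; the snake_case labels and the `is_valid_edge` helper are illustrative names, not part of the Cohort Modeler code.

```python
# Edge label -> (source vertex type, target vertex type), per the lists above.
# The labels are hypothetical names chosen for this sketch.
EDGE_SCHEMA = {
    "player_interaction": ("player", "player"),
    "marketing_interaction": ("campaign", "player"),
    "engaged_in": ("player", "activity"),
}


def is_valid_edge(label, source_type, target_type):
    """Return True if an edge with this label may connect these vertex types."""
    return EDGE_SCHEMA.get(label) == (source_type, target_type)


print(is_valid_edge("engaged_in", "player", "activity"))   # allowed direction
print(is_valid_edge("engaged_in", "activity", "player"))   # reversed direction
print(is_valid_edge("traded_with", "player", "player"))    # label not in the model
```

Extending the model for your own studio then becomes a matter of adding an entry to the table rather than changing validation logic.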


This sample diagram is a visualization of the data model:

[Diagram: AWS for Games Cohort Modeler example data model]

The player vertex

Player vertexes track a player’s identity and their aggregated behavioral traits (which, taken together, define their cohort). Edges between players define the strength of their in-game relationships.

  • In the diagram, player1 (a vertex of type player) has established an edge, via in-game interactions, to player2 (another vertex of type player).
  • Each interaction that player1 initiates with player2 adds metadata to the edge between them, contributing to a cumulative value for the strength of the relationship and the activities they engage in.
  • player2 will also establish an edge back to player1, describing the weight of the interactions they’ve initiated with player1. This edge has the same structure as player1’s edge to player2, but its values show how the two sides of the relationship differ (if player1 always sends group invites, player2 will have more group accept increments, for instance).
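The accumulation described in these bullets can be sketched as a pair of counters, one per direction. The interaction names `groupInvite.sent` and `groupInvite.accepted` are used as examples; the `record_interaction` and `relationship_strength` helpers, and the choice to measure strength as a plain sum of counts, are assumptions made for illustration.

```python
from collections import defaultdict

# (initiator, recipient) -> {interaction name -> count}; one entry per directed edge.
player_edges = defaultdict(lambda: defaultdict(int))


def record_interaction(initiator, recipient, interaction, amount=1):
    """Increment a counter on the directed edge initiator -> recipient."""
    player_edges[(initiator, recipient)][interaction] += amount


def relationship_strength(initiator, recipient):
    """Assumed metric: total interactions initiated in this direction."""
    return sum(player_edges.get((initiator, recipient), {}).values())


# player1 sends three group invites; player2 accepts two of them.
for _ in range(3):
    record_interaction("player1", "player2", "groupInvite.sent")
for _ in range(2):
    record_interaction("player2", "player1", "groupInvite.accepted")

print(dict(player_edges[("player1", "player2")]))
print(dict(player_edges[("player2", "player1")]))
print(relationship_strength("player1", "player2"),
      relationship_strength("player2", "player1"))
```

The two edges share a structure but hold different counters, which is what lets you tell the player who sends invites from the one who accepts them.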

Player-initiated edges

Player-initiated edges are the edges that can emanate from a player to another vertex. These are described as follows:

  • The playerInteraction edge describes all of the means by which player1 has initiated interactions with player2 – there may be attribute pairs for ‘groupInvite.sent’, ‘whispers.sent’, ‘groupInvite.accepted’, and so forth.
  • The engagedIn edge notes which actions a player has engaged in, and the aggregate data stored on that edge indicates how many times the player has engaged in the action.
  • The customerMarketingInteraction edge describes when a player interacts with (and potentially responds to) marketing campaign materials they’ve been exposed to.

Properties that a player vertex and player-player edges could contain include:

[Image: example properties for the player vertex and player-player edges]

The activity vertex

Activity vertexes achieve two goals:


  1. Activity vertexes describe a kind of activity that a player can engage in within the game’s play space.
  2. Activity vertexes tell the system what that activity implies about a player’s behavior.

The system can infer how to change a player’s personality profile based on their participation in an activity.

  • player1 establishes an edge, via in-game interactions, to activity1 (a vertex of type activity).
  • In this example, the activity metadata edge is simpler than the interaction metadata edge, since all it’s doing is counting the number of times player1 has engaged in activity1.

Examples of the properties an activity vertex and player-activity edges could contain include:

[Image: example properties for the activity vertex and player-activity edges]

When you define an emergent attribute such as ea_altruism on an activity vertex, it acts as a weight that is multiplied by the edge count to determine the value of the emergent attribute for the player in question. For example, if player1 has an engagedIn edge with a count of 5 to the activity1 vertex, which has an ea_altruism value of 2, we would increment ea_altruism on the player1 vertex by 10. All emergent attributes defined on an activity follow this calculation. If an activity affects multiple emergent attributes, you can define each of them on the activity vertex.
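The calculation above is small enough to sketch directly. In this illustration, vertex properties are plain dictionaries, and the `apply_activity` helper and the convention of recognizing emergent attributes by an `ea_` prefix are assumptions; only the count-times-weight rule comes from the description above.

```python
def apply_activity(player_props, activity_props, engaged_count):
    """Add engaged_count * weight to each emergent attribute the activity defines."""
    for name, weight in activity_props.items():
        if name.startswith("ea_"):  # assumed naming convention
            player_props[name] = player_props.get(name, 0) + engaged_count * weight
    return player_props


player1 = {"ea_altruism": 0}
activity1 = {"ea_altruism": 2}

# engagedIn edge count of 5, ea_altruism weight of 2
apply_activity(player1, activity1, engaged_count=5)
print(player1["ea_altruism"])  # 10

# An activity may carry several weights; each one is applied the same way.
activity2 = {"ea_duty": 1, "ea_mischief": 3}
apply_activity(player1, activity2, engaged_count=2)
print(player1)  # {'ea_altruism': 10, 'ea_duty': 2, 'ea_mischief': 6}
```

Because the weights live on the activity data rather than in the function, changing how an activity is rated only requires updating that activity's properties.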

Defining the effects of a player taking part in an activity in the data model has several advantages:

  1. Easily changeable and recalculable values.
  2. Changes to model value can happen on the fly rather than waiting for a recompile of your code.

The campaign vertex

Campaign vertexes track the relationship between players and studio-sponsored events such as in-game events, email marketing campaigns, or other studio-to-customer interactions.

Examples of properties for the campaign vertex and the edges to and from it include:

[Image: example properties for the campaign vertex and its edges]

Monitoring and cataloging data

Data needs to be attached to the activities a player can engage in, both alone and with others, so that a cohort can be built and identified over time. The following example shows how to build a personality model and monitor the activity of community members with distinct profiles.

Example: detailed personality evaluation

The code example defines four personality traits. Each activity a player undertakes may correlate with one or more of them, building the player’s profile as they interact with other players in the game.

Sample behavioral profile

When you build your own profiles, ensure that there are concrete, definitive boundaries between them so that specific activities can be rated effectively.

  • Altruism: Engaging in positive behaviors on behalf of other players without any extrinsic reward.
  • Duty: Engaging in positive behaviors for oneself or on behalf of other players while gaining extrinsic rewards.
  • Mischief: Engaging in negative behaviors that only impact oneself or whose effects remain entirely within the game world (cheats, wall hacks, duping, etc.).
  • Malice: Engaging in negative behaviors whose impacts carry over into the real world (stalking, abuse, criminal activity).

It’s important to note that both player and activity vertexes are defined with these attributes – for instance, trying a disallowed name when creating a character might be rated as low-level mischief by your community team, so the system knows how to modify a player vertex when the event is reported. In this way, players can be more distinctively differentiated than by a simple plus/minus reputation rating, and predictive analysis can be more effective at identifying different kinds of cohorts.
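To see why a four-trait profile separates players better than a single reputation number, consider the sketch below. The `Profile` class, the `net_reputation` formula (positive traits minus negative traits), and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    altruism: float = 0
    duty: float = 0
    mischief: float = 0
    malice: float = 0

    def net_reputation(self):
        """Assumed plus/minus score: positive traits minus negative traits."""
        return self.altruism + self.duty - self.mischief - self.malice

    def dominant_trait(self):
        """Name of the trait with the highest value."""
        traits = vars(self)
        return max(traits, key=traits.get)


helper = Profile(altruism=3, mischief=1)
risky = Profile(duty=12, malice=10)

# Both players collapse to the same single score...
print(helper.net_reputation(), risky.net_reputation())
# ...but the full profiles are clearly different.
print(helper.dominant_trait(), helper.malice)
print(risky.dominant_trait(), risky.malice)
```

A plus/minus rating would treat these two players as equivalent, while the trait breakdown shows that only one of them has a history of malicious behavior.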

Best Practices

Customers should follow these best practices to maximize performance:

  • Use small, targeted queries – Graph queries are transactional in nature, so avoid large analytical queries.
  • Precompute or pre-aggregate data – Data should be precomputed or pre-aggregated before it is loaded into the graph.
  • Augment and enrich upstream data – The Cohort Modeler does not replace your existing analytics capabilities but augments and enriches raw data with relationship-based insights.
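As an illustration of the pre-aggregation practice, the sketch below collapses a batch of raw events into one count per player and activity, so that each pair needs only one small write to the graph. The event shape, the `aggregate_events` helper, and the printed "upsert" lines are hypothetical; the actual write would use whatever graph client your deployment provides.

```python
from collections import Counter


def aggregate_events(events):
    """Collapse raw events into one count per (player, activity) pair."""
    return Counter((event["player"], event["activity"]) for event in events)


raw_events = [
    {"player": "player1", "activity": "healOther"},
    {"player": "player1", "activity": "healOther"},
    {"player": "player2", "activity": "chat"},
    {"player": "player1", "activity": "healOther"},
    {"player": "player2", "activity": "healOther"},
]

counts = aggregate_events(raw_events)

# One small, targeted update per edge instead of one write per raw event.
for (player, activity), count in sorted(counts.items()):
    print(f"upsert edge {player} -> {activity}: count += {count}")
```

Here five raw events become three edge updates; with production event volumes the reduction is typically much larger.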

In this blog, we defined graph database primitive data types, showed how they are applied within the AWS for Games Cohort Modeler, and finished with some best practices for creating a model for your game.

When defining your own data model, keep a couple of questions in mind:

  • Does your game have activities that define positive or negative impacts on your game or game community?
  • What data are you already collecting that could be used to define a player behavioral profile? What data are you not collecting?

Visit the GitHub repository and modify it to fit your specific use case.