What’s the Difference Between a Graph Database and a Relational Database?


What’s the difference between a graph database and a relational database?

Both graph databases and relational databases store related data items with relationships, however, they represent the data relationships very differently. Relational databases store data in a tabular format with rows and columns. All data is also stored in tables, and relationships between data are stored as represented references back to the original table (a.k.a foreign keys). At run time, a relational database uses JOIN statements to explicitly resolve these references. While most relational databases can do this efficiently at certain scales these operations become inefficient when a large or unknown number of these references need to be processed, such as when you want to find related through an unknown number of connections, such as finding out how two people are related in a social network.

In contrast, a graph database stores data as a network of entities and relationships. Graph databases explicitly store both the entity and relationship data instead of storing data as references. At run time a graph database leverages mathematical graph theory to efficiently perform operations on entities and relationships. Since the relationships between entities are explicitly stored instead of calculated graph databases are more efficient at querying and memory management for use cases with complex data interconnections, which can improve application performance significantly.


Read about relational databases »
Read about graph databases
 

Data model: graph database vs. relational database

Both graph and relational databases store information and represent relationships between data. However, the relational model prioritizes data entities while the graph model prioritizes relationships between the entities.

Relational database model

Relational database use data tables that organize information into rows and columns. Columns hold specific attributes of the data entity, while rows represent the individual data records. 

The fixed schema of relational databases requires that you outline relationships between tables upfront with primary and foreign keys. 

Example

Consider a social media application with customer profiles that can be friends with each other. A typical model would need two tables to model the data.
The customer table could look like this:


ID

Name

Location

C1

Alejandro

USA

C2

Ana

USA

C3

Kwaku

USA

C4

Pat

USA

The friends table could look like this:


Customer ID

Friend ID

C1

C2

C1

C3

C2

C4

C2

C1

C3

C1

C3

C4

At query time, If you wanted to answer a question such as “What are the name(s) of Alejandro’s friends?” the database engine would first find the row in the Customer table for Alejandro.
 


ID

Name

Location

C1

Alejandro

USA

Next, the engine would create a union of all the rows in the friends table for Alejandro using his ID
 


ID

Name

Location

Customer ID

Friend ID

C1

Alejandro

USA

C1

C2

C1

Alejandro

USA

C1

C3

Now, for each row the engine would create a union back to the Customer table for each Friend ID
 


ID

Name

Location

Customer ID

Friend ID

ID

Name

Location

C1

Alejandro

USA

C1

C2

C2

Ana

USA

C1

Alejandro

USA

C1

C3

C3

Kwaku

USA

Finally, the engine returns the names of his friends.
 


Name

Ana

Kwaku

As we can see, as we use the connections in our relational data we end up building out a large data structure to represent the information we are looking to retrieve. While relational databases have optimizations in them to minimize the impact of these structures, as the number of joins get larger the amount of data that is required increases significantly, reducing performance and increasing memory usage.

Graph database model

On the other hand, a graph database uses a graph structure with attributes, relationships, and objects to represent data. Nodes are objects, edges demonstrate the relationship between those nodes, and properties describe the attributes of the nodes and edges. This dynamic structure makes a graph database useful for connected data representation. It offers more flexibility regarding relationships and data types.

Example

Using the same example social network data as above, our graph database would store the data using 3 nodes, each with 4 properties, and 2 edges.

Now, let’s take a look at how a graph database processes the query “What are the name(s) of Alejandro’s friends?”.

First, we look for our Customer node representing Alejandro (highlighted below).

Next, we will traverse, or move across our friends edges. Traversing in a graph database is similar to performing a JOIN in a relational database, except, unless explicitly requested, information from earlier in the query is not retained. In our example below, only the two friends edges are retained in memory.

Third, we continue our traversal to the adjacent nodes.

Finally, the engine returns the names of his friends.
 


Name

Ana

Kwaku

As we can see, both engines are capable of returning the same information, however when traversing many connections the explicit storage of relationships in a graph database allows it to more efficiently processes this request. While this advantage is not significant for simple queries, such as the one shown here, this optimization, along with the structure of graph query languages, can significantly reduce the complexity and memory usage for processing questions that require many, or an unknown number, of these relationship traversals.

Key differences: graph database vs. relational database

Beyond their differing data models, relational and graph databases have many differences that set them apart in function and utility.

Querying

Graph databases use custom query languages that are optimized to find and retrieve connected data rapidly. These languages, such as TinkerPop Gremlin, openCypher, and SPARQL are purpose built simplify writing queries that leverage complex data interconnections such as those required for operations such as recursive data access, path finding, and graph algorithms.

In contrast, relational databases employ SQL to retrieve and manipulate data. With SQL, users can perform various types of queries—such as SELECT, INSERT, UPDATE, and DELETE—on tables. Relational databases excel in handling structured data with well-defined relationships between tables. They're particularly effective for performing complex filtering, aggregations, and joins across multiple tables.

Performance

Graph databases store both objects and relationships as data and use indexes to efficiently traverse between related entities. Since graph databases store relationships as data, the database can rapidly navigate between entities without needing to dynamically calculate these connections The direct connection between nodes allows for immediate access, so you can rapidly query and trace relationships. These features make graph databases very efficient.

Alternatively, relational databases use index lookups and dynamically calculated joins to identify relationships between entities. You can join multiple tables, but it’s time-consuming as the system has to scan larger indices over more data. Due to this, a relational database doesn’t offer the same performance as a graph database for use cases where large numbers of connections are required to retrieve the required data.

Ease of use

Graph databases are relationship-centric, which makes them easy to work with when you’re using connected data. These databases excel at multi-hop queries, where you traverse paths with multiple relationships. You can also use graph query languages like SPARQL, Gremlin, or openCypher to express queries that explore interconnected data with a simple, graph specific syntax.

Relational databases use SQL, which can feel unnatural when you manage multi-hop queries. If a query has multiple joins and spans over nested subqueries, the SQL becomes challenging to write. If you aren’t careful, this can easily translate into bulky queries that are hard to read and maintain.

That said, relational databases are mature and popular in various use cases. There are several tools and resources as well as community support that you can access to optimize your system.

When to use: graph database vs. relational database

Graph and relational databases have many effective use cases. As they have different data models and several core distinctions, they excel in different areas.

Graph database

Graph databases provide a flexible schema that allows for dynamic changes and adaptations to data. The focus on data relationships makes it useful in analytics, semantic searches, or recommendation engines. A graph database is the better choice in these scenarios:

  • You’re working with data that has complex relationships, like in social networks, fraud detection, knowledge graphs, security graphs, or personalized recommendation engines
  • You need an evolving schema, as you can modify edges, nodes, and properties without disturbing the rest of the database structure
  • You are working with interconnected data and need to conduct multiple, or an unknown number of, hops between relationships (friend-of-friend type queries)

Graph databases are flexible, scalable, dynamic, and excellent at showing relationships between data.

Relational database

Relational databases offer a structured schema with great support for data integrity. A relational database is the better choice in these scenarios:

  • You need ACID compliance and high levels of data integrity and consistency, like in financial transactions
  • You’re working with highly structured data that fits well into the tabular data model, like in enterprise resource management
  • Your data has limited relationships

Summary of differences: relational database vs. graph database

 
Relational Databases

Graph Databases

Model

Tabular with rows and columns.

Interconnected nodes with data represented as nodes and edges

Operations

SQL operations like create, read, update, and delete (CRUD).

Operations include CRUD and graph traversal operations

Performance

Relational databases face complex queries when traversing relationships that can slow down performance.

A graph database excels in representing and querying relationships between connected data.

Ease of Use

Relational databases work well with large datasets and structured data. They struggle when it comes to multi-hop queries.

A graph database is easy to use when dealing with relationship-centric data. Using a graph query language, you can rapidly query multiple hop data.
     

How can AWS help with your relational and graph database requirements?

Amazon Web Services (AWS) has solutions for both relational and graph database use cases.

 

Relational Databases

Amazon Relational Database Service (Amazon RDS) is a managed service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity, while managing time-consuming database administration tasks. Amazon RDS supports several database engines, like these:

Amazon Aurora is a modern relational database service offering performance and high availability at scale, fully open-source MySQL- and PostgreSQL-compatible editions. Aurora is also a fully managed service that automates time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups while providing the security, availability, and reliability of commercial databases at one-tenth of the cost.

Graph Databases

Amazon Neptune is a purpose-built, high-performance graph database engine. It’s optimized to store billions of relationships and query the graph with milliseconds of latency.
Neptune supports the popular graph models—property graph and W3C's Resource Description Framework (RDF). It also supports query languages like Gremlin and SPARQL, so you can build queries that navigate highly connected datasets.
Neptune offers several features:

  • It’s highly available with read replicas, point-in-time recovery, continuous backup, and replication across Availability Zones.
  • It’s secure with support for encryption at rest.
  • It’s fully managed. So, you no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.

Get started with graph and relational databases on AWS by creating an account today.