AWS Database Blog
Build a graph application with Amazon Neptune and AWS Amplify
More and more organizations are adopting graph databases for various use cases, such as legal entity lookup tools in the public sector, drug-drug interaction checkers in the healthcare sector, and customer insights and analytics tools in marketing.
If your application's data is rich in relationships and connections, modeling and querying it with a relational database is hard. Amazon Neptune, a fully managed graph database, is purpose-built to store and navigate relationships. You can use Neptune to build popular graph applications such as knowledge graphs, identity graphs, and fraud graphs. We also recently released AWS CloudFormation templates of a sample chatbot application that utilizes knowledge graphs.
AWS Amplify is a popular way for developers to build web applications, and you may want to combine the power of graph applications with the ease of building web applications. AWS AppSync makes it easy to develop GraphQL APIs, which provide an access layer to the data, and lets you build a flexible backend that uses AWS Lambda to connect to Neptune.
In this post, we show you how to connect an Amplify application to Neptune.
Solution overview
The architecture diagram of this solution is as follows.
This solution uses Amplify to host the application, and all access to Neptune is handled by Lambda functions that are invoked by AWS AppSync. The front end is created with React, and the backend is powered by Lambda functions built with Node.js. This application uses Apache TinkerPop Gremlin to access the graph data in Neptune.
The following video demonstrates the UI that you can create by following this tutorial.
In the navigation pane of the UI, you can navigate to the following services and features:
- Dashboard – By entering information such as person name, product name, or affiliated academic society, you can retrieve information related to the input.
- Amazon Neptune – This link redirects you to Neptune on the AWS Management Console.
- Amazon SageMaker – This link redirects you to Amazon SageMaker on the console.
- Add Vertex/Edge – You can add vertex or edge data to Neptune.
- Visualizing Graph – You can visualize the graph data in Neptune. Choose a vertex on the graph data to display its properties.
The following tables show the vertices and edges of the graph schema used for this solution.
Vertex | Description |
person | Doctor or MR |
paper | Paper authored by person |
product | Medicine used by person |
conference | Affiliated academic society |
institution | Hospital, university, or company |

Edge | Source | Target |
knows | person | person |
affiliated_with | person | institution |
authored_by | paper | person |
belong_to | person | conference |
made_by | product | institution |
usage | person | product |
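To make the edge rules in the schema concrete, the following is a minimal sketch that encodes them as a lookup table and checks whether a proposed edge connects the right kinds of vertices. The edge labels and their source and target labels come from the schema above; the object layout and the checkEdge helper are assumptions made for illustration and are not part of the sample application.

```javascript
// Edge rules copied from the schema above: edge label -> allowed source and target.
// The object layout and the checkEdge helper are illustrative assumptions.
const EDGE_RULES = {
  knows: { source: "person", target: "person" },
  affiliated_with: { source: "person", target: "institution" },
  authored_by: { source: "paper", target: "person" },
  belong_to: { source: "person", target: "conference" },
  made_by: { source: "product", target: "institution" },
  usage: { source: "person", target: "product" },
};

// Returns a list of problems; an empty list means the edge fits the schema.
function checkEdge(edgeLabel, sourceLabel, targetLabel) {
  if (!Object.prototype.hasOwnProperty.call(EDGE_RULES, edgeLabel)) {
    return [`unknown edge label: ${edgeLabel}`];
  }
  const rule = EDGE_RULES[edgeLabel];
  const problems = [];
  if (sourceLabel !== rule.source) {
    problems.push(`${edgeLabel} must start at ${rule.source}, not ${sourceLabel}`);
  }
  if (targetLabel !== rule.target) {
    problems.push(`${edgeLabel} must end at ${rule.target}, not ${targetLabel}`);
  }
  return problems;
}

console.log(checkEdge("authored_by", "paper", "person")); // fits the schema
console.log(checkEdge("made_by", "person", "institution")); // wrong source label
```

A check like this could run before an edge is written, because Neptune itself does not enforce a schema on labels.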
Prerequisites
June/2024: We’re currently updating this post/solution. Meanwhile, you can use this repo to deploy it using AWS CDK.
- An AWS account – You can create a new account if you don’t have one yet.
- AWS Region – This solution uses the us-east-1 Region.
- IAM role and Amazon S3 access permissions for bulk loading – This solution loads graph data into Neptune using the bulk loader. You need an AWS Identity and Access Management (IAM) role and an Amazon Simple Storage Service (Amazon S3) VPC endpoint. For more information, see Prerequisites: IAM Role and Amazon S3 Access. You attach this role to Neptune in a later step.
Create an AWS Cloud9 environment
We start by creating an AWS Cloud9 environment.
- On the AWS Cloud9 console, create an environment with the following parameters:
- Instance type – m5.large
- Network (VPC) – A VPC in us-east-1
- Subnet – A subnet in us-east-1a
- Next, copy and save the following script as resize.sh, which is used to modify the size of the Amazon Elastic Block Store (Amazon EBS) volume attached to AWS Cloud9. For more information about the script, see Moving an environment and resizing or encrypting Amazon EBS volumes.
- Run the following commands to change the volume size to 20 GB.
Create Neptune resources
You can create Neptune resources via the AWS Command Line Interface (AWS CLI).
- Run the following commands to create a new Neptune DB cluster and instances.
- Add the IAM role that you created in the prerequisites section to Neptune.
- Add the following rule to the inbound rules in the security group of all of the database instances in the Neptune cluster, so that you can bulk load the graph data from AWS Cloud9 into Neptune in the next step.
- Type – Custom TCP
- Port range – 8182
- Source – Custom, with the security group of the AWS Cloud9 instance
Create an S3 bucket and load data into Neptune
In this step, you create an S3 bucket and upload graph data into the bucket.
- Create an S3 bucket in us-east-1 and upload the following files. Because the solution uses Gremlin, the files are in the Gremlin load data format. The following code is the sample vertex data (vertex.csv). The following code is the sample edge data (edge.csv).
- In the AWS Cloud9 environment, enter the following command to bulk load the vertex data into Neptune.
- Enter the following command to verify the bulk loading was successful (LOAD_COMPLETED).
- Repeat this procedure by changing the source to the S3 URI that stores the edge.csv file to bulk load the edge data.
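As a rough illustration of the bulk load steps, the following sketch builds the JSON body that is sent in a POST request to the Neptune loader endpoint (https://your-neptune-endpoint:8182/loader) and reads the overall status from the response of a GET request to /loader/<loadId>. The request fields shown (source, format, iamRoleArn, region, failOnError, parallelism) are loader parameters. The bucket name, the account ID in the role ARN, and the helper function names are placeholders assumed for this example.

```javascript
// Build the request body for the Neptune bulk loader.
// "csv" is the format value for the Gremlin load data format.
function buildLoadRequest({ source, iamRoleArn, region = "us-east-1" }) {
  if (typeof source !== "string" || !source.startsWith("s3://")) {
    throw new Error("source must be an S3 URI such as s3://bucket/key");
  }
  return {
    source,
    format: "csv",
    iamRoleArn,
    region,
    failOnError: "FALSE",
    parallelism: "MEDIUM",
  };
}

// Read the overall status from a loader status response.
function loadStatus(response) {
  return response?.payload?.overallStatus?.status ?? "UNKNOWN";
}

function isLoadComplete(response) {
  return loadStatus(response) === "LOAD_COMPLETED";
}

// Placeholder values: replace the bucket name and account ID with your own.
const vertexLoad = buildLoadRequest({
  source: "s3://your-bucket-name/vertex.csv",
  iamRoleArn: "arn:aws:iam::123456789012:role/NeptuneLoadFromS3",
});

// Loading the edges only changes the source, as in the last step above.
const edgeLoad = { ...vertexLoad, source: "s3://your-bucket-name/edge.csv" };

console.log(JSON.stringify(vertexLoad, null, 2));
console.log(JSON.stringify(edgeLoad, null, 2));
```

The sketch only prepares and interprets the payloads. Sending them still requires network access to the cluster on port 8182 from inside the VPC, which is why the commands in this section run from AWS Cloud9.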
This completes loading the graph data into Neptune.
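For reference, the Gremlin load data format used by vertex.csv and edge.csv can be sketched as follows. Vertex files use the ~id and ~label system columns followed by typed property columns, and edge files use ~id, ~from, ~to, and ~label. The rows below are made-up examples and are not the sample data from this post. The name property and the helper functions are also assumptions for illustration.

```javascript
// Quote a field in the RFC 4180 style when it contains a comma, quote, or line break.
function csvField(value) {
  const text = String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(header, rows) {
  return [header, ...rows].map((row) => row.map(csvField).join(",")).join("\n");
}

// Vertex file: system columns ~id and ~label plus one typed property column.
function vertexCsv(vertices) {
  const header = ["~id", "~label", "name:String"];
  return toCsv(header, vertices.map((v) => [v.id, v.label, v.name]));
}

// Edge file: system columns ~id, ~from, ~to, and ~label.
function edgeCsv(edges) {
  const header = ["~id", "~from", "~to", "~label"];
  return toCsv(header, edges.map((e) => [e.id, e.from, e.to, e.label]));
}

// Hypothetical rows that follow the schema of this solution.
const sampleVertices = [
  { id: "p1", label: "person", name: "Doctor A" },
  { id: "i1", label: "institution", name: "Hospital B, Tokyo" },
];
const sampleEdges = [{ id: "e1", from: "p1", to: "i1", label: "affiliated_with" }];

console.log(vertexCsv(sampleVertices));
console.log(edgeCsv(sampleEdges));
```

The ~from and ~to values in the edge file refer to ~id values of vertices, which is why this post loads the vertex file first.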
Build a graph application using Amplify
In this step, you use Amplify to build an application that interacts with Neptune. After you configure Amplify with AWS Cloud9, you add components such as a user authentication mechanism and backend Lambda functions that communicate with Neptune and a GraphQL API.
Configure Amplify with AWS Cloud9
Now that you’ve created an AWS Cloud9 environment, let’s configure Amplify with AWS Cloud9.
- Open the environment in your web browser again and run the following commands. When running amplify configure, you are asked to create a new IAM user to allow Amplify to deploy your application on your behalf. In the final step of the configuration, save the current configuration using the profile name default. For more information about the configuration, see Configure the Amplify CLI.
- Run the following commands to create your React application template. The create-react-app command is used to quickly create a template of your React application. After creating the template, you should have a new directory named react-amplify-neptune-workshop. Move to the directory and set up Amplify.
- In the same directory (react-amplify-neptune-workshop), run amplify init to initialize your new Amplify application using the default profile you created earlier. Amplify automatically creates your application's backend using AWS CloudFormation. The process takes a few minutes to complete.
- In your AWS Cloud9 environment, open another terminal and run the following command.
- On the menu bar, under Preview, choose Preview Running Application. You should see a preview of your React application on the right half of your AWS Cloud9 environment. Going forward, you can preview the application right after modifying and saving the code.
Authentication
Next, run the following command to add a user authentication mechanism to your application. After running the command, Amplify uses AWS CloudFormation to create an Amazon Cognito user pool as the authentication backend. Answer the questions as in the following example to enable users to log in with their usernames and passwords.
Now that you have an authentication backend, let's create the authentication front end. Run the following command to download the front-end application code and replace the old one. You should see a login window as your application preview on the right half of your AWS Cloud9 environment.
Run the following command to replace the existing App.js file.
Note that this file compiles successfully only after you create the components directory in a later step.
Functions
In this step, you create Lambda functions using Amplify so that they can communicate with the Neptune database. Before creating Lambda functions, run the following commands to create a new Lambda layer and import the Gremlin library. For more information, see Create a Lambda Layer.
- Create your Lambda layer with the following code.
- Create a Lambda function named getGraphData.
- After you add the Lambda function, run the following commands to modify your CloudFormation template and Lambda function code. Amplify uses this template to deploy the settings and environment of the Lambda function, so you don't have to change these settings on the console if you change the template. You must provide values for the reader endpoint, security group ID, and subnet IDs of your Neptune instance.
- Run the following command to replace the index.js file of the Lambda function. The following is the Lambda code of getGraphData. This function is invoked by AWS AppSync and uses Gremlin to access Neptune and get graph information such as vertices and edges.
- Now that you're done setting up the Lambda function getGraphData, repeat steps 2 to 4 for the other Lambda functions (replace the function names in the code).
- getInfoNeptune
- queryNeptune
- registerNeptune (also replace your-neptune-reader-endpoint with your-neptune-writer-endpoint in the sed command in step 3)
- To complete setting up the Lambda functions, run the following command to download components.zip to your AWS Cloud9 environment.
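The difference between the reader and writer endpoints in the steps above can be sketched as follows. The three functions that only read the graph connect to the reader endpoint, while registerNeptune, which adds vertices and edges, connects to the writer endpoint. Neptune serves Gremlin over WebSocket at wss://host:8182/gremlin. The function names and endpoint placeholders come from this post; the helper functions endpointFor and gremlinUrl are assumptions for illustration and are not taken from the sample Lambda code.

```javascript
// Functions that write to the graph must use the writer (cluster) endpoint.
const WRITER_FUNCTIONS = new Set(["registerNeptune"]);

// Pick the endpoint for a Lambda function by name.
function endpointFor(functionName, endpoints) {
  return WRITER_FUNCTIONS.has(functionName) ? endpoints.writer : endpoints.reader;
}

// Build the Gremlin WebSocket URL for a Neptune endpoint.
function gremlinUrl(host, port = 8182) {
  return `wss://${host}:${port}/gremlin`;
}

// Placeholder host names: replace them with the endpoints of your cluster.
const endpoints = {
  reader: "your-neptune-reader-endpoint",
  writer: "your-neptune-writer-endpoint",
};

for (const name of ["getGraphData", "getInfoNeptune", "queryNeptune", "registerNeptune"]) {
  console.log(name, "->", gremlinUrl(endpointFor(name, endpoints)));
}
```

Sending read-only traversals to the reader endpoint keeps query load off the writer instance, which is one reason to separate the two.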
GraphQL API
So far, you’ve created a user pool for authentication and backend Lambda functions for querying the Neptune database. In this step, you add a GraphQL API to your Amplify application.
- Run the following command to create a GraphQL API template.
- After you create the template, overwrite the GraphQL schema as follows and compile it. With Query, you can get the search result and information associated with the search word, such as institution, product, and conference. With Mutation, you can register a vertex or an edge in Neptune.
- Upload the change using amplify push.
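To illustrate how a Query or Mutation field reaches Lambda code, the following is a minimal sketch that routes an AWS AppSync event to a resolver function. It assumes the schema attaches Lambda functions with the Amplify @function directive, in which case the event carries typeName, fieldName, and arguments. The field names searchByLabel and addVertex, the dispatch helper, and the in-memory array standing in for Neptune are all hypothetical and are not taken from the schema in this post.

```javascript
// Route an AppSync Lambda event to a resolver by type name and field name.
function dispatch(event, resolvers) {
  const { typeName, fieldName, arguments: args = {} } = event;
  const fields = Object.prototype.hasOwnProperty.call(resolvers, typeName)
    ? resolvers[typeName]
    : {};
  if (!Object.prototype.hasOwnProperty.call(fields, fieldName)) {
    throw new Error(`No resolver for ${typeName}.${fieldName}`);
  }
  return fields[fieldName](args);
}

// Hypothetical resolvers backed by an in-memory stand-in for the graph.
const storedVertices = [];
const resolvers = {
  Query: {
    searchByLabel: ({ label }) => storedVertices.filter((v) => v.label === label),
  },
  Mutation: {
    addVertex: ({ label, name }) => {
      const vertex = { id: `v${storedVertices.length + 1}`, label, name };
      storedVertices.push(vertex);
      return vertex;
    },
  },
};

dispatch(
  { typeName: "Mutation", fieldName: "addVertex", arguments: { label: "person", name: "Doctor A" } },
  resolvers
);
console.log(
  dispatch({ typeName: "Query", fieldName: "searchByLabel", arguments: { label: "person" } }, resolvers)
);
```

In this post each field is served by its own Lambda function, so a single function would only need the branch for its own field. The router form is shown here to make the event shape visible.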
Deploy the application
Run the following command to host the application. Choose Hosting with Amplify Console and then Manual deployment.
Finally, publish the application:
After you publish the application, you can open it and try the features described in the solution overview.
Clean up
To avoid incurring future charges, clean up the resources you made as part of this post.
- Run the following command to delete the resources created with Amplify.
- Run the following commands to delete the Neptune cluster and instances.
- On the Amazon S3 console, select the bucket that stores vertex.csv and edge.csv.
- Choose Empty and then choose Delete.
- On the IAM console, choose the role used for bulk loading (NeptuneLoadFromS3) and choose Delete Role.
- On the Amazon VPC console, choose the VPC endpoint you created and choose Delete endpoint.
- On the AWS Cloud9 console, choose the environment you created and choose Delete.
Summary
In this post, we walked you through how to use Amplify to develop an application that interacts with the graph data in Neptune. We created a Neptune database instance, and then added an authentication mechanism, backend functions, and an API to the application.
To start developing your own application using Neptune and Amplify, see the user guides for Neptune and Amplify.
About the authors
Chiaki Ishio is a Solutions Architect in Japan’s Process Manufacturing and Healthcare Life Sciences team. She is passionate about helping her customers design and build their systems in AWS. Outside of work, she enjoys playing the piano.
Hidenori Koizumi is a Prototyping Solutions Architect in Japan’s Public Sector. He is an expert in developing solutions in the research field based on his scientific background (biology, chemistry, and more). He has recently been developing applications with AWS Amplify or AWS CDK. He likes traveling and photography.