Announcing Amazon Neptune ML: Easy, fast, and accurate predictions on graphs
We’re thrilled to announce the availability of Amazon Neptune ML, an easy, fast, and accurate approach for predictions on graphs. Neptune ML is a new capability that uses graph neural networks (GNNs), a machine learning (ML) technique purpose-built for graphs. According to research from Stanford University, GNNs can improve the accuracy of most predictions on graphs by over 50% compared to non-graph methods.
Customers want to gain insights from the relationships in connected datasets for applications such as knowledge graphs, product recommendations, credit checks for loan approvals, or identifying fraudulent users. Developers use graph databases to store connected data and query the graph for insights and patterns. However, for large graphs with billions of relationships, it’s hard to discover these insights using queries based only on human intuition. For this reason, you can use ML on graphs to automatically reveal new insights and make predictions.
Existing ML techniques, such as XGBoost, can’t operate effectively on graphs because they’re designed for tabular data. As a result, using these methods on graphs can take time, require specialized skills from developers, and produce suboptimal predictions. Graph application developers also often lack ML expertise, which can take weeks to acquire. In addition, using ML on graphs requires you to do a lot of upfront heavy lifting to prepare the graph data for ML.
Graph neural networks are an ML approach that is specifically designed to operate on graph data. GNNs use deep neural networks to automatically combine information about a graph’s structure and its features to build ML models that produce accurate predictions. This enables GNNs to achieve state-of-the-art performance for problems such as link prediction, fraud detection, knowledge-graph completion, and product recommendations.
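To make the idea of combining a graph’s structure with its features concrete, here is a minimal sketch of one round of neighbor aggregation, the basic building block of many GNNs. It is a toy illustration, not how Neptune ML or DGL implements GNNs: the graph, the feature vectors, and the `self_weight` mixing parameter are all made-up assumptions, and a real GNN layer would use learned weight matrices and a nonlinearity instead of a fixed average.

```python
# Toy illustration only: one round of mean-neighbor message passing.
# The graph, features, and mixing weight below are hypothetical.

def message_passing_round(adjacency, features, self_weight=0.5):
    """Blend each node's feature vector with the mean of its neighbors' vectors."""
    updated = {}
    for node, own in features.items():
        neighbors = adjacency.get(node, [])
        if not neighbors:
            # Isolated node: nothing to aggregate, keep its own features.
            updated[node] = list(own)
            continue
        dim = len(own)
        # Structure enters here: only adjacent nodes contribute.
        neighbor_mean = [
            sum(features[n][i] for n in neighbors) / len(neighbors)
            for i in range(dim)
        ]
        # Features enter here: mix the node's own vector with the neighborhood.
        updated[node] = [
            self_weight * own[i] + (1.0 - self_weight) * neighbor_mean[i]
            for i in range(dim)
        ]
    return updated


# Hypothetical undirected graph: user -- card -- device
adjacency = {
    "user": ["card"],
    "card": ["user", "device"],
    "device": ["card"],
}
features = {
    "user": [1.0, 0.0],
    "card": [0.0, 1.0],
    "device": [0.0, 0.0],
}

hidden = message_passing_round(adjacency, features)
print(hidden["card"])  # [0.25, 0.5]
```

After one round, each node’s vector reflects its immediate neighborhood. Stacking several rounds lets information flow across multiple hops, which is what allows a model to pick up patterns that span several relationships.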
Neptune ML automates the heavy lifting of selecting and training the best ML model for graph data, and lets you access ML on your graph using Neptune APIs and queries. Neptune ML uses the Deep Graph Library (DGL), an open-source library to which AWS contributes that makes it easy to develop and apply GNN models on graph data. As a result, you can now create, train, and apply ML on Neptune data in hours instead of weeks without the need to learn new tools and ML technologies. Now, any developer with data in Neptune can easily use ML on their graphs.
Use cases for Amazon Neptune ML
You can use Neptune ML on any graph. The following are popular use cases for Neptune ML:
- Knowledge graphs – Knowledge graphs consolidate and integrate an organization’s information assets and make them more readily available to all members of the organization. Neptune ML can infer missing links across data sources and identify similar entities to enable better knowledge discovery.
- Product recommendation – Neptune ML can determine which relations are more important for predicting a customer’s purchasing behavior and recommend the list of products a customer would be interested in buying.
- Fraud detection – Companies lose millions (even billions) of dollars to fraud, and want to detect fraudulent users, accounts, devices, IP addresses, or credit cards to minimize their losses. You can use a graph-based representation to capture the interactions among entities (users, devices, or credit cards) and detect patterns, such as a user initiating multiple mini transactions or using fraudulent accounts.
- Customer retention and acquisition – You can use Neptune ML to classify customers in the acquisition funnel. You can also use it to recommend product discounts to offer customers to improve retention.
Neptune ML capabilities
Neptune ML supports node property prediction and link prediction capabilities for graph applications.
With property prediction, you can use Neptune ML to predict the value of a property in a graph based on property values of other vertices in the graph. Neptune ML supports two kinds of property prediction:
- Node classification – In node classification, the predicted property takes a value from a finite set. A good example is determining if a card or device used in a banking application is fraudulent (see the following figure). Another example is predicting if a customer will be interested in purchasing a sedan, SUV, or truck.
- Node property regression – In node property regression, the predicted property is numerical. An example here is predicting the rating value for a movie. Another example is predicting the credit score of a user.
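The difference between the two kinds of property prediction comes down to the type of the output: a label from a finite set versus a number. The following sketch shows that contrast with a deliberately simple neighbor-based baseline (majority vote and mean). It is not a GNN and not the Neptune ML method; the function names and sample data are hypothetical.

```python
from collections import Counter

# Toy baseline only: predicts a node's missing property from its neighbors'
# known values. Function names and data are hypothetical.

def predict_class(neighbor_labels):
    """Classification: return the most common label among neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]


def predict_value(neighbor_values):
    """Regression: return the mean of the neighbors' numeric values."""
    return sum(neighbor_values) / len(neighbor_values)


# Vehicle interest of customers connected to the target customer.
vehicle_interest = ["SUV", "sedan", "SUV"]
# Ratings of movies connected to the target movie.
movie_ratings = [4.0, 3.0, 5.0]

print(predict_class(vehicle_interest))  # SUV
print(predict_value(movie_ratings))     # 4.0
```

A GNN-based model replaces these fixed rules with learned functions of the neighborhood, but the output types stay the same: a class label for node classification and a numeric value for node property regression.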
For link prediction, you can use Neptune ML to predict missing relationships by ranking the most likely source or target nodes. Neptune ML supports two kinds of link prediction:
- Target node prediction – Given a source node and an edge label, Neptune ML ranks the top K target nodes. An example is a video streaming app that ranks the list of movies to watch or purchase for a specific user.
- Source node prediction – Given a target node and an edge label, Neptune ML ranks the top K source nodes. For example, you can find the set of users who might be interested in watching a newly added movie.
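Ranking the top K nodes can be sketched as scoring every candidate against the known node and edge label, then sorting. The example below uses a DistMult-style score (an element-wise product of source, relation, and target vectors, summed), which is a common scoring function for knowledge graphs; this post does not say which scoring function Neptune ML uses, so treat that choice as an assumption. The embeddings, names, and edge label are made up for illustration; in practice they would come from a trained model.

```python
# Toy illustration only: top-K target node ranking with a DistMult-style score.
# Embeddings and names are hypothetical, not produced by a trained model.

def distmult_score(source_vec, relation_vec, target_vec):
    """Sum of element-wise products of the three vectors."""
    return sum(s * r * t for s, r, t in zip(source_vec, relation_vec, target_vec))


def top_k_targets(source, relation, candidates, embeddings, relations, k):
    """Rank candidate target nodes for (source, relation), best first."""
    scored = [
        (distmult_score(embeddings[source], relations[relation], embeddings[c]), c)
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:k]]


embeddings = {
    "alice": [1.0, 0.0],
    "movie_a": [2.0, 0.0],
    "movie_b": [0.0, 3.0],
    "movie_c": [1.0, 1.0],
}
relations = {"watched": [1.0, 1.0]}

ranked = top_k_targets(
    "alice", "watched", ["movie_a", "movie_b", "movie_c"],
    embeddings, relations, k=2,
)
print(ranked)  # ['movie_a', 'movie_c']
```

Source node prediction follows the same pattern with the roles reversed: hold the target node and edge label fixed, score every candidate source node, and keep the top K.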
The following table summarizes Neptune ML capabilities:
|Graph Task||Amazon Neptune ML Capability||Graph Application Examples|
|Property prediction||Node classification||Fraud detection; identify movie genre; user preference to buy an SUV, sedan, or truck|
|Property prediction||Node property regression||Predict a movie rating value|
|Link prediction||Source node prediction||Identify users to notify for a new movie|
|Link prediction||Target node prediction||Purchase recommendation for products or games; social graph recommendation|
Neptune ML is available to all customers using Neptune engine version 188.8.131.52 or later. We’re excited about the launch and look forward to hearing how you’re building graph applications using Neptune ML. To learn more about this feature, see the Neptune ML documentation.
About the Authors
George Karypis is a Senior Principal Scientist, AWS Deep Learning. He leads a research group that develops graph neural network-based models for graph learning tasks, with applications to knowledge graphs, recommender systems, fraud & abuse, and life sciences.
Karthik Bharathy leads product for Amazon Neptune.