With Amazon Neptune, you can create sophisticated, interactive graph applications that can query billions of relationships in milliseconds. SQL queries for highly connected data are complex and hard to tune for performance. Instead, Amazon Neptune allows you to use the popular graph query languages Apache TinkerPop Gremlin, W3C’s SPARQL, and openCypher to execute powerful queries that are easy to write and perform well on connected data. This significantly reduces code complexity and allows you to create applications that process relationships more quickly.
Amazon Neptune is designed to offer greater than 99.9% availability, increasing database performance and availability by tightly integrating the database engine with an SSD-backed virtualized storage layer purpose-built for database workloads. Neptune's storage is fault-tolerant and self-healing, and disk failures are repaired in the background without loss of database availability. Neptune is designed to automatically detect database crashes and restart without the need for crash recovery or to rebuild the database cache. If the entire instance fails, Neptune will automatically fail over to one of up to 15 read replicas.
You can quickly launch an Amazon Neptune database instance with a few clicks in the Neptune Management Console. Neptune scales storage automatically, growing storage and rebalancing I/Os to provide consistent performance without the need for over-provisioning.
High Performance and Scalability
High Throughput, Low Latency for Graph Queries
Amazon Neptune is a purpose-built, high-performance graph database engine. Neptune efficiently stores and navigates graph data, and uses a scale-up, in-memory optimized architecture to allow for fast query evaluation over large graphs. With Neptune, you can use Gremlin, SPARQL, or openCypher to execute powerful queries that are easy to write and perform well.
Easy Scaling of Database Compute Resources
With a few clicks in the AWS Management Console, you can scale the compute and memory resources powering your production cluster up or down by creating new replica instances of the desired size, or by removing instances. Compute scaling operations typically complete in a few minutes.
Storage that Automatically Scales
Amazon Neptune will automatically grow the size of your database volume as your database storage needs grow. Your volume will grow in increments of 10 GB up to a maximum of 64 TB. You don't need to provision excess storage for your database to handle future growth.
Low Latency Read Replicas
Increase read throughput to support high-volume application requests by creating up to 15 database read replicas. Amazon Neptune replicas share the same underlying storage as the source instance, lowering costs and avoiding the need to perform writes at the replica nodes. This frees up more processing power to serve read requests and reduces replica lag, often to single-digit milliseconds. Neptune also provides a single endpoint for read queries so the application can connect without having to keep track of replicas as they are added and removed.
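As a rough illustration of how an application might use the single read endpoint, the sketch below sends read-only traffic to a reader endpoint and everything else to the writer endpoint. The class name, the hostnames, and the default path are hypothetical and for illustration only. Port 8182 is assumed to be the port your cluster listens on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeptuneEndpoints:
    """Hypothetical holder for a cluster's writer and reader endpoints."""

    writer: str       # endpoint that accepts writes (assumed hostname)
    reader: str       # single read endpoint in front of the replicas (assumed hostname)
    port: int = 8182  # assumed port; check your cluster configuration

    def url_for(self, read_only: bool, path: str = "sparql") -> str:
        """Pick the reader endpoint for read-only work, the writer otherwise."""
        host = self.reader if read_only else self.writer
        return f"https://{host}:{self.port}/{path}"


endpoints = NeptuneEndpoints(
    writer="writer.example.internal",  # placeholder, not a real endpoint
    reader="reader.example.internal",  # placeholder, not a real endpoint
)
print(endpoints.url_for(read_only=True))
print(endpoints.url_for(read_only=False, path="gremlin"))
```

Because the application only ever sees the reader hostname, replicas can be added or removed without a configuration change on the client side.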
High Availability and Durability
Instance Monitoring and Repair
The health of your Amazon Neptune database and its underlying EC2 instance is continuously monitored. If the instance powering your database fails, the database and associated processes are automatically restarted. Neptune recovery does not require the potentially lengthy replay of database redo logs, so your instance restart times are typically 30 seconds or less. It also isolates the database buffer cache from database processes, allowing the cache to survive a database restart.
Multi-AZ Deployments with Read Replicas
On instance failure, Amazon Neptune automates failover to one of up to 15 Neptune replicas you have created in any of three Availability Zones. If no Neptune replicas have been provisioned, in the case of a failure, Neptune will attempt to create a new database instance for you automatically.
Fault-tolerant and Self-healing Storage
Each 10 GB chunk of your database volume is replicated six ways, across three Availability Zones. Amazon Neptune uses fault-tolerant storage that transparently handles the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Neptune’s storage is also self-healing; data blocks and disks are continuously scanned for errors and replaced automatically.
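The tolerance figures above can be written down as a small check. This is a minimal sketch of the stated limits only, not of Neptune's storage implementation; the function and constant names are made up for illustration.

```python
COPIES = 6                # each chunk is replicated six ways
MAX_LOST_FOR_WRITES = 2   # writes survive the loss of up to two copies
MAX_LOST_FOR_READS = 3    # reads survive the loss of up to three copies


def availability(lost_copies: int) -> dict:
    """Report whether reads and writes remain available for one chunk."""
    if not 0 <= lost_copies <= COPIES:
        raise ValueError(f"lost_copies must be between 0 and {COPIES}")
    return {
        "writes": lost_copies <= MAX_LOST_FOR_WRITES,
        "reads": lost_copies <= MAX_LOST_FOR_READS,
    }


for lost in range(5):
    print(lost, availability(lost))
```

If the six copies are spread evenly, two per Availability Zone (an assumption, since the text only says six copies across three zones), then losing a whole zone removes two copies and leaves both reads and writes available.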
Automatic, Continuous, Incremental Backups and Point-in-time Restore
Amazon Neptune's backup capability enables point-in-time recovery for your instance. This allows you to restore your database to any second during your retention period, up until the last five minutes. Your automatic backup retention period can be configured up to thirty-five days. Automated backups are stored in Amazon S3, which is designed for 99.999999999% durability. Neptune backups are automatic, incremental, and continuous and have no impact on database performance.
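To make the restore window concrete, the sketch below computes the approximate range of timestamps you could restore to, given a retention period. It assumes the window runs from "now minus the retention period" to "now minus five minutes"; the function name and the lower bound of one day are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                 # upper limit stated above
RESTORE_LAG = timedelta(minutes=5)      # latest restorable point trails "now"


def restorable_window(now: datetime, retention_days: int):
    """Return (earliest, latest) restorable timestamps, approximately."""
    # The minimum of 1 day is an assumption made for this sketch.
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    earliest = now - timedelta(days=retention_days)
    latest = now - RESTORE_LAG
    return earliest, latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
earliest, latest = restorable_window(now, retention_days=7)
print(earliest.isoformat(), "to", latest.isoformat())
```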
Database Snapshots are user-initiated backups of your instance stored in Amazon S3 that will be kept until you explicitly delete them. They leverage the automated incremental snapshots to reduce the time and storage required. You can create a new instance from a Database Snapshot whenever you desire.
Amazon Neptune Global Database is designed for globally distributed applications, allowing a single Neptune database to span multiple AWS Regions. It replicates the graph data with little impact to database performance, enables fast local reads with low latency in each Region, and provides disaster recovery in case of region-wide outages.
Open Graph APIs
Supports Property Graph Apache TinkerPop Gremlin
Property Graphs are popular because they are familiar to developers who are used to relational models. The Gremlin traversal language provides a way to quickly traverse Property Graphs. Amazon Neptune supports the Property Graph model using the open source Apache TinkerPop Gremlin traversal language and provides a Gremlin WebSocket server that supports TinkerPop version 3.3. With Neptune, you can quickly build fast Gremlin traversals over property graphs. Existing Gremlin applications can easily use Neptune by changing the Gremlin service configuration to point to a Neptune instance.
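The configuration change described above usually comes down to the connection URL. The sketch below builds a WebSocket URL for a local Gremlin Server and for a Neptune instance, and shows a sample traversal as a plain string. The hostname, the `person` and `knows` labels, and the `name` property are hypothetical. The `/gremlin` path and port 8182 are assumed defaults, so check them against your own setup.

```python
def gremlin_endpoint(host: str, port: int = 8182, use_tls: bool = True) -> str:
    """Build a Gremlin WebSocket URL (path and port are assumed defaults)."""
    scheme = "wss" if use_tls else "ws"
    return f"{scheme}://{host}:{port}/gremlin"


# Before: an application pointing at a local Gremlin Server.
LOCAL_URL = gremlin_endpoint("localhost", use_tls=False)

# After: the same application pointing at a Neptune instance (placeholder host).
NEPTUNE_URL = gremlin_endpoint("my-neptune.example.internal")

# A sample two-hop traversal; labels and property names are illustrative.
FRIENDS_OF_FRIENDS = (
    "g.V().has('person', 'name', 'alice')"
    ".out('knows').out('knows').dedup().values('name')"
)

print(LOCAL_URL)
print(NEPTUNE_URL)
print(FRIENDS_OF_FRIENDS)
```

A Gremlin driver would take the URL as its connection target; the traversal itself does not need to change.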
Supports W3C’s Resource Description Framework (RDF) 1.1 and SPARQL 1.1
RDF is popular because it provides flexibility for modeling complex information domains. There are a number of existing free or public datasets available in RDF including Wikidata and PubChem, a database of chemical molecules. Amazon Neptune supports the W3C’s Semantic Web standards of RDF 1.1 and SPARQL 1.1 (Query and Update), and provides an HTTP REST endpoint that implements the SPARQL Protocol 1.1. With Neptune, you can easily use the SPARQL endpoint for both existing and new graph applications.
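As an illustration of the SPARQL Protocol over HTTP, the sketch below prepares (but does not send) a URL-encoded POST request carrying a query. The hostname is a placeholder, and the `/sparql` path and port 8182 are assumptions about the endpoint layout. The `query` form parameter is the one defined by the SPARQL 1.1 Protocol for queries.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(host: str, query: str, port: int = 8182) -> Request:
    """Prepare a SPARQL query as a URL-encoded POST (not sent here)."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        url=f"https://{host}:{port}/sparql",  # assumed endpoint path
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


request = build_sparql_request(
    "my-neptune.example.internal",  # placeholder host
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` from a host that can reach the database would execute the query. A SPARQL Update would use the `update` form parameter instead of `query`.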
Supports openCypher v9
Neptune supports building graph applications using openCypher, currently one of the most popular query languages for developers working with graph databases. Developers, business analysts, and data scientists like openCypher’s SQL-inspired syntax because it provides a familiar structure to compose queries for graph applications. The openCypher and Gremlin query languages can be used together over the same property graph data. Neptune's openCypher support is compatible with the Bolt protocol, so applications that use Bolt can connect to Neptune and continue to run.
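To show what using both languages over the same property graph data can look like, the sketch below writes one question in Gremlin and in openCypher, and builds the two connection targets for openCypher. The labels, property names, and hostname are hypothetical. The `/openCypher` HTTPS path, the `bolt://` URI form, and port 8182 are assumptions to verify against your cluster.

```python
QUESTION = "names of the people Alice knows"

# The same question in both languages; labels and properties are illustrative.
QUERIES = {
    "gremlin": "g.V().has('person', 'name', 'alice').out('knows').values('name')",
    "opencypher": "MATCH (a:person {name: 'alice'})-[:knows]->(b) RETURN b.name",
}


def opencypher_endpoints(host: str, port: int = 8182) -> dict:
    """Return assumed HTTPS and Bolt connection targets for openCypher."""
    return {
        "https": f"https://{host}:{port}/openCypher",
        "bolt": f"bolt://{host}:{port}",
    }


print(QUESTION)
for language, text in QUERIES.items():
    print(f"{language}: {text}")
print(opencypher_endpoints("my-neptune.example.internal"))
```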
Amazon Neptune ML is a new capability of Neptune powered by Amazon SageMaker that uses Graph Neural Networks (GNNs), a machine learning technique purpose-built for graphs, to make easy, fast, and more accurate predictions using graph data. With Neptune ML, you can improve the accuracy of most predictions for graphs by over 50% when compared to making predictions using non-graph methods.
Making accurate predictions on graphs with billions of relationships can be difficult and time consuming. Existing ML approaches such as XGBoost can’t operate effectively on graphs because they are designed for tabular data. As a result, using these methods on graphs can take time, require specialized skills from developers, and produce sub-optimal predictions.
Amazon Neptune runs in Amazon VPC, which allows you to isolate your database in your own virtual network, and connect to your on-premises IT infrastructure using industry-standard encrypted IPsec VPNs. In addition, using Neptune’s VPC configuration, you can configure firewall settings and control network access to your database instances.
Amazon Neptune is integrated with AWS Identity and Access Management (IAM) and provides you the ability to control the actions that your AWS IAM users and groups can take on specific Neptune resources including Database Instances, Database Snapshots, Database Parameter Groups, Database Event Subscriptions, and Database Options Groups. In addition, you can tag your Neptune resources, and control the actions that your IAM users and groups can take on groups of resources that have the same tag (and tag value). For example, you can configure your IAM rules to ensure developers are able to modify "Development" database instances, but only Database Administrators can modify and delete "Production" database instances.
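The "Development" versus "Production" example might be expressed as a tag-based policy along the following lines. This is a sketch under assumptions: it assumes that instance management actions use the `rds:` action prefix, that the `aws:ResourceTag` condition key is honored for these resources, and that instances carry a tag named `Environment`. Verify each of these against the IAM documentation before relying on it.

```python
import json


def developer_policy(tag_key: str = "Environment",
                     tag_value: str = "Development") -> dict:
    """Allow modifying only instances tagged for development (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ModifyDevelopmentInstancesOnly",
                "Effect": "Allow",
                # Assumed action name for instance modification.
                "Action": ["rds:ModifyDBInstance"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {f"aws:ResourceTag/{tag_key}": tag_value}
                },
            }
        ],
    }


print(json.dumps(developer_policy(), indent=2))
```

Because the policy grants nothing for instances tagged "Production", developers holding only this policy cannot modify or delete them; a separate policy for Database Administrators would grant those actions.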
Fine-grained access control
Amazon Neptune provides fine-grained access control for users of the Neptune data plane APIs through AWS Identity and Access Management (IAM). You can control graph-data actions, such as reading, writing, and deleting data from the graph, and non-graph-data actions, such as starting and monitoring Amazon Neptune ML activities and checking the status of ongoing data plane activities. For example, you can create a policy with read-only access for data analysts who do not need to manipulate the graph data, a policy with read and write access for developers using the graph for their applications, and a policy for data scientists who need access to Amazon Neptune ML commands.
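A read-only data plane policy for analysts could look roughly like the sketch below. The `neptune-db:ReadDataViaQuery` action and the `neptune-db` resource ARN layout are given here as assumptions to check against the current IAM reference. The Region, account ID, and cluster resource ID are placeholders.

```python
import json


def read_only_policy(region: str, account_id: str, cluster_resource_id: str) -> dict:
    """Build a read-only data-access policy document (illustrative)."""
    resource = f"arn:aws:neptune-db:{region}:{account_id}:{cluster_resource_id}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AnalystsReadOnly",
                "Effect": "Allow",
                # Assumed action name for query-based reads.
                "Action": ["neptune-db:ReadDataViaQuery"],
                "Resource": resource,
            }
        ],
    }


policy = read_only_policy("us-east-1", "123456789012", "cluster-EXAMPLE")
print(json.dumps(policy, indent=2))
```

A read-and-write policy for developers would follow the same shape with the corresponding write and delete actions added to the `Action` list.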
Amazon Neptune allows you to encrypt your databases using keys you create and control through AWS Key Management Service (KMS). On a database instance running with Neptune encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.
Amazon Neptune allows you to log database events with minimal impact on database performance. Logs can later be analyzed for database management, security, governance, regulatory compliance and other purposes. You can also monitor activity by sending audit logs to Amazon CloudWatch.
Easy to Use
Getting started with Amazon Neptune is easy. Just launch a new Neptune database instance using the AWS Management Console. Neptune database instances are pre-configured with parameters and settings appropriate for the database instance class you have selected. You can launch a database instance and connect your application within minutes without additional configuration. Database Parameter Groups provide granular control and fine-tuning of your database.
Easy to Operate
Amazon Neptune makes it easy to operate a high performance graph database. With Neptune, you do not need to create custom indexes over your graph data. Neptune provides timeout and memory usage limitations to reduce the impact of queries that consume too many resources.
Monitoring and Metrics
Amazon Neptune provides Amazon CloudWatch metrics for your database instances. You can use the AWS Management Console to view over 20 key operational metrics for your database instances, including compute, memory, storage, query throughput, and active connections.
Automatic Software Patching
Amazon Neptune will keep your database up-to-date with the latest patches. You can control if and when your instance is patched via Database Engine Version Management.
Database Event Notifications
Amazon Neptune can notify you via email or SMS of important database events like automated failover. You can use the AWS Management Console to subscribe to different database events associated with your Amazon Neptune databases.
Fast Database Cloning
Amazon Neptune supports quick, efficient cloning operations, where entire multi-terabyte database clusters can be cloned in minutes. Cloning is useful for a number of purposes including application development, testing, database updates, and running analytical queries. Immediate availability of data can significantly accelerate your software development and upgrade projects, and make analytics more accurate.
You can clone an Amazon Neptune database with just a few clicks in the Management Console, without impacting the production environment. The clone is distributed and replicated across three Availability Zones.
Fast Parallel Bulk Data Loading
Property Graph Bulk Loading
Amazon Neptune supports fast, parallel bulk loading for Property Graph data that is stored in S3. You can use a REST interface to specify the S3 location for the data. It uses a CSV delimited format to load data into the Nodes and Edges. See the Neptune Property Graph bulk loading documentation for more details.
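To give a sense of the CSV layout for nodes and edges, the sketch below writes one small file of each kind in memory. The `~id`, `~label`, `~from`, and `~to` system columns and the `name:String` typed-column style follow the loader's CSV convention as understood here, so confirm them against the bulk loading documentation. The sample people and the `knows` label are made up.

```python
import csv
import io


def to_csv(header: list, rows: list) -> str:
    """Serialize a header and rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Node file: one row per vertex, with an ID, a label, and typed properties.
vertices_csv = to_csv(
    ["~id", "~label", "name:String"],
    [["p1", "person", "alice"], ["p2", "person", "bob"]],
)

# Edge file: one row per edge, linking source and target vertex IDs.
edges_csv = to_csv(
    ["~id", "~from", "~to", "~label"],
    [["e1", "p1", "p2", "knows"]],
)

print(vertices_csv)
print(edges_csv)
```

Files like these would be uploaded to S3, and their location passed to the loader's REST interface.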
RDF Bulk Loading
Amazon Neptune supports fast, parallel bulk loading for RDF data that is stored in S3. You can use a REST interface to specify the S3 location for the data. The N-Triples (NT), N-Quads (NQ), RDF/XML, and Turtle RDF 1.1 serializations are supported. See the Neptune RDF bulk loading documentation for more details.
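A load request through the REST interface might be prepared as in the sketch below, which builds (but does not send) a JSON POST for one of the supported RDF serializations. The `/loader` path, the JSON field names, and the format identifiers are written from memory of the loader API and should be treated as assumptions. The hostname, bucket, and role ARN are placeholders.

```python
import json
from urllib.request import Request

# Assumed mapping from serialization name to the loader's format identifier.
RDF_FORMATS = {
    "N-Triples": "ntriples",
    "N-Quads": "nquads",
    "RDF/XML": "rdfxml",
    "Turtle": "turtle",
}


def build_load_request(host: str, s3_uri: str, serialization: str,
                       iam_role_arn: str, region: str, port: int = 8182) -> Request:
    """Prepare a bulk-load request for RDF data stored in S3 (not sent here)."""
    if serialization not in RDF_FORMATS:
        raise ValueError(f"unsupported serialization: {serialization}")
    payload = {
        "source": s3_uri,                      # S3 location of the data
        "format": RDF_FORMATS[serialization],  # assumed format identifier
        "iamRoleArn": iam_role_arn,            # role allowed to read the bucket
        "region": region,
        "failOnError": "TRUE",
    }
    return Request(
        url=f"https://{host}:{port}/loader",   # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_load_request(
    "my-neptune.example.internal",
    "s3://example-bucket/rdf/",
    "Turtle",
    "arn:aws:iam::123456789012:role/ExampleLoadRole",
    "us-east-1",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```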
Pay Only for What You Use
There is no up-front commitment with Amazon Neptune; you simply pay an hourly charge for each instance that you launch. And, when you’re finished with a Neptune database instance, you can easily delete it. You do not need to over-provision storage as a safety margin, and you only pay for the storage you actually consume. To see more details, visit the Neptune Pricing page.