With Amazon Neptune, you can create sophisticated, interactive graph applications that can query billions of relationships in milliseconds. SQL queries for highly connected data are complex and hard to tune for performance. Instead, Amazon Neptune allows you to use the popular graph query languages Apache TinkerPop Gremlin, W3C’s SPARQL, and openCypher to run powerful queries that are easy to write and perform well on connected data. This significantly reduces code complexity and lets you create applications that process relationships more quickly.
Neptune has an SLA of 99.9%. It increases database performance and availability by tightly integrating the database engine with an SSD-backed virtualized storage layer purpose-built for database workloads. Neptune's storage is fault-tolerant and self-healing, and disk failures are repaired in the background without loss of database availability. Neptune is designed to automatically detect database crashes and restart without the need for crash recovery or to rebuild the database cache. If the entire instance fails, Neptune will automatically fail over to one of up to 15 read replicas.
You can quickly launch a Neptune database instance with a few steps in the Neptune console. Neptune scales storage automatically, growing storage and rebalancing I/Os to provide consistent performance without the need for overprovisioning.
High performance and scalability
Amazon Neptune Serverless is an on-demand deployment option that automatically adjusts database capacity based on an application’s needs. Neptune Serverless can scale graph database workloads instantly to hundreds of thousands of queries. Neptune Serverless adjusts capacity to provide just the right amount of database resources that the application needs, and you pay only for the consumed capacity, saving up to 90% in database costs compared to provisioning for peak capacity.
High throughput, low latency for graph queries
Neptune is a purpose-built, high-performance graph database engine. Neptune efficiently stores and navigates graph data, and uses a scale-up, in-memory optimized architecture to allow for fast query evaluation over large graphs. With Neptune, you can use Gremlin, openCypher, or SPARQL to run powerful queries that are easy to write and perform well.
Easy scaling of database compute resources
With a few steps in the AWS Management Console, you can scale the compute and memory resources powering your production cluster up or down by creating new replica instances of the desired size, or by removing instances. Compute scaling operations typically complete in a few minutes.
Storage that automatically scales
Neptune uses a distributed and shared storage architecture that will automatically grow as your database storage needs grow. Neptune data is stored in a cluster volume that has multi-AZ high availability. When a Neptune DB cluster is created, it is allocated a single segment of 10 GB. As the volume of data increases and exceeds the currently allocated storage, Neptune automatically expands the cluster volume by adding new segments. A Neptune cluster volume can grow to a maximum size of 128 tebibytes (TiB) in supported Regions except China and GovCloud. You don't need to provision excess storage for your database to handle future growth.
Low latency read replicas
Increase read throughput to support high-volume application requests by creating up to 15 database read replicas. Neptune replicas share the same underlying storage as the source instance, lowering costs and avoiding the need to perform writes at the replica nodes. This frees up more processing power to serve read requests and reduces the replica lag time, often down to single-digit milliseconds. Neptune also provides a single endpoint for read queries so the application can connect without having to keep track of replicas as they are added and removed.
High availability and durability
Instance monitoring and repair
The health of your Neptune database and its underlying EC2 instance is continually monitored. If the instance powering your database fails, the database and associated processes are automatically restarted. Neptune recovery does not require the potentially lengthy replay of database redo logs, so your instance restart times are typically 30 seconds or less. It also isolates the database buffer cache from database processes, allowing the cache to survive a database restart.
Multi-AZ deployments with read replicas
On instance failure, Neptune automates failover to one of up to 15 Neptune replicas you have created in any of three Availability Zones. If no Neptune replicas have been provisioned, in the case of a failure, Neptune will attempt to create a new database instance for you automatically.
Fault-tolerant and self-healing storage
Each 10 GB chunk of your database volume is replicated six ways across three Availability Zones. Neptune uses fault-tolerant storage that transparently handles the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Neptune storage is also self-healing: data blocks and disks are continually scanned for errors and replaced automatically.
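The availability thresholds above can be illustrated with a small sketch. It only encodes the numbers stated in this section (six copies, writes tolerate two lost copies, reads tolerate three). The constant and function names are hypothetical, and this is not a description of how Neptune's storage layer is implemented.

```python
# Illustrative only: encodes the thresholds stated in the text above.
COPIES = 6               # each chunk is replicated six ways
MAX_LOST_FOR_WRITE = 2   # writes stay available with up to two copies lost
MAX_LOST_FOR_READ = 3    # reads stay available with up to three copies lost


def availability(copies_lost: int) -> dict:
    """Report whether reads and writes remain available for one chunk."""
    if not 0 <= copies_lost <= COPIES:
        raise ValueError(f"copies_lost must be between 0 and {COPIES}")
    return {
        "write": copies_lost <= MAX_LOST_FOR_WRITE,
        "read": copies_lost <= MAX_LOST_FOR_READ,
    }


# Losing three copies: reads continue, writes do not.
print(availability(3))
```

Under these assumptions, losing an entire Availability Zone (two of the six copies) leaves both reads and writes available.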
Automatic, continuous, incremental backups and point-in-time restore
Neptune's backup capability enables point-in-time recovery for your instance. This allows you to restore your database to any second during your retention period, up until the last five minutes. Your automatic backup retention period can be configured up to 35 days. Automated backups are stored in Amazon S3, which is designed for 99.999999999% durability. Neptune backups are automatic, incremental, and continual and have no impact on database performance.
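As a rough sketch of the restore window described above, the following computes the earliest and latest restorable timestamps from a retention period. The function names are hypothetical, and the lower bound of one day on the retention period is an assumption not stated in this section.

```python
from datetime import datetime, timedelta, timezone

MAX_RETENTION_DAYS = 35                  # upper limit stated in the text
RESTORE_LAG = timedelta(minutes=5)       # "up until the last five minutes"


def restorable_window(now: datetime, retention_days: int):
    """Return (earliest, latest) timestamps that a restore could target."""
    # Assumption: retention is at least one day.
    if not 1 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 1 and 35")
    return now - timedelta(days=retention_days), now - RESTORE_LAG


def can_restore_to(target: datetime, now: datetime, retention_days: int) -> bool:
    """Check whether a target time falls inside the restorable window."""
    earliest, latest = restorable_window(now, retention_days)
    return earliest <= target <= latest


now = datetime(2024, 1, 31, 12, 0, tzinfo=timezone.utc)
print(can_restore_to(now - timedelta(hours=3), now, retention_days=7))
```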
Database snapshots are user-initiated backups of your instance stored in Amazon S3 that will be kept until you explicitly delete them. They use the automated incremental snapshots to reduce the time and storage required. You can create a new instance from a database snapshot whenever you desire.
Amazon Neptune Global Database is designed for globally distributed applications, allowing a single Neptune database to span multiple AWS Regions. It replicates the graph data with little impact to database performance, enables fast local reads with low latency in each Region, and provides disaster recovery in case of region-wide outages.
Open graph APIs
Supports Apache TinkerPop Gremlin for property graph
Property Graphs are popular because they are familiar to developers who are used to relational models. The Gremlin traversal language provides a way to quickly traverse Property Graphs. Amazon Neptune supports the Property Graph model using the open source Apache TinkerPop Gremlin traversal language and provides a Gremlin WebSocket server that supports TinkerPop version 3.3. With Neptune, you can quickly build fast Gremlin traversals over property graphs. Existing Gremlin applications can easily use Neptune by changing the Gremlin service configuration to point to a Neptune instance.
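Pointing an existing Gremlin client at Neptune mostly comes down to changing the connection URL. The sketch below builds that URL; the host name is a placeholder, the helper name is hypothetical, and port 8182 with the /gremlin path is assumed here as the usual default. The commented lines show how the URL would typically be passed to the gremlinpython package, whose method names vary by version.

```python
def gremlin_websocket_url(host: str, port: int = 8182) -> str:
    """Return the WebSocket URL a Gremlin client would be configured with."""
    if not host:
        raise ValueError("host must not be empty")
    return f"wss://{host}:{port}/gremlin"


# Placeholder host; substitute your own cluster endpoint.
url = gremlin_websocket_url("my-cluster.example.com")
print(url)

# With gremlinpython installed, the URL would typically be used like this:
#   from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
#   from gremlin_python.process.anonymous_traversal import traversal
#   g = traversal().with_remote(DriverRemoteConnection(url, "g"))
#   people = g.V().hasLabel("person").limit(5).to_list()
```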
Supports W3C’s Resource Description Framework (RDF) 1.1 and SPARQL 1.1
RDF is popular because it provides flexibility for modeling complex information domains. There are a number of existing free or public datasets available in RDF including Wikidata and PubChem, a database of chemical molecules. Amazon Neptune supports the W3C’s Semantic Web standards of RDF 1.1 and SPARQL 1.1 (Query and Update), and provides an HTTP REST endpoint that implements the SPARQL Protocol 1.1. With Neptune, you can easily use the SPARQL endpoint for both existing and new graph applications.
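A query against the SPARQL endpoint is an ordinary HTTP request, as the SPARQL 1.1 Protocol defines. The sketch below builds such a request with the Python standard library without sending it. The host name is a placeholder, the helper name is hypothetical, and port 8182 with the /sparql path is an assumption about the endpoint layout.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(host: str, query: str, port: int = 8182) -> Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via URL-encoded POST."""
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        url=f"https://{host}:{port}/sparql",
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


req = build_sparql_request(
    "my-cluster.example.com",  # placeholder host
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
)
print(req.get_method(), req.full_url)
# To execute it from inside the VPC: urllib.request.urlopen(req)
```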
Supports openCypher v9 for property graph
Neptune supports building graph applications using openCypher, currently one of the most popular query languages for developers working with graph databases. Developers, business analysts, and data scientists like openCypher’s SQL-inspired syntax because it provides a familiar structure to compose queries for graph applications. openCypher and Gremlin query languages can be used together over the same property graph data. Neptune's openCypher support is compatible with the Bolt protocol, so applications that use Bolt can continue to run when they connect to Neptune.
Amazon Neptune machine learning (ML) is a new capability of Neptune powered by Amazon SageMaker that uses Graph Neural Networks (GNNs), an ML technique purpose-built for graphs, to make easy, fast, and more accurate predictions using graph data. With Neptune ML, you can improve the accuracy of most predictions for graphs by over 50% when compared to making predictions using non-graph methods.
Making accurate predictions on graphs with billions of relationships can be difficult and time consuming. Existing ML approaches such as XGBoost can’t operate effectively on graphs because they are designed for tabular data. As a result, using these methods on graphs can take time, require specialized skills from developers, and produce sub-optimal predictions.
Neptune runs in Amazon VPC, which allows you to isolate your database in your own virtual network, and connect to your on-premises IT infrastructure using industry-standard encrypted IPsec VPNs. In addition, using the Neptune VPC configuration, you can configure firewall settings and control network access to your database instances.
Amazon Neptune is integrated with AWS Identity and Access Management (IAM) and provides you the ability to control the actions that your AWS IAM users and groups can take on specific Neptune resources including Database Instances, Database Snapshots, Database Parameter Groups, Database Event Subscriptions, and Database Options Groups. In addition, you can tag your Neptune resources, and control the actions that your IAM users and groups can take on groups of resources that have the same tag (and tag value). For example, you can configure your IAM rules to ensure developers are able to modify "Development" database instances, but only database administrators can modify and delete "Production" database instances.
Fine-grained access control
Neptune provides fine-grained access control for users calling Neptune data plane APIs with AWS Identity and Access Management (IAM). It covers graph-data actions such as reading, writing, and deleting data from the graph, and non-graph-data actions such as starting and monitoring Amazon Neptune ML activities and checking the status of ongoing data plane activities. For example, you can create a policy with ‘read only’ access for data analysts who do not need to manipulate the graph data, a policy with ‘read and write’ access for developers using the graph for their applications, and a policy for data scientists who need access to Neptune ML commands.
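The read-only and read-and-write policies mentioned above could look roughly like the documents this sketch generates. The helper name, account ID, and cluster resource ID are placeholders. The action names (ReadDataViaQuery, WriteDataViaQuery, DeleteDataViaQuery) and the resource ARN shape are given here as an assumption and should be checked against the current IAM reference for Neptune before use.

```python
import json

# Assumed data plane action names; verify against the IAM documentation.
READ_ONLY = ["ReadDataViaQuery"]
READ_WRITE = ["ReadDataViaQuery", "WriteDataViaQuery", "DeleteDataViaQuery"]


def data_access_policy(region, account_id, cluster_resource_id, actions):
    """Build an IAM policy document granting the given data plane actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [f"neptune-db:{name}" for name in actions],
                "Resource": (
                    f"arn:aws:neptune-db:{region}:{account_id}:"
                    f"{cluster_resource_id}/*"
                ),
            }
        ],
    }


# Placeholder identifiers for illustration only.
analyst_policy = data_access_policy(
    "us-east-1", "123456789012", "cluster-EXAMPLE", READ_ONLY
)
print(json.dumps(analyst_policy, indent=2))
```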
Neptune supports encryption in transit with TLS version 1.2. Neptune allows you to encrypt your databases using keys you create and control through AWS Key Management Service (KMS). On a database instance running with Neptune encryption, data stored at rest in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.
Amazon Neptune allows you to log database events with minimal impact on database performance. Logs can later be analyzed for database management, security, governance, regulatory compliance and other purposes. You can also monitor activity by sending audit logs to Amazon CloudWatch.
Easier to use
You can get started with Neptune by launching a new Neptune database instance using the AWS Management Console. Neptune database instances are pre-configured with parameters and settings appropriate for the database instance class you have selected. You can launch a database instance and connect your application within minutes without additional configuration. Database Parameter Groups provide granular control and fine-tuning of your database.
Easier to operate
Neptune makes it easier to operate a high performance graph database. With Neptune, you do not need to create custom indexes over your graph data. Neptune provides timeout and memory usage limitations to reduce the impact of queries that consume too many resources.
Monitoring and metrics
Neptune provides Amazon CloudWatch metrics for your database instances. You can use the AWS Management Console to view over 20 key operational metrics for your database instances, including compute, memory, storage, query throughput, and active connections.
Automatic software patching
Neptune will keep your database up-to-date with the latest patches. You can control if and when your instance is patched through Database Engine Version Management.
Database event notifications
Neptune can notify you through email or SMS of important database events like automated failover. You can use the AWS Management Console to subscribe to different database events associated with your Amazon Neptune databases.
Fast database cloning
Neptune supports quick, efficient cloning operations, where entire multi-terabyte database clusters can be cloned in minutes. Cloning is useful for a number of purposes including application development, testing, database updates, and running analytical queries. Immediate availability of data can significantly accelerate your software development and upgrade projects, and make analytics more accurate.
You can clone a Neptune database with just a few steps in the AWS Management Console, without impacting the production environment. The clone is distributed and replicated across three Availability Zones.
Fast bulk data loading
Property graph bulk loading
Neptune supports fast, parallel bulk loading for Property Graph data that is stored in S3. You can use a REST interface to specify the S3 location for the data. It uses a CSV delimited format to load data into the Nodes and Edges. See the Neptune Property Graph bulk loading documentation for more details.
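To give a sense of the CSV layout, the sketch below writes one vertex file and one edge file in memory. The system column headers (~id, ~label, ~from, ~to) and the typed property header (name:String) are shown as an assumption based on the loader's documented format, and the sample rows and function names are invented for illustration.

```python
import csv
import io


def vertices_csv(rows) -> str:
    """Serialize (id, label, name) tuples as a vertex file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["~id", "~label", "name:String"])
    writer.writerows(rows)
    return buf.getvalue()


def edges_csv(rows) -> str:
    """Serialize (id, from_id, to_id, label) tuples as an edge file."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["~id", "~from", "~to", "~label"])
    writer.writerows(rows)
    return buf.getvalue()


# Invented sample data: two people connected by one "knows" edge.
print(vertices_csv([("p1", "person", "Alice"), ("p2", "person", "Bob")]))
print(edges_csv([("e1", "p1", "p2", "knows")]))
```

Files like these would be uploaded to S3 before the load request is made.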
RDF bulk loading
Neptune supports fast, parallel bulk loading for RDF data that is stored in S3. You can use a REST interface to specify the S3 location for the data. The N-Triples (NT), N-Quads (NQ), RDF/XML, and Turtle RDF 1.1 serializations are supported. See the Neptune RDF bulk loading documentation for more details.
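A load request through the REST interface can be sketched as a JSON POST. The example below builds the request without sending it. The /loader path, port 8182, the JSON field names, and the format strings mapped from the serializations listed above are assumptions to verify against the bulk loading documentation; the bucket, role ARN, host, and function name are placeholders.

```python
import json
from urllib.request import Request

# Assumed mapping from the serializations named above to loader format strings.
RDF_FORMATS = {
    "N-Triples": "ntriples",
    "N-Quads": "nquads",
    "RDF/XML": "rdfxml",
    "Turtle": "turtle",
}


def build_loader_request(host, s3_uri, serialization, iam_role_arn, region,
                         port: int = 8182) -> Request:
    """Build (but do not send) a bulk load request for RDF data in S3."""
    if serialization not in RDF_FORMATS:
        raise ValueError(f"unsupported serialization: {serialization}")
    payload = {
        "source": s3_uri,
        "format": RDF_FORMATS[serialization],
        "iamRoleArn": iam_role_arn,
        "region": region,
    }
    return Request(
        url=f"https://{host}:{port}/loader",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


load_req = build_loader_request(
    "my-cluster.example.com",                          # placeholder host
    "s3://example-bucket/data/",                       # placeholder bucket
    "Turtle",
    "arn:aws:iam::123456789012:role/ExampleLoadRole",  # placeholder role
    "us-east-1",
)
print(load_req.full_url, load_req.data.decode("utf-8"))
```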
Broad compliance program coverage
Neptune is in scope for over 20 international compliance standards ranging from FedRAMP (Moderate and High) to SOC (1, 2, 3), and is also HIPAA eligible. The full list of standards that Neptune complies with can be found in AWS Services in Scope by Compliance Program.
Pay only for what you use
There is no up-front commitment with Neptune; you pay an hourly charge for each instance that you launch, or the database resources you consume for serverless. When you’re finished with a Neptune database instance, you can delete it. You do not need to overprovision storage as a safety margin, and you only pay for the storage you actually consume. To see more details, visit the Neptune Pricing page.