AWS Database Blog

Use cases and best practices to optimize cost and performance with Amazon Neptune Serverless

In this post, we show you common use cases for Amazon Neptune Serverless, and how you can optimize for both cost and performance by following recommended best practices.

Amazon Neptune is a fully managed database service built for the cloud that makes it easier to build and run graph applications. It supports both RDF and property graph data models, simplifies database management, and enables you to build new applications or migrate existing ones quickly. Organizations seeking to break free from costly licensing fees can migrate to Neptune and benefit from the latest open-source innovations. Neptune also offers a broad selection of high availability and durability features, so you can configure your cluster for the level of availability you need.

Neptune Serverless is an on-demand auto scaling configuration that is architected to scale your database cluster as needed. With Neptune Serverless, you can use the same features as provisioned instances, with a few limitations. It is cost-effective and allows you to run and instantly scale graph workloads without the need to manage and optimize database capacity. With Neptune Serverless instances, you can benefit from the following:

  • Instant and non-disruptive capacity scaling
  • Fine-grained and predictable capacity adjustments
  • High availability, disaster recovery, and enhanced capabilities
  • Easy migration of applications that already use Neptune, without any changes

A Neptune Serverless cluster scaling configuration is defined in Neptune Capacity Units (NCUs), each of which consists of 2 GB of memory plus the associated virtual processor (vCPU) and networking capacity. The lowest supported minimum is 1.0 NCU, and the highest supported maximum is 128 NCUs. You can adjust the NCU configuration in increments of 0.5 NCU. Note that Neptune Serverless scales only the compute capacity. The storage limit remains the same and is not affected by serverless scaling.
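The limits above can be checked before a configuration is submitted. The following is a minimal Python sketch, not an official validation routine. The function names `validate_ncu_range` and `memory_gb` are hypothetical, and the constants simply restate the figures in the previous paragraph.

```python
from decimal import Decimal

# Limits as stated in the text above.
MIN_NCU = Decimal("1.0")    # lowest supported minimum
MAX_NCU = Decimal("128")    # highest supported maximum
NCU_STEP = Decimal("0.5")   # configuration increment
GB_PER_NCU = 2              # memory associated with one NCU


def validate_ncu_range(min_ncu, max_ncu):
    """Return (min, max) as Decimals, or raise ValueError if the range is invalid."""
    lo, hi = Decimal(str(min_ncu)), Decimal(str(max_ncu))
    for name, value in (("min", lo), ("max", hi)):
        if not MIN_NCU <= value <= MAX_NCU:
            raise ValueError(f"{name} NCU {value} is outside {MIN_NCU}-{MAX_NCU}")
        if value % NCU_STEP != 0:
            raise ValueError(f"{name} NCU {value} is not a multiple of {NCU_STEP}")
    if lo > hi:
        raise ValueError("min NCU must not exceed max NCU")
    return lo, hi


def memory_gb(ncu):
    """Memory, in GB, that corresponds to a given NCU value."""
    return float(Decimal(str(ncu)) * GB_PER_NCU)
```

For example, `memory_gb(16)` gives 32.0, which is how a 16 NCU serverless instance lines up with a 32 GB provisioned instance.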

Common use cases for Neptune Serverless

Neptune Serverless is best suited for workloads with non-steady, spiky traffic demand. We do not recommend using it for workloads with steady, predictable traffic, because provisioned instances can be more cost-effective for steady-state workloads. For example, a provisioned db.r5.xlarge instance is more cost-effective than a serverless instance that is pinned at 16 NCUs by setting both the minimum and maximum NCU to 16.

The following are additional use cases for serverless, where it can provide extra benefits such as data isolation and cost-effectiveness.

Resource optimization for new customers/applications

It can be difficult for new customers to determine the infrastructure requirements for their graph workloads. Although we provide a method to estimate the optimal instance sizes for your workload, it requires additional information such as query latency and expected traffic, which is often unknown at the initial stage. Instead, you can start with Neptune Serverless without calculating capacity up front, and then use the historical usage pattern to identify a provisioned instance size that meets your requirements. You can also combine serverless instances with provisioned instances in a single cluster if you notice that your application is write-heavy or read-heavy, or has varied traffic.

Development and testing environments

Customers want to explore new ways of reducing cost, especially within their non-production environments. Having development and testing teams deploy large instances and then forget to stop or remove them is a cause for concern for many customers. However, development and testing teams need to deploy instances where they can perform load testing at scale. Serverless can provide the best of both worlds, whereby you can deploy a serverless instance with a low minimum NCU count in order to reduce cost during periods of inactivity, and a high maximum NCU count so development and testing teams can take advantage of the instant vertical scaling during periods of high activity.

Multi-tenant applications with per-tenant databases

For platforms such as Software as a Service (SaaS) applications that service thousands of end-users, customers often over- or under-provision their Neptune database clusters based on the current or expected demand. You may also face data isolation requirements, where you need to separate individual customer data. With Neptune Serverless, you can physically isolate each customer’s data by providing each customer with an individual database cluster, and you can use the auto scaling features of Neptune Serverless to scale each tenant database independently, without the need to proactively manage the underlying infrastructure.

Several applications backed by databases

In modern application architectures, you can have hundreds or even thousands of custom applications, each of which is supported by one or more databases. As the application requirements evolve, so must the database capacity in order to continue supporting them. Managing this capacity adjustment for a database fleet in a cost-effective manner can be daunting and cumbersome. The serverless offering can help reduce the burden of managing database capacity as your applications and associated capacity requirements evolve.

Efficient horizontal and vertical scaling

Applications that need higher throughput can split their database across multiple instances. Predicting the capacity of each instance is difficult and inefficient because it requires intricate knowledge of the expected number of requests and the latency for each query. If you create too few instances, you must redistribute data, which requires downtime. If you create too many instances, you pay higher costs because not all instances are equally utilized. By using serverless instances, you can shard your application across multiple instances without adding much upfront cost, and each of the sharded databases can vertically scale its capacity as and when required. This reduces the risk of underutilizing instances, over-provisioning, or paying unnecessary costs.

Disaster recovery with global databases and Neptune Serverless

Neptune Global Database provides the ability to synchronize your graph data between your primary database and up to five secondary Regions. This provides disaster recovery in the case of Region-wide outages. In practice, however, a Region-wide outage is unlikely, so any provisioned database instances in the secondary Regions are under-utilized most of the time. By using serverless instances instead of provisioned instances in the secondary Regions, and setting the minimum NCU to the lowest acceptable value, you can save money while they’re not being actively used. When they are required, you can update the cluster configuration to increase the NCU values so that the instances automatically scale to the new demand.

Recommended practices with Neptune Serverless

In this section, we share some best practices when using Neptune Serverless.

Configure the minimum and maximum NCUs best suited for your workloads and features

Consider the following:

  • Workloads that use the Bulk Load API – Configure your Neptune Serverless database appropriately for resource-intensive workloads such as bulk loads. Set the maximum NCU to a level that allows you to ingest data at the desired rate. For example, setting it to 128 NCUs during bulk load operations gives Neptune the capacity it needs for bulk ingestion. The exact maximum NCU depends on the amount of data being ingested and the desired ingestion rate. Setting the minimum NCU equal to the maximum NCU is an anti-pattern for this use case because it prevents scaling; a provisioned database instance of a similar size (memory size and number of vCPUs) could be a better choice in that case.
    When you combine a serverless reader instance with a large provisioned primary instance, the reader may not be able to keep up with the number of writes being made, in which case it restarts to read from shared storage. To avoid this, either raise the serverless reader’s minimum NCU to a value greater than 1 so that it can keep up with replication traffic from the primary, or temporarily remove the reader during the bulk load operation. For more information, refer to Avoid repeated restarts during bulk loading.
  • Database clusters with IAM database authentication enabled – Database clusters with IAM database authentication enabled require a minimum NCU value of 2 or higher to avoid errors.
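The considerations above can be folded into a small helper that prepares a scaling change, for example before and after a bulk load. This Python sketch only builds a dictionary of keyword arguments; it assumes they are later passed to the boto3 Neptune client's `modify_db_cluster` call with its `ServerlessV2ScalingConfiguration` setting, which you should verify against the current API reference. The helper name `scaling_request` and the cluster identifier are hypothetical.

```python
def scaling_request(cluster_id, min_ncu, max_ncu, iam_auth_enabled=False):
    """Build keyword arguments for a serverless scaling-configuration update.

    Assumption: the result is passed on as
    boto3.client("neptune").modify_db_cluster(**request).
    """
    if min_ncu > max_ncu:
        raise ValueError("min NCU must not exceed max NCU")
    # Guidance from the text: IAM database authentication needs at least 2 NCUs.
    if iam_auth_enabled and min_ncu < 2:
        raise ValueError("IAM database authentication needs a minimum of 2 NCUs")
    return {
        "DBClusterIdentifier": cluster_id,
        "ServerlessV2ScalingConfiguration": {
            "MinCapacity": float(min_ncu),
            "MaxCapacity": float(max_ncu),
        },
    }


# Hypothetical cluster name; raise the ceiling for a bulk load, then lower it again.
during_load = scaling_request("my-graph-cluster", 2, 128, iam_auth_enabled=True)
after_load = scaling_request("my-graph-cluster", 2, 16, iam_auth_enabled=True)
```

Keeping the validation separate from the API call makes it possible to review the intended change before anything is applied to the cluster.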

Switch workloads with spiky traffic to serverless instances

For workloads that have spiky, inconsistent demand, using a serverless instance provides an automated, cost-effective mechanism to vertically scale the compute resource associated with a Neptune database cluster. However, for workloads where traffic is consistent, a serverless instance is less cost-efficient than a provisioned instance of the same capacity. Setting the minimum NCU equal to the maximum NCU in a serverless cluster is also an anti-pattern. Depending on your requirements, you may find that provisioned instances meet your needs while being more cost-effective and performant.

Increase the minimum capacity to enable faster scaling

Neptune Serverless scales at a rate that is influenced by the configured minimum NCU. If the minimum NCU is set to 1.0, it takes longer for your serverless database cluster to scale up to provide capacity for heavy traffic. This is because Neptune Serverless chooses a scaling increment based on the currently used serverless capacity, and a low minimum allows the current capacity to drop to a low value. By setting the minimum NCU to a higher value, especially prior to events where you know there will be increased demand on your database, Neptune Serverless uses a higher scaling rate and scales up faster to meet the demand.

Set the correct priority tier of the reader instance

If a Neptune Serverless reader instance isn’t scaling down to the minimum NCU and is staying at the same or higher capacity than the primary instance, check the priority tier of the reader instance. In tier 0 or 1, serverless reader database instances are kept at the capacity that matches the writer database instances. Change the reader’s priority tier to 2 or higher so that it scales up and down independently of the writer. For more information, see Setting the promotion tiers for Neptune Serverless instances.
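The tier rule described above can be expressed as a short check. This Python sketch assumes promotion tiers range from 0 to 15 and that the tier is changed through the boto3 Neptune client's `modify_db_instance` call with its `PromotionTier` setting. The helper names and the instance identifier are hypothetical.

```python
def reader_follows_writer(promotion_tier):
    """True if a serverless reader in this tier is sized to match the writer."""
    if not 0 <= promotion_tier <= 15:  # assumed valid tier range
        raise ValueError("promotion tier must be between 0 and 15")
    return promotion_tier <= 1


def decouple_reader_request(instance_id, promotion_tier=2):
    """Build keyword arguments that move a reader to an independently scaling tier.

    Assumption: the result is passed on as
    boto3.client("neptune").modify_db_instance(**request).
    """
    if reader_follows_writer(promotion_tier):
        raise ValueError("use tier 2 or higher to scale independently of the writer")
    return {
        "DBInstanceIdentifier": instance_id,
        "PromotionTier": promotion_tier,
        "ApplyImmediately": True,
    }


request = decouple_reader_request("my-serverless-reader")  # hypothetical name
```

Note that a higher tier number also lowers the reader's priority as a failover target, so the choice is a trade-off between independent scaling and failover readiness.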

Combine Neptune Serverless with auto-scaling

You can use Neptune read replica auto scaling to automatically adjust the number of Neptune replicas in a database cluster based on your connectivity requirements and workload. You can also use Neptune Serverless to handle sudden spikes in traffic from your application. When CPU utilization-based auto scaling is used, setting a low NCU for both minimum and maximum values, or implementing a narrow range between minimum and maximum NCUs, may cause the rapid addition and removal of Neptune read replicas.

Set the query timeout correctly to avoid costly queries running in the background

Running long-running workloads on Neptune Serverless costs more than running them on provisioned instances, so we recommend adjusting the query timeout parameter to suit your specific requirements. Where only short-lived queries are expected, set a low query timeout. This avoids incurring unnecessary cost from unexpected long-running queries that continue to run in the background.
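As an illustration, the following Python sketch builds a single parameter entry for a low query timeout. It assumes the timeout is controlled by the `neptune_query_timeout` parameter, expressed in milliseconds, in the cluster parameter group, and that the entry is later passed to the boto3 Neptune client's `modify_db_cluster_parameter_group` call. Check the parameter name, unit, and apply method against the current Neptune documentation before relying on it. The helper name is hypothetical.

```python
def query_timeout_parameter(timeout_seconds):
    """Build one parameter-group entry that sets the query timeout.

    Assumptions: the parameter is named neptune_query_timeout and takes
    milliseconds; "pending-reboot" is used as a conservative apply method.
    """
    if timeout_seconds <= 0:
        raise ValueError("timeout must be positive")
    return {
        "ParameterName": "neptune_query_timeout",
        "ParameterValue": str(int(timeout_seconds * 1000)),
        "ApplyMethod": "pending-reboot",
    }


# A 20-second ceiling for a workload that expects only short-lived queries.
short_query_timeout = query_timeout_parameter(20)
```

The right value depends on your longest legitimate query; a timeout that is too low will cancel valid work.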

Monitoring serverless instances for cost and optimization

To monitor your serverless database cluster or instance, you can use two additional Amazon CloudWatch metrics that are available for serverless instances:

  • ServerlessDatabaseCapacity – Provides the current instance capacity (at the instance level) or an average of the values across all instances (at the cluster level).
  • NCUUtilization – Reports the percentage of possible capacity being used by dividing ServerlessDatabaseCapacity by the maximum NCU value.

If you’re seeing the NCUUtilization metric approach 100%, consider increasing the maximum NCU value across your serverless instances.
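The relationship between the two metrics is simple arithmetic. The following Python sketch applies it to a list of capacity samples. The 90% threshold and the function names are illustrative assumptions, and the samples stand in for ServerlessDatabaseCapacity datapoints that you would retrieve from CloudWatch.

```python
def ncu_utilization(capacity, max_ncu):
    """Utilization as defined above: current capacity divided by max NCU, in percent."""
    return 100.0 * capacity / max_ncu


def review_max_ncu(capacity_samples, max_ncu, threshold_pct=90.0):
    """Flag a maximum NCU setting whose peak utilization approaches 100%."""
    peak = max(ncu_utilization(c, max_ncu) for c in capacity_samples)
    return {
        "peak_utilization_pct": peak,
        "consider_raising_max": peak >= threshold_pct,  # assumed threshold
    }


# Hypothetical capacity samples for an instance configured with max NCU = 16.
report = review_max_ncu([4, 8, 15.5], max_ncu=16)
```

In this example the peak sample of 15.5 NCUs corresponds to a utilization of 96.875%, so the maximum would be flagged for review.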

Serverless pricing

When you deploy a serverless instance, the same factors as provisioned instances apply to pricing, for example:

  • Storage, billed per GB-month
  • I/O operations consumed
  • Backup storage
  • Data transfer charges
  • Charges for special features, such as global databases and snapshot exports

The primary difference in charging for serverless instances is that they are priced based on your usage in NCU-hours. The actual cost depends on the Region in which your Neptune database cluster is deployed.

Neptune Serverless pricing comparison

The following is a cost comparison between using a serverless instance and a db.r6g.8xlarge provisioned instance for workloads with spiky and consistent traffic, across a 30-day period. Both estimations include the following configuration:

  • Region is us-east-1
  • 50 GB of data with 100 GB backup
  • 200 million I/O operations per month
  • Data transfer in of 50 GB per month
  • Data transfer out of 10 GB per month

The following table compares an example workload with spiky traffic, 1 hour per day at maximum NCU (128) and 23 hours per day at minimum NCU (1).

Charge                  Serverless    Provisioned (db.r6g.8xlarge)
Instance charges        $728.42       $3,804.62
Storage charges         $7.10         $7.10
I/O charges             $40.00        $40.00
Data transfer charges   $0.00         $0.00
Total charges           $775.52       $3,851.72

The difference in cost using serverless compared to provisioned instances for the spiky workload is a savings of $3,076.20 per month.

The following table compares an example workload of consistent traffic, 23 hours per day at maximum NCU (128) and 1 hour per day at minimum NCU (1).

Charge                  Serverless    Provisioned (db.r6g.8xlarge)
Instance charges        $14,206.80    $3,804.62
Storage charges         $7.10         $7.10
I/O charges             $40.00        $40.00
Data transfer charges   $0.00         $0.00
Total charges           $14,253.90    $3,851.72

The difference in cost using serverless compared to provisioned instances for a consistent workload is an increase of $10,402.18 per month.
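The instance charges in both comparisons follow from the number of NCU-hours consumed over the 30-day period. The following Python sketch reproduces that arithmetic. The rate of 0.1608 USD per NCU-hour is an assumption backed out of the spiky-traffic figure ($728.42 for 4,530 NCU-hours), not a quoted price, so check current pricing for your Region.

```python
HOURS_PER_DAY = 24
ASSUMED_RATE = 0.1608  # USD per NCU-hour, inferred from the comparison above


def ncu_hours(hours_at_max_per_day, max_ncu=128, min_ncu=1, days=30):
    """NCU-hours consumed when each day is split between max and min capacity."""
    hours_at_min = HOURS_PER_DAY - hours_at_max_per_day
    return days * (hours_at_max_per_day * max_ncu + hours_at_min * min_ncu)


def serverless_instance_cost(hours_at_max_per_day, rate=ASSUMED_RATE):
    """Instance charge in USD for the period, rounded to cents."""
    return round(ncu_hours(hours_at_max_per_day) * rate, 2)


spiky_hours = ncu_hours(1)        # 1 hour at 128 NCUs, 23 hours at 1 NCU
consistent_hours = ncu_hours(23)  # 23 hours at 128 NCUs, 1 hour at 1 NCU
spiky_cost = serverless_instance_cost(1)
```

The spiky workload consumes 4,530 NCU-hours and the consistent workload 88,350 NCU-hours, about 19.5 times as many. With the assumed rate, the sketch matches the spiky instance charge and comes within a few cents of the consistent one, while the provisioned instance charge stays the same in both cases.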

Serverless constraints

Although Neptune Serverless can provide instant vertical scalability and cost-efficiency, it’s important to understand its limitations. Refer to Amazon Neptune Serverless constraints for more information.


Conclusion

In this post, we discussed the common use cases and best practices for Neptune Serverless. You can benefit from using serverless database instances over provisioned for workloads with varied traffic or data isolation requirements, or to provide cost-effectiveness in non-production environments. In addition, we discussed how combining serverless instances with traditional provisioned instances within the same database cluster can provide flexibility and scalability for some of the most demanding workloads.

To get started with Neptune Serverless, visit the Neptune console, or refer to Amazon Neptune Serverless to find out more information. Leave your questions in the comments.

About the Authors

Kevin Phillips is a Neptune Specialist Solutions Architect working in the UK at Amazon Web Services. He has 18 years of development and solutions architectural experience, which he uses to help support and guide customers.

Ankit Gupta is a Software Development Manager with the Amazon Neptune Platform Team in India and has been part of the Neptune team since product inception. He works with AWS customers and internal development teams to improve Neptune’s usability, performance, scalability, and user experience.