AWS Database Blog

Use AWS Lambda functions with Amazon Neptune

Many Amazon Neptune connected data applications for knowledge graphs, identity graphs, and fraud graphs use AWS Lambda functions to query Neptune. This post provides general connection management, error handling, and workload balancing guidance for using any of the popular Gremlin drivers and language variants to connect to Neptune from a Lambda function.

The connection management guidance here applies primarily to applications that use Gremlin drivers with long-lived WebSocket connections to connect to Neptune. The recommended way of querying Neptune from a Lambda function that uses a Gremlin driver has changed with recent engine releases, from opening and closing a WebSocket connection per Lambda invocation, to using a single connection for the duration of the function’s execution context. This post explains the reason for the change, illustrated with specific examples of Lambda functions written in Java, JavaScript, and Python.

The error handling and workload balancing guidance in this post applies not only to Lambda functions that use Gremlin drivers, but also to functions that connect to a Neptune Gremlin or SPARQL endpoint over HTTP.


In this section, we discuss the Lambda function lifecycle, Gremlin WebSocket connections, and Neptune connections.

Lambda function lifecycle and Gremlin WebSocket connections

If you use a Gremlin driver and a Gremlin language variant to query Neptune, the driver connects to the database using a WebSocket connection. WebSockets are designed to support long-lived client-server connection scenarios. Lambda, on the other hand, is designed to support short-lived and stateless runs.

A Lambda function runs in an execution context. This execution context isolates the function from other functions, and is created the first time the function is invoked. After an execution context has been created for a function, Lambda can reuse it for subsequent invocations of the same function.

Although a single execution context can handle multiple invocations of a function, it can’t handle concurrent invocations of the function. If your function is invoked simultaneously by multiple clients, Lambda spins up additional execution contexts to host new instances of the function. Each of these new contexts may in turn be reused for subsequent invocations of the function. At some point, Lambda recycles a context—particularly if it has been inactive for some time.

A common best practice when using Lambda to query a database is to open the database connection outside the Lambda handler function so it can be reused with each handler call. If the database connection drops at some point, you can reconnect from inside the handler. But there is a danger of connection leaks with this approach. If an idle connection stays open long after an execution context is destroyed, intermittent Lambda invocation scenarios can gradually leak connections, thereby exhausting database resources.
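The pattern described above can be sketched in a driver-independent way. This is a minimal illustration, not the code used later in this post: ConnectionHolder and the open_connection factory are hypothetical names, and the factory here returns a plain object in place of a real Gremlin connection.

```python
from typing import Callable, Optional


class ConnectionHolder:
    """Keeps one connection per execution context and reopens it on demand."""

    def __init__(self, open_connection: Callable[[], object]) -> None:
        self._open_connection = open_connection  # hypothetical connection factory
        self._conn: Optional[object] = None

    def get(self) -> object:
        # Open lazily on first use, then reuse on every later invocation.
        if self._conn is None:
            self._conn = self._open_connection()
        return self._conn

    def reset(self) -> object:
        # Close the old connection if it exposes close(), then open a new one.
        close = getattr(self._conn, "close", None)
        if callable(close):
            close()
        self._conn = None
        return self.get()


# Created once at module load, outside the handler, so it survives invocations.
opened = []  # records each connection opened, purely for illustration
holder = ConnectionHolder(lambda: opened.append(object()) or opened[-1])


def lambda_handler(event, context):
    conn = holder.get()
    return {"connection_id": id(conn), "connections_opened": len(opened)}
```

Calling lambda_handler twice in the same process reuses the same connection. Calling holder.reset() from inside the handler after a dropped connection opens a replacement.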

Neptune connections

Neptune’s connection limits and connection timeouts have changed with engine releases. With early engine releases, every instance supported up to 60,000 WebSocket connections. Now the maximum number of concurrent WebSocket connections per Neptune instance depends on the instance type. Furthermore, a later engine release reduced the idle timeout for connections from 1 hour to approximately 20 minutes: if a client doesn’t close a connection, the connection is closed automatically after an idle timeout of 20–25 minutes. Lambda doesn’t document execution context lifetimes, but experiments have shown that the new Neptune connection timeout aligns well with inactive Lambda execution context timeouts: by the time an inactive context is recycled, there’s a good chance its connection has already been closed by Neptune, or will be closed soon after.
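One optional refinement, which this post does not prescribe, is to track how long a connection has been idle and reopen it before use once the idle time reaches the server's timeout. The sketch below assumes a 20-minute limit taken from the paragraph above. IdleAwareConnection, open_connection, and the injectable clock are hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Optional

# Assumption: mirrors the approximate 20-minute idle timeout described above.
IDLE_LIMIT_SECONDS = 20 * 60


class IdleAwareConnection:
    """Reopens the connection if it has been idle for too long."""

    def __init__(self,
                 open_connection: Callable[[], object],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._open_connection = open_connection  # hypothetical factory
        self._clock = clock
        self._conn: Optional[object] = None
        self._last_used = 0.0

    def get(self) -> object:
        now = self._clock()
        idle_for = now - self._last_used
        # Reopen on first use, or when the server has probably closed the socket.
        if self._conn is None or idle_for >= IDLE_LIMIT_SECONDS:
            self._conn = self._open_connection()
        self._last_used = now
        return self._conn
```

This check only reduces the chance of sending a query on a connection the server has already closed. The function still needs retry logic for connections that fail for other reasons.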


In this section, we provide recommendations on connections, read and write requests, cold starts, and Lambda extensions.

Using a single connection for the lifetime of an execution context

Use a single connection and graph traversal source for the entire lifetime of the Lambda execution context rather than per function invocation. Each function invocation handles a single client request. Concurrent client requests are handled by different function instances running in separate execution contexts. Because an execution context only ever services a single request at a time, there’s no need to maintain a pool of connections to handle concurrent requests inside a function instance. If the Gremlin driver you’re using has a connection pool, configure it to use a single connection.

Handling connection issues and retrying connections if necessary

Use retry logic around the query to handle connection failures. Although the goal is to maintain a single connection for the lifetime of an execution context, unexpected network events can cause this connection to be stopped abruptly. Connection failures manifest as different errors depending on the driver you’re using. You should code your function to handle these connection issues and attempt a reconnection if necessary.

Some Gremlin drivers automatically handle reconnections; others require you to build your own reconnection logic. The Java driver, for example, automatically attempts to re-establish connectivity to Neptune on behalf of your client code. With this driver, your function code needs only to back off and retry the query. The JavaScript and Python drivers, in contrast, don’t implement any automatic reconnection logic. With these drivers, your function code has to back off and attempt to reconnect before retrying the query. The code examples in this document include appropriate reconnection logic.
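For drivers without automatic reconnection, the back off, reconnect, and retry sequence can be sketched as a small helper. This is an illustrative outline under stated assumptions: run_with_reconnect and its query, reconnect, and is_connection_error parameters are hypothetical callables you would supply, not part of any Gremlin driver.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_reconnect(query: Callable[[], T],
                       reconnect: Callable[[], None],
                       is_connection_error: Callable[[Exception], bool],
                       max_tries: int = 5,
                       delay_seconds: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run query(), reconnecting between attempts after a connection error."""
    for attempt in range(1, max_tries + 1):
        try:
            return query()
        except Exception as err:
            # Give up on the last attempt or on errors unrelated to the connection.
            if attempt == max_tries or not is_connection_error(err):
                raise
            sleep(delay_seconds)  # back off before trying again
            reconnect()           # needed when the driver does not reconnect itself
    raise RuntimeError("max_tries must be at least 1")
```

With a driver that reconnects automatically, such as the Java driver, the reconnect callable can be a no-op, and the helper reduces to back off and retry.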

Considerations for write requests

If your Lambda function modifies data in Neptune, you should consider adopting a backoff-and-retry strategy to handle the following exceptions:

  • ConcurrentModificationException – The Neptune transaction semantics mean that write requests can sometimes fail with a ConcurrentModificationException. To handle these situations, consider implementing an exponential backoff-based retry mechanism.
  • ReadOnlyViolationException – Because the cluster topology can change at any moment as a result of both planned and unplanned cluster events, write responsibilities may migrate from one instance in the cluster to another. If your function code attempts to send a write request to an instance that is no longer the primary, the request fails with a ReadOnlyViolationException. When this happens, your code should close the existing connection, reconnect to the cluster endpoint, and retry the request.

If you use a backoff-and-retry strategy to handle write request issues, consider implementing idempotent queries for create and update requests (using, for example, fold().coalesce().unfold()).
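An exponential backoff for write conflicts might look like the following sketch. The base delay, multiplier, cap, and the use of random jitter are assumptions chosen for illustration. backoff_delays and retry_write are hypothetical helpers, and write stands in for an idempotent query such as one built with fold().coalesce().unfold().

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(base_seconds: float = 0.1, factor: float = 2.0,
                   max_tries: int = 5, cap_seconds: float = 5.0) -> List[float]:
    """Upper bound on the delay before each retry (max_tries - 1 values)."""
    return [min(cap_seconds, base_seconds * factor ** n)
            for n in range(max_tries - 1)]


def retry_write(write: Callable[[], T],
                is_retriable: Callable[[Exception], bool],
                max_tries: int = 5,
                sleep: Callable[[float], None] = time.sleep,
                rng: Callable[[], float] = random.random) -> T:
    """Retry an idempotent write with exponentially growing, jittered delays."""
    delays = backoff_delays(max_tries=max_tries)
    for attempt in range(max_tries):
        try:
            return write()
        except Exception as err:
            if attempt == max_tries - 1 or not is_retriable(err):
                raise
            # Sleep a random fraction of the exponential delay to spread retries.
            sleep(delays[attempt] * rng())
    raise RuntimeError("max_tries must be at least 1")


def is_concurrent_modification(err: Exception) -> bool:
    # Matches on the exception name in the error message.
    return "ConcurrentModificationException" in str(err)
```

Because the same write may run more than once, this approach is only safe when the query is idempotent.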

Considerations for read requests

If you have multiple read replicas in your cluster, you likely want to balance read requests across these replicas. One option is to use the reader endpoint. The reader endpoint distributes connections across replicas even if the cluster topology changes as a result of you adding or removing replicas, or promoting a replica to become the new primary.

However, in some circumstances, using the reader endpoint can result in an uneven use of cluster resources. The reader endpoint works by periodically changing the host to which the DNS entry points. If a client opens a lot of connections before the DNS entry changes, all the connection requests are sent to a single Neptune instance. This can be the case with a high throughput Lambda scenario: a large number of concurrent requests to your Lambda function causes multiple execution contexts to be created, each with its own connection. If those connections are all created almost simultaneously, the majority will likely point to the same replica in the cluster, and will stay pointing to that replica until Lambda recycles the execution contexts.

One way you can distribute requests across instances is to configure your Lambda function to connect to an instance endpoint, chosen at random from a list of replica instance endpoints, rather than the reader endpoint. The downside of this approach is that it requires the Lambda code to handle changes in the cluster topology by monitoring the cluster and updating the endpoint list whenever the membership of the cluster changes.
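The random-selection approach can be sketched as follows. ReplicaEndpointPicker is a hypothetical helper; how the function discovers the current replica endpoints is left to the caller, and the hostnames in the usage note are placeholders.

```python
import random
from typing import Iterable, List, Optional


class ReplicaEndpointPicker:
    """Chooses an instance endpoint at random from the current replica list."""

    def __init__(self, endpoints: Iterable[str],
                 rng: Optional[random.Random] = None) -> None:
        self._endpoints: List[str] = list(endpoints)
        self._rng = rng or random.Random()

    def update(self, endpoints: Iterable[str]) -> None:
        # Call this whenever the membership of the cluster changes.
        self._endpoints = list(endpoints)

    def pick(self) -> str:
        if not self._endpoints:
            raise RuntimeError("no replica endpoints available")
        return self._rng.choice(self._endpoints)
```

For example, ReplicaEndpointPicker(['replica-1.example', 'replica-2.example']).pick() returns one of the two placeholder hostnames. Each new execution context would call pick() once when it opens its connection, so contexts created at the same moment spread across replicas.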

If you’re writing a Java Lambda function that needs to balance read requests across instances in your cluster, you can use the Gremlin client for Amazon Neptune, a Java Gremlin client that is aware of your cluster topology, and which fairly distributes connections and requests across a set of instances in a Neptune cluster. For a sample Java Lambda function that uses the Gremlin client for Amazon Neptune, see Load balance graph queries using the Amazon Neptune Gremlin Client.

Cold starts

Java code compilation can be slower in a Lambda function than on an Amazon Elastic Compute Cloud (Amazon EC2) instance. Lambda allocates CPU power linearly in proportion to the amount of memory configured for the function. At 1,792 MB, a function has the equivalent of one full vCPU (one vCPU-second of credits per second). The impact of the relative lack of CPU cycles in low-memory Lambda functions is particularly pronounced with large Java functions. Consider assigning more memory to your function to increase the CPU power available for compiling and running your code.
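As a rough worked example of the linear relationship, the following sketch estimates the share of a vCPU available at a given memory setting. It assumes the 1,792 MB figure quoted above and a strictly linear allocation; approximate_vcpu_share is a hypothetical helper, not a Lambda API.

```python
# Assumption: the memory size at which a function gets one full vCPU,
# as quoted in the paragraph above.
FULL_VCPU_MEMORY_MB = 1792


def approximate_vcpu_share(memory_mb: int) -> float:
    """Estimate the fraction of one vCPU, assuming linear allocation."""
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    return memory_mb / FULL_VCPU_MEMORY_MB
```

Under these assumptions, a function configured with 448 MB gets about a quarter of a vCPU, which helps explain why class loading and compilation in a large Java function can dominate a cold start at low memory settings.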

AWS Identity and Access Management (IAM) database authentication can affect cold starts, particularly if the function has to generate a new signing key. This is less of an issue after the first request because after the IAM database authentication is used to establish a WebSocket connection, it’s only periodically used to check that the connection’s credentials are still valid. (If the server closes the connection for any reason, including because the IAM credentials are now stale, you should ensure the Lambda function opens a new connection with refreshed credentials.)

Lambda extensions

Lambda exposes the execution context lifecycle as Init, Invoke, and Shutdown phases. You can use Lambda extensions to write code that cleans up external resources, such as database connections, when an execution context is recycled. The example functions later in this post don’t use the execution lifecycle extensions—testing has shown that the function implementations conserve database resources without using the extensions—but the lifecycle phases do provide for additional control over connection lifetimes should you wish to take advantage of them.


The following example Lambda functions, written in Java, JavaScript, and Python, illustrate upserting a single vertex with a randomly generated ID into Neptune using the fold().coalesce().unfold() idiom.

Much of the code in each function is boilerplate code, responsible for managing the connection to Neptune and retrying the connection and query if an error occurs. The real application logic is implemented in doQuery(), and the query itself in the query() method. If you use these examples as the basis of your own Lambda functions, concentrate on modifying the doQuery() and query() methods.

The functions are configured to retry failed queries five times, waiting 1 second between retries.

The functions expect values for a number of Lambda environment variables:

  • NEPTUNE_ENDPOINT – The Neptune cluster endpoint.
  • NEPTUNE_PORT – The Neptune port.
  • USE_IAM – Can be true or false. If your database has IAM database authentication enabled, supply a value of true. The Lambda function then signs connection requests to Neptune using Signature Version 4 (SigV4). (If the server closes the connection because the IAM credentials are stale, the function opens a new connection with refreshed credentials.) For IAM database authentication requests, ensure the Lambda function’s execution role has an appropriate IAM policy that allows the function to connect to your Neptune DB cluster using IAM database authentication. Ensure also that the function is running in a subnet with access to Neptune, and that the Neptune VPC security group allows ingress (on port 8182) from the Lambda function’s security group.


The Java driver by default maintains a pool of connections. Configure Cluster with minConnectionPoolSize(1) and maxConnectionPoolSize(1) so that the driver opens only a single connection.

The Cluster object can be slow to build because it creates one or more serializers (Gryo by default, plus another if you’ve configured it for serialization other than Gryo), which take a while to instantiate. (The unnecessary creation of a Gryo serializer is removed in TinkerPop 3.4.9.)

The connection pool is initialized with the first request. At this point, the driver sets up the Netty stack, allocates byte buffers, and creates a signing key (if using IAM DB authentication), all of which can add to latency.

The Java driver’s connection pool monitors the availability of server hosts. If a connection fails, the driver automatically attempts to reconnect to the database using a background task. You can use reconnectInterval() to configure the interval between reconnection attempts. While the driver is attempting to reconnect, your Lambda function can simply retry the query. (If the interval between retries is smaller than the interval between reconnect attempts, retries on a failed connection fail again because the host is still considered unavailable.)

Use Java 8 rather than Java 11; Netty optimizations are not enabled by default in Java 11.

This example uses Retry4j for retries.

To use the SigV4 signing driver in your Java Lambda function, see the dependency requirements in Connecting to Neptune Using Java and Gremlin with Signature Version 4 Signing.

The following is the Lambda function code in Java:


import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
import com.evanlennick.retry4j.CallExecutor;
import com.evanlennick.retry4j.CallExecutorBuilder;
import com.evanlennick.retry4j.Status;
import com.evanlennick.retry4j.config.RetryConfig;
import com.evanlennick.retry4j.config.RetryConfigBuilder;
import org.apache.tinkerpop.gremlin.driver.Cluster;
import org.apache.tinkerpop.gremlin.driver.SigV4WebSocketChannelizer;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.driver.ser.Serializers;
import org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.T;

import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.function.Function;

import static java.nio.charset.StandardCharsets.UTF_8;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.addV;
import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.unfold;

public class MyHandler implements RequestStreamHandler {

    // Created once per execution context and reused across invocations
    private final GraphTraversalSource g;
    private final CallExecutor<Object> executor;
    private final Random idGenerator = new Random();

    public MyHandler() {

        this.g = AnonymousTraversalSource
                .traversal()
                .withRemote(DriverRemoteConnection.using(createCluster()));

        this.executor = new CallExecutorBuilder<Object>()
                .config(createRetryConfig())
                .build();
    }

    @Override
    public void handleRequest(InputStream input,
                              OutputStream output,
                              Context context) throws IOException {

        doQuery(input, output);
    }

    // Application logic: build the query arguments and write the result
    private void doQuery(InputStream input, OutputStream output) throws IOException {
        try {

            Map<String, Object> args = new HashMap<>();
            args.put("id", idGenerator.nextInt());

            String result = query(args);

            try (Writer writer = new BufferedWriter(new OutputStreamWriter(output, UTF_8))) {
                writer.write(result);
            }

        } finally {
            input.close();
            output.close();
        }
    }

    private String query(Map<String, Object> args) {

        int id = (int) args.get("id");

        // Idempotent upsert: return the vertex if it exists, otherwise create it
        @SuppressWarnings("unchecked")
        Callable<Object> query = () -> g.V(id)
                .fold()
                .coalesce(
                        unfold(),
                        addV("Person").property(T.id, id))
                .id().next();

        Status<Object> status = executor.execute(query);

        return status.getResult().toString();
    }

    private Cluster createCluster() {
        // Pool size of 1 so the driver opens a single connection;
        // adjust the serializer and reconnect interval as needed
        Cluster.Builder builder = Cluster.build()
                .addContactPoint(System.getenv("NEPTUNE_ENDPOINT"))
                .port(Integer.parseInt(System.getenv("NEPTUNE_PORT")))
                .enableSsl(true)
                .minConnectionPoolSize(1)
                .maxConnectionPoolSize(1)
                .serializer(Serializers.GRAPHBINARY_V1D0)
                .reconnectInterval(2000);

        if (Boolean.parseBoolean(getOptionalEnv("USE_IAM", "true"))) {
            builder = builder.channelizer(SigV4WebSocketChannelizer.class);
        }

        return builder.create();
    }

    private RetryConfig createRetryConfig() {
        // Retry up to five times, waiting 1 second between tries
        return new RetryConfigBuilder()
                .retryOnCustomExceptionLogic(retryLogic())
                .withDelayBetweenTries(1000, ChronoUnit.MILLIS)
                .withMaxNumberOfTries(5)
                .withFixedBackoff()
                .build();
    }

    private Function<Exception, Boolean> retryLogic() {

        return e -> {

            StringWriter stringWriter = new StringWriter();
            e.printStackTrace(new PrintWriter(stringWriter));
            String message = stringWriter.toString();

            // Check for connection issues
            if (message.contains("Timed out while waiting for an available host") ||
                    message.contains("Timed-out waiting for connection on Host") ||
                    message.contains("Connection to server is no longer active") ||
                    message.contains("Connection reset by peer") ||
                    message.contains("SSLEngine closed already") ||
                    message.contains("Pool is shutdown") ||
                    message.contains("ExtendedClosedChannelException") ||
                    message.contains("Broken pipe")) {
                return true;
            }

            // Concurrent writes can sometimes trigger a ConcurrentModificationException.
            // In these circumstances you may want to back off and retry.
            if (message.contains("ConcurrentModificationException")) {
                return true;
            }

            // If the primary fails over to a new instance, existing connections to the old primary will
            // throw a ReadOnlyViolationException. You may want to back off and retry.
            if (message.contains("ReadOnlyViolationException")) {
                return true;
            }

            return false;
        };
    }

    private String getOptionalEnv(String name, String defaultValue) {
        String value = System.getenv(name);
        if (value != null && value.length() > 0) {
            return value;
        } else {
            return defaultValue;
        }
    }
}

The JavaScript driver doesn’t maintain a connection pool: it always opens a single connection. The Lambda function uses the SigV4 signing utilities from gremlin-aws-sigv4 for signing requests to a database with IAM database authentication enabled, and the retry function from the async module to handle backoff-and-retry attempts. Terminal steps return a promise. For next(), this promise resolves to a {value, done} tuple.

Connection errors are raised inside the handler, and dealt with using some backoff-and-retry logic in line with the recommendations outlined in this article, with one exception. There is one kind of connection issue that the driver does not treat as an exception, and which cannot therefore be accommodated by this backoff-and-retry logic.

The problem is that if a connection is closed after a driver sends a request, but before the driver receives a response, the query appears to complete, but with a null return value. As far as the Lambda function’s client is concerned, the function appears to complete successfully, but with an empty response.

The impact of this issue depends on how your application treats an empty response. Some applications may treat an empty response from a read request as an error, but others may mistakenly treat this as an empty result. Write requests too that encounter this connection issue will return an empty response. Does a successful invocation with an empty response signal success or failure? If the client invoking a write function simply treats the successful invocation of the function to mean the write to the database has been committed, rather than inspecting the body of the response, the system may appear to lose data.

The cause of this issue is in how the driver treats events emitted by the underlying socket. When the underlying network socket is closed with an ECONNRESET error, the WebSocket used by the driver is closed and emits a ‘ws close’ event. There’s nothing in the driver, however, to handle this event in a way that could be used to provoke an exception. As a result, the query ‘disappears’.

To work around this issue, the Lambda function shown here adds a ‘ws close’ event handler that throws an exception to the driver when creating a remote connection. This exception won’t, however, be raised along the Gremlin query’s request-response path, and can’t therefore be used to trigger any backoff-and-retry logic within the Lambda function itself. Instead, the exception thrown by the ‘ws close’ event handler results in an unhandled exception that causes the Lambda invocation to fail. This allows the client that invokes the function to handle the error and retry the Lambda invocation if appropriate.

This article recommends that you implement backoff-and-retry logic in your Lambda function to protect your clients from intermittent connection issues. The workaround for this issue stands outside these recommendations in that it requires the client to also implement some retry logic to handle functions that fail because of this particular connection issue.
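Client-side retry of the Lambda invocation could be sketched as follows. This is an outline under stated assumptions: invoke is a hypothetical callable that wraps however your client calls the function and returns a dictionary with ok and body keys. Treating an empty body as a failure is an application-level choice, not something the driver or Lambda enforces.

```python
import time
from typing import Callable, Dict


def invoke_with_retry(invoke: Callable[[], Dict[str, object]],
                      max_tries: int = 3,
                      delay_seconds: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> object:
    """Retry a Lambda invocation that failed or returned an empty body."""
    last: Dict[str, object] = {}
    for attempt in range(max_tries):
        last = invoke()
        # Treat both a failed invocation and an empty body as failures.
        if last.get("ok") and last.get("body") not in (None, ""):
            return last["body"]
        if attempt < max_tries - 1:
            sleep(delay_seconds)
    raise RuntimeError(
        "invocation failed after {} tries: {!r}".format(max_tries, last))
```

As with retries inside the function, retrying a write from the client is only safe if the underlying query is idempotent.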

See the following code:

const gremlin = require('gremlin');
const async = require('async');
const {getUrlAndHeaders} = require('gremlin-aws-sigv4/lib/utils');

const traversal = gremlin.process.AnonymousTraversalSource.traversal;
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const t = gremlin.process.t;
const __ = gremlin.process.statics;

// Declared outside the handler so they are reused across invocations
let conn = null;
let g = null;

async function query(context) {
    const id = context.id;

    // Idempotent upsert: return the vertex if it exists, otherwise create it
    return g.V(id)
        .fold()
        .coalesce(
            __.unfold(),
            __.addV('User').property(t.id, id)
        )
        .id().next();
}

async function doQuery() {
    const id = Math.floor(Math.random() * 10000).toString();
    let result = await query({id: id});
    return result['value'];
}

exports.handler = async (event, context) => {

    const getConnectionDetails = () => {
        if (process.env['USE_IAM'] == 'true'){
            // Arguments: host, port, credentials, canonical URI, protocol
            return getUrlAndHeaders(
                process.env['NEPTUNE_ENDPOINT'],
                process.env['NEPTUNE_PORT'],
                {},
                '/gremlin',
                'wss');
        } else {
            const database_url = 'wss://' + process.env['NEPTUNE_ENDPOINT'] + ':' + process.env['NEPTUNE_PORT'] + '/gremlin';
            return { url: database_url, headers: {}};
        }
    };

    const createRemoteConnection = () => {
        const { url, headers } = getConnectionDetails();

        const c = new DriverRemoteConnection(
            url,
            {
                mimeType: 'application/vnd.gremlin-v2.0+json',
                headers: headers
            });

        // Fail the invocation if the socket closes abnormally mid-request
        c._client._connection.on('close', (code, message) => {
            console.info(`close - ${code} ${message}`);
            if (code == 1006){
                console.error('Connection closed prematurely');
                throw new Error('Connection closed prematurely');
            }
        });

        return c;
    };

    const createGraphTraversalSource = (conn) => {
        return traversal().withRemote(conn);
    };

    if (conn == null){
        console.info("Initializing connection");
        conn = createRemoteConnection();
        g = createGraphTraversalSource(conn);
    }

    return async.retry(
        {
            times: 5,
            interval: 1000,
            errorFilter: function (err) {

                // Add filters here to determine whether error can be retried
                console.warn('Determining whether retriable error: ' + err.message);

                // Check for connection issues
                if (err.message.startsWith('WebSocket is not open')){
                    console.warn('Reopening connection');
                    conn = createRemoteConnection();
                    g = createGraphTraversalSource(conn);
                    return true;
                }

                // Check for ConcurrentModificationException
                if (err.message.includes('ConcurrentModificationException')){
                    console.warn('Retrying query because of ConcurrentModificationException');
                    return true;
                }

                // Check for ReadOnlyViolationException
                if (err.message.includes('ReadOnlyViolationException')){
                    console.warn('Retrying query because of ReadOnlyViolationException');
                    return true;
                }

                return false;
            }
        },
        doQuery);
};


The Python code uses the backoff module. Set pool_size=1 and message_serializer=serializer.GraphSONSerializersV2d0(). See the following code:

import os, sys, backoff, math
from random import randint
from gremlin_python import statics
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.protocol import GremlinServerError
from gremlin_python.driver import serializer
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.process.traversal import T
from tornado.websocket import WebSocketClosedError
from tornado import httpclient
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.credentials import ReadOnlyCredentials
from types import SimpleNamespace

# Errors that require a new connection before the query is retried
reconnectable_err_msgs = [
    'ReadOnlyViolationException',
    'Server disconnected',
    'Connection refused'
]

retriable_err_msgs = ['ConcurrentModificationException'] + reconnectable_err_msgs

network_errors = [WebSocketClosedError, OSError]

retriable_errors = [GremlinServerError] + network_errors

def prepare_iamdb_request(database_url):
    service = 'neptune-db'
    method = 'GET'

    access_key = os.environ['AWS_ACCESS_KEY_ID']
    secret_key = os.environ['AWS_SECRET_ACCESS_KEY']
    region = os.environ['AWS_REGION']
    session_token = os.environ['AWS_SESSION_TOKEN']

    creds = SimpleNamespace(
        access_key=access_key, secret_key=secret_key, token=session_token, region=region,
    )

    request = AWSRequest(method=method, url=database_url, data=None)
    SigV4Auth(creds, service, region).add_auth(request)

    return httpclient.HTTPRequest(database_url, headers=request.headers.items())

def is_retriable_error(e):

    is_retriable = False
    err_msg = str(e)

    if isinstance(e, tuple(network_errors)):
        is_retriable = True
    else:
        is_retriable = any(retriable_err_msg in err_msg for retriable_err_msg in retriable_err_msgs)

    print('error: [{}] {}'.format(type(e), err_msg))
    print('is_retriable: {}'.format(is_retriable))

    return is_retriable

def is_non_retriable_error(e):
    return not is_retriable_error(e)

def reset_connection_if_connection_issue(params):

    is_reconnectable = False

    e = sys.exc_info()[1]
    err_msg = str(e)

    if isinstance(e, tuple(network_errors)):
        is_reconnectable = True
    else:
        is_reconnectable = any(reconnectable_err_msg in err_msg for reconnectable_err_msg in reconnectable_err_msgs)

    print('is_reconnectable: {}'.format(is_reconnectable))

    if is_reconnectable:
        global conn
        global g
        try:
            conn.close()
        except Exception:
            pass  # the connection may already be broken
        conn = create_remote_connection()
        g = create_graph_traversal_source(conn)

# Retry up to five times, waiting 1 second between tries
@backoff.on_exception(backoff.constant,
    tuple(retriable_errors),
    max_tries=5,
    jitter=None,
    giveup=is_non_retriable_error,
    on_backoff=reset_connection_if_connection_issue,
    interval=1)
def query(**kwargs):

    id = kwargs['id']

    # Idempotent upsert: return the vertex if it exists, otherwise create it
    return (g.V(id)
        .fold()
        .coalesce(
            __.unfold(),
            __.addV('User').property(T.id, id)
        )
        .id().next())

def doQuery(event):
    return query(id=str(randint(0, 10000)))

def lambda_handler(event, context):
    return doQuery(event)

def create_graph_traversal_source(conn):
    return traversal().withRemote(conn)

def create_remote_connection():
    print('Creating remote connection')

    return DriverRemoteConnection(
        connection_string(),
        'g',
        pool_size=1,
        message_serializer=serializer.GraphSONSerializersV2d0())

def connection_string():

    database_url = 'wss://{}:{}/gremlin'.format(os.environ['NEPTUNE_ENDPOINT'], os.environ['NEPTUNE_PORT'])

    if 'USE_IAM' in os.environ and os.environ['USE_IAM'] == 'true':
        return prepare_iamdb_request(database_url)
    else:
        return database_url

# Created once per execution context, outside the handler
conn = create_remote_connection()
g = create_graph_traversal_source(conn)


This post updates the recommendations around querying Neptune using a Gremlin client from a Lambda function. It’s now good practice to use a single WebSocket connection for the lifetime of a Lambda execution context, with the function handling connection issues and retrying connections as necessary. The post includes sample Lambda functions written in Java, JavaScript, and Python, which you can use as templates for your own functions.

For links to documentation, blog posts, videos, and code repositories containing other samples and tools, see Amazon Neptune resources.

Before you begin designing your database, we also recommend that you consult the AWS Reference Architectures for Using Graph Databases GitHub repo, where you can inform your choices about graph data models and query languages, and browse examples of reference deployment architectures.

About the author

Ian Robinson is a Principal Graph Architect with Amazon Neptune. He is a co-author of ‘Graph Databases’ and ‘REST in Practice’ (both from O’Reilly) and a contributor to ‘REST: From Research to Practice’ (Springer) and ‘Service Design Patterns’ (Addison-Wesley).