AWS Compute Blog

Building a secure webhook forwarder using an AWS Lambda extension and Tailscale

This post is written by Duncan Parsons, Enterprise Architect, and Simon Kok, Sr. Consultant.

Webhooks can help developers to integrate with third-party systems or devices when building event based architectures.

However, there are times when control over the target’s network environment is restricted or targets change IP addresses. Additionally, some endpoints lack sufficient security hardening, requiring a reverse proxy and additional security checks to inbound traffic from the internet.

It can be complex to set up and maintain highly available secure reverse proxies to inspect and send events to these backend systems for multiple endpoints. This blog shows how to use AWS Lambda extensions to build a cloud native serverless webhook forwarder to meet this need with minimal maintenance and running costs.

The custom Lambda extension forms a secure WireGuard VPN connection to a target in a private subnet behind a stateful firewall and NAT Gateway. This example sets up a public HTTPS endpoint to receive events, selectively filters, and proxies requests over the WireGuard connection. This example uses a serverless architecture to minimize maintenance overhead and running costs.

Example overview

The sample code to deploy the following architecture is available on GitHub. This example uses AWS CodePipeline and AWS CodeBuild to build the code artifacts and deploys this using AWS CloudFormation via the AWS Cloud Development Kit (CDK). It uses Amazon API Gateway to manage the HTTPS endpoint and the Lambda service to perform the application functions. AWS Secrets Manager stores the credentials for Tailscale.

To orchestrate the WireGuard connections, you can use a free account on the Tailscale service. Alternatively, set up your own coordination layer using the open source Headscale example.

Reference architecture

  1. The event producer sends an HTTP request to the API Gateway URL.
  2. API Gateway proxies the request to the Lambda authorizer function. It returns an authorization decision based on the source IP of the request.
  3. API Gateway proxies the request to the Secure Webhook Forwarder Lambda function running the Tailscale extension.
  4. On initial invocation, the Lambda extension retrieves the Tailscale Auth key from Secrets Manager and uses that to establish a connection to the appropriate Tailscale network. The extension then exposes the connection as a local SOCKS5 port to the Lambda function.
  5. The Lambda extension maintains a connection to the Tailscale network via the Tailscale coordination server. Through this coordination server, all other devices on the network can be made aware of the running Lambda function and vice versa. The Lambda function is configured to refuse incoming WireGuard connections – read more about the --shields-up command here.
  6. Once the connection to the Tailscale network is established, the Secure Webhook Forwarder Lambda function proxies the request over the internet to the target using a WireGuard connection. The connection is established via the Tailscale Coordination server, traversing the NAT Gateway to reach the Amazon EC2 instance inside a private subnet. The EC2 instance responds with an HTML response from a local Python webserver.
  7. On deployment and every 60 days, Secrets Manager rotates the Tailscale Auth Key automatically. It uses the Credential Rotation Lambda function, which retrieves the OAuth Credentials from Secrets Manager and uses these to create a new Tailscale Auth Key using the Tailscale API and stores the new key in Secrets Manager.

To separate the network connection layer logically from the application code layer, a Lambda extension encapsulates the code required to form the Tailscale VPN connection and make this available to the Lambda function application code via a local SOCK5 port. You can reuse this connectivity across multiple Lambda functions for numerous use cases by attaching the extension.

To deploy the example, follow the instructions in the repository’s README. Deployment may take 20–30 minutes.

How the Lambda extension works

The Lambda extension creates the network tunnel and exposes it to the Lambda function as a SOCKS5 server running on port 1055. There are three stages of the Lambda lifecycle: init, invoke, and shutdown.

Lambda extension deep dive

With the Tailscale Lambda extension, the majority of the work is performed in the init phase. The webhook forwarder Lambda function has the following lifecycle:

  1. Init phase:
    1. Extension Init – Extension connects to Tailscale network and exposes WireGuard tunnel via local SOCKS5 port.
    2. Runtime Init – Bootstraps the Node.js runtime.
    3. Function Init – Imports required Node.js modules.
  2. Invoke phase:
    1. The extension intentionally doesn’t register to receive any invoke events. The Tailscale network is kept online until the function is instructed to shut down.
    2. The Node.js handler function receives the request from API Gateway in 2.0 format which it then proxies to the SOCKS5 port to send the request over the WireGuard connection to the target. The invoke phase ends once the function receives a response from the target EC2 instance and optionally returns that to API Gateway for onward forwarding to the original event source.
  3. Shutdown phase:
    1. The extension logs out of the Tailscale network and logs the receipt of the shutdown event.
    2. The function execution environment is shut down along with the Lambda function’s execution environment.

Extension file structure

The extension code exists as a zip file along with some metadata set at the time the extension is published as an AWS Lambda layer. The zip file holds three folders:

  1. /extensions – contains the extension code and is the directory that the Lambda service looks for code to run when the Lambda extension is initialized.
  2. /bin –includes the executable dependencies. For example, within the tsextension.sh script, it runs the tailscale, tailscaled, curl, jq, and OpenSSL binaries.
  3. /ssl –stores the certificate authority (CA) trust store (containing the root CA certificates that are trusted to connect with). OpenSSL uses these to verify SSL and TLS certificates.

The tsextension.sh file is the core of the extension. Most of the code is run in the Lambda function’s init phase. The extension code is split into three stages. The first two stages relate to the Lambda function init lifecycle phase, with the third stage covering invoke and shutdown lifecycle phases.

Extension phase 1: Initialization

In this phase, the extension initializes the Tailscale connection and waits for the connection to become available.

The first step retrieves the Tailscale auth key from Secrets Manager. To keep the size of the extension small, the extension uses a series of Bash commands instead of packaging the AWS CLI to make the Sigv4 requests to Secrets Manager.

The temporary credentials of the Lambda function are made available as environment variables by the Lambda execution environment, which the extension uses to authenticate the Sigv4 request. The IAM permissions to retrieve the secret are added to the Lambda execution role by the CDK code. To optimize security, the secret’s policy restricts reading permissions to (1) this Lambda function and (2) Lambda function that rotates it every 60 days.

The Tailscale agent starts using the Tailscale Auth key. Both the tailscaled and tailscale binaries start in userspace networking mode, as each Lambda function runs in its own container on its own virtual machine. More information about userspace networking mode can be found in the Tailscale documentation.

With the Tailscale processes running, the process must wait for the connection to the Tailnet (the name of a Tailscale network) to be established and for the SOCKS5 port to be available to accept connections. To accomplish this, the extension simply waits for the ‘tailscale status’ command not to return a message with ‘stopped’ in it and then moves on to phase 2.

Extension phase 2: Registration

The extension now registers itself as initialized with the Lambda service. This is performed by sending a POST request to the Lambda service extension API with the events that should be forwarded to the extension.

The runtime init starts next (this initializes the Node.js runtime of the Lambda function itself), followed by the function init (the code outside the event handler). In the case of the Tailscale Lambda extension, it only registers the extension to receive ‘SHUTDOWN’ events. Once the SOCKS5 service is up and available, there is no action for the extension to take on each subsequent invocation of the function.

Extension phase 3: Event processing

To signal the extension is ready to receive an event, a GET request is made to the ‘next’ endpoint of the Lambda runtime API. This blocks the extension script execution until a SHUTDOWN event is sent (as that is the only event registered for this Lambda extension).

When this is sent, the extension logs out of the Tailscale service and the Lambda function shuts down. If INVOKE events are also registered, the extension processes the event. It then signals back to the Lambda runtime API that the extension is ready to receive another event by sending a GET request to the ‘next’ endpoint.

Access control

A sample Lambda authorizer is included in this example. Note that it is recommended to use the AWS Web Application Firewall service to add additional protection to your public API endpoint, as well as hardening the sample code for production use.

For the purposes of this demo, the implementation demonstrates a basic source IP CIDR range restriction, though you can use any property of the request to base authorization decisions on. Read more about Lambda authorizers for HTTP APIs here. To use the source IP restriction, update the CIDR range of the IPs you want to accept on the Lambda authorizer function AUTHD_SOURCE_CIDR environment variable.

Costs

You are charged for all the resources used by this project. The NAT Gateway and EC2 instance are destroyed by the pipeline once the final pipeline step is manually released to minimize costs. The AWS Lambda Power Tuning tool can help find the balance between performance and cost while it polls the demo EC2 instance through the Tailscale network.

The following result shows that 256 MB of memory is the optimum for the lowest cost of execution. The cost is estimated at under $3 for 1 million requests per month, once the demo stack is destroyed.

Power Tuning results

Conclusion

Using Lambda extensions can open up a wide range of options to extend the capability of serverless architectures. This blog shows a Lambda extension that creates a secure VPN tunnel using the WireGuard protocol and the Tailscale service to proxy events through to an EC2 instance inaccessible from the internet.

This is set up to minimize operational overhead with an automated deployment pipeline. A Lambda authorizer secures the endpoint, providing the ability to implement custom logic on the basis of the request contents and context.

For more serverless learning resources, visit Serverless Land.