AWS Nitro Enclaves for running Ethereum validators – Part 2

In Part 1 of this series, we gave a high-level introduction to an AWS Nitro Enclaves-based Web3Signer blockchain validation and signing service. We explained the purpose of running blockchain validators and why Nitro Enclaves are well suited to running security-sensitive cryptographic workloads.

We also covered the high-level application architecture and briefly explained the secure bootstrapping and secure signing flows.

In the walkthrough on GitHub, we gave an end-to-end example of how to deploy and configure a Nitro Enclave-based Web3Signer solution.

In this post, we dive deep into three areas:

Web3Signer integration pattern – How different Ethereum validator nodes can be integrated with a single Web3Signer node
Secure bootstrapping of Web3Signer inside a Nitro Enclave – How security-sensitive configuration artifacts such as private keys can be injected into a Nitro Enclave in a secure fashion and how these config artifacts can be used for an HTTPS endpoint
Exposing Web3Signer HTTPS API over vsock – How Transport Layer Security (TLS) traffic can be tunnelled over a vsock connection in a transparent way to establish secure communication with the Web3Signer service running inside a Nitro Enclave

Integration with Web3Signer for Ethereum validation

To integrate with Web3Signer, validator clients need to support connecting to BLS remote signers (EIP-3030). Popular validator clients such as Lighthouse, Prysm, and Teku already support remote signing. Multiple validator clients can connect to the same Web3Signer instances behind a load balancer. It’s critical that no two validator clients are configured to use the same validator keys, because signing with the same key from more than one client can result in the validator being slashed. The slashing protection built into a validator client is not effective across multiple validator clients running simultaneously. For instance, the following diagram shows Validator Client 1 using the validator keys 0xa957cf9 and 0xb821c31, which means Validator Client 2 must not use those two keys. Instead, Validator Client 2 uses the key 0x8f1942c, which isn’t used by any other validator client.
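The uniqueness requirement can be checked before any validator client is started. The following is a minimal sketch, assuming a hypothetical mapping from validator client names to the public keys each one is configured to use. The function name and the data layout are illustrative and not part of the solution; the keys are the abbreviated ones from the example above.

```python
from collections import defaultdict


def find_shared_keys(assignments: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return every public key that is assigned to more than one validator client."""
    owners: dict[str, list[str]] = defaultdict(list)
    for client, pubkeys in assignments.items():
        for pubkey in pubkeys:
            # Hex strings are compared case-insensitively.
            owners[pubkey.lower()].append(client)
    return {key: sorted(clients) for key, clients in owners.items() if len(clients) > 1}


# Hypothetical assignment that mirrors the example in the text.
assignments = {
    "validator-client-1": {"0xa957cf9", "0xb821c31"},
    "validator-client-2": {"0x8f1942c"},
}

conflicts = find_shared_keys(assignments)
if conflicts:
    raise SystemExit(f"keys shared between validator clients: {conflicts}")
```

A check like this only covers static configuration; it does not replace slashing protection at the signer.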

In the accompanying AWS Cloud Development Kit (AWS CDK) code, Web3Signer isn’t configured to use client authentication. However, for production deployments, Web3Signer should be configured to use client authentication with one of the following methods:

Generate a certificate for each validator client – Create a known clients file with the certificates and load it into Web3Signer. This method is operationally more complex, but revoking a client only requires modifying the known clients file in Web3Signer.
Configure Web3Signer to allow validator clients with a trusted CA certificate to connect – This method is operationally easier; however, you need to maintain a private certificate authority (CA) to issue the certificates for each validator client. At the time of writing, certificate revocation for this method is still under development.

For more information on client authentication, refer to Configure-TLS.
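For the first method, each entry in the known clients file identifies one client certificate. The following sketch computes a SHA-256 certificate fingerprint with the Python standard library. It assumes that an entry has the form `<common name> <fingerprint>` with a colon-separated fingerprint; verify the exact format against the Web3Signer TLS documentation before relying on it. Both function names are hypothetical.

```python
import hashlib
import ssl


def sha256_fingerprint(cert_pem: str) -> str:
    """Return the colon-separated, upper-case SHA-256 fingerprint of a PEM certificate."""
    # The fingerprint is computed over the DER encoding, not over the PEM text.
    der = ssl.PEM_cert_to_DER_cert(cert_pem)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def known_clients_line(common_name: str, cert_pem: str) -> str:
    """Format one entry, assuming the layout '<common name> <fingerprint>'."""
    return f"{common_name} {sha256_fingerprint(cert_pem)}"
```

Revoking a validator client then amounts to deleting its line from the file.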

Secure bootstrapping of Web3Signer inside a Nitro Enclave

To run a Web3Signer service inside a Nitro Enclave in a secure fashion, the following two config artifacts are required:

A TLS key pair to secure the Web3Signer HTTPS API
EIP-2335 specified keystore files, containing encrypted BLS12-381 validator private keys

Both config artifacts need to be encrypted via a symmetric AWS Key Management Service (AWS KMS) key.

The key policy of the KMS key used for the symmetric encryption of the config artifacts needs to authorize kms:Decrypt operations that originate from inside the enclave and use cryptographic attestation. For additional details on how to configure the KMS key policy for this solution, refer to the solution walkthrough on GitHub.
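To illustrate what such a key policy statement can look like, the following sketch builds one as a Python dictionary. It uses the `kms:RecipientAttestation:ImageSha384` condition key, which AWS KMS evaluates against the enclave image measurement in the attestation document. The role ARN, the statement ID, and the function name are placeholders; the policy used by this solution is described in the walkthrough and may differ.

```python
import json


def attested_decrypt_statement(role_arn: str, image_sha384: str) -> dict:
    """Build a key policy statement that allows kms:Decrypt only for one enclave image."""
    return {
        "Sid": "AllowDecryptFromAttestedEnclave",  # illustrative statement ID
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": "kms:Decrypt",
        "Resource": "*",
        "Condition": {
            "StringEqualsIgnoreCase": {
                # Measurement of the enclave image file (the PCR0 value).
                "kms:RecipientAttestation:ImageSha384": image_sha384
            }
        },
    }


statement = attested_decrypt_statement(
    role_arn="arn:aws:iam::111122223333:role/ExampleEnclaveParentRole",  # placeholder
    image_sha384="<PCR0 value reported when the enclave image is built>",  # placeholder
)
print(json.dumps(statement, indent=2))
```

Because the condition is tied to the image measurement, rebuilding the enclave image changes the value and requires a policy update.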

In this solution, we use two Amazon DynamoDB tables to store the config artifacts: one for TLS keys (TLSKeyTable) and one for validator keys (ValidatorKeyTable).

TLSKeyTable uses the following schema:

{
	"key_id": "<>",
	"cert_pem": "<base64 encoded x509 certificate>",
	"encrypted_tls_key_b64": "<encrypted private key>"
}

ValidatorKeyTable uses the following schema:

{
	"Web3Signer_uuid": "<stack uuid>",
	"pubkey": "<public key>",
	"active": "<true/false>",
	"chain": "<ethereum network>",
	"datetime": "<time of creation>",
	"deposit_json_b64": "<>",
	"encrypted_key_password_mnemonic_b64": "<>"
}

As pointed out earlier, before the bootstrapping process can start, the TLS key pair and the EIP-2335 keystore files need to be encrypted with the symmetric KMS key, and the resulting ciphertexts need to be stored in the DynamoDB tables.
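The following sketch shows how a TLSKeyTable item could be assembled from these pieces. It makes several assumptions: the `encrypt` callable is a stand-in for a call to AWS KMS Encrypt with the symmetric key, the ciphertext is stored base64-encoded (as the attribute name suggests), and the certificate is passed in already encoded. The function name is hypothetical, and writing the item to DynamoDB is left out.

```python
import base64
from typing import Callable


def build_tls_key_item(
    key_id: str,
    cert_pem_b64: str,
    tls_private_key: bytes,
    encrypt: Callable[[bytes], bytes],
) -> dict:
    """Assemble a TLSKeyTable item; only the private key is encrypted."""
    ciphertext = encrypt(tls_private_key)  # assumed to wrap KMS Encrypt
    return {
        "key_id": key_id,
        "cert_pem": cert_pem_b64,
        # Binary ciphertext is base64-encoded so it can be stored as a string.
        "encrypted_tls_key_b64": base64.b64encode(ciphertext).decode("ascii"),
    }
```

Passing the encryption step in as a callable keeps the plaintext key handling in one place and makes the function easy to exercise without AWS credentials.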

By aligning on the schema definitions, different secure Web3Signer processes can be started, each with a unique TLS key and one or more associated validator keys, by passing just a single TLS key_id and an array of Web3Signer_uuid keys as parameters to the bootstrap process.

The following diagram depicts the bootstrapping flow.

The flow consists of eight steps:

The nitro-signing-server systemd service starts the watchdog.py process. The watchdog process reads the encrypted Web3Signer config assets (TLS keys, validator keys) from the DynamoDB tables.
The watchdog process starts the Web3Signer enclave. As soon as the enclave is up and running, the watchdog process calls the init operation of the enclave by sending the following payload. encrypted_tls_key and encrypted_validator_keys contain the config artifacts downloaded from the DynamoDB tables. credential represents temporary AWS security credentials gathered by the watchdog process.
```
payload = {
    "operation": "init",
    "credential": credential,
    "encrypted_tls_key": encrypted_tls_key,
    "encrypted_validator_keys": encrypted_validator_keys,
}
```
The enclave_init process running inside the Web3Signer enclave validates the incoming init request, ensuring that all required parameters have been included. If all required parameters are present, kmstool-enclave-cli is used to decrypt the TLS and validator keys that are passed to the Web3Signer enclave.
kmstool-enclave-cli establishes a secure outbound connection to AWS KMS via vsock-proxy, using cryptographic attestation to be authorized to decrypt the configuration artifacts.
If the kms:Decrypt operation using cryptographic attestation is successful, the decrypted files are converted into the right file format and written to the enclave’s in-memory file system. The enclave_init process then starts the Web3Signer process.
The Web3Signer process reads the config files, decrypts the encrypted private key in the keystore file, and starts the HTTPS listener on the enclave’s localhost interface.
The validator client must have Web3Signer configured as its remote signing solution, for example Lighthouse remote signing, as explained earlier in this post.
When configured correctly, the validator client can connect to the Web3Signer HTTPS endpoint exposed via https_proxy. For additional details on how to tunnel HTTPS traffic over a vsock connection, refer to the next section in this post.
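The request validation in the third step can be sketched as follows. The field names are taken from the init payload shown above; the function name and the error messages are hypothetical, and the actual enclave_init implementation may perform additional checks.

```python
REQUIRED_INIT_FIELDS = ("credential", "encrypted_tls_key", "encrypted_validator_keys")


def validate_init_request(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the request can be processed."""
    problems = []
    if payload.get("operation") != "init":
        problems.append("operation must be 'init'")
    for field in REQUIRED_INIT_FIELDS:
        # A missing key and an empty value are both rejected.
        if not payload.get(field):
            problems.append(f"missing or empty field: {field}")
    return problems
```

Rejecting incomplete requests before any call to kmstool-enclave-cli avoids partial bootstrapping, for example a decrypted TLS key without validator keys.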

Due to the transient nature of the Nitro Enclave’s in-memory file system, the bootstrapping process has to run again every time the enclave or the parent Amazon Elastic Compute Cloud (Amazon EC2) instance is restarted.

Exposing the Web3Signer HTTPS API over vsock

Vsock is a local communication channel between an EC2 parent instance and its enclaves. It is the only channel of communication that an enclave can use to interact with external services.
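On Linux, Python exposes vsock through the regular socket API, with an address given as a (CID, port) tuple. The following sketch shows how a process on the parent instance could exchange a JSON message with an enclave. The framing, where the sender closes its write side to mark the end of the message, is an assumption made for this example and is not necessarily what the solution uses. All function names are hypothetical.

```python
import json
import socket


def connect_to_enclave(cid: int, port: int) -> socket.socket:
    """Open a vsock stream connection; AF_VSOCK is only available on Linux."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    sock.connect((cid, port))
    return sock


def send_json(sock: socket.socket, message: dict) -> None:
    """Send one JSON document and mark its end by closing the write side."""
    sock.sendall(json.dumps(message).encode("utf-8"))
    sock.shutdown(socket.SHUT_WR)


def recv_json(sock: socket.socket) -> dict:
    """Read until the peer closes its write side, then decode the JSON document."""
    chunks = []
    while chunk := sock.recv(4096):
        chunks.append(chunk)
    return json.loads(b"".join(chunks))
```

Because `send_json` and `recv_json` work on any stream socket, they can be tried out locally with `socket.socketpair()` before running them against an enclave.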

To establish a secure connection, for example using TLS, with the Nitro Enclave from inside the parent instance or from an AWS Lambda function running in the same Amazon VPC as the EC2 parent instance, the encrypted traffic has to be forwarded over the vsock connection.

The following two sections explain in detail how HTTPS (TLS) inbound and outbound connections can be established using the vsock proxy processes. The steps correspond to the numbered states in the following diagram.

The diagram also depicts the listener/port configuration of all involved components configured during the AWS CDK deployment process.

Representations like 16:5000 refer to a vsock listener, where 16 is the 32-bit context identifier (CID) and 5000 is the port. A tuple like 0.0.0.0:443 refers to an internet socket listener, where 0.0.0.0 is the IPv4 address and 443 is the port.
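The two notations can be told apart mechanically. The following sketch parses either form into a small tuple; the `Listener` type and the `parse_listener` function are hypothetical helpers for illustration only.

```python
import ipaddress
from typing import NamedTuple, Union


class Listener(NamedTuple):
    family: str             # "vsock" or "inet"
    host: Union[int, str]   # CID for vsock, IPv4 address for inet
    port: int


def parse_listener(spec: str) -> Listener:
    """Parse '<CID>:<port>' or '<IPv4 address>:<port>'."""
    host, _, port = spec.rpartition(":")
    if host.isdigit():
        cid = int(host)
        if cid >= 2**32:
            raise ValueError(f"CID does not fit into 32 bits: {cid}")
        return Listener("vsock", cid, int(port))
    # Raises ValueError for anything that is not a valid IPv4 address.
    ipaddress.IPv4Address(host)
    return Listener("inet", host, int(port))


print(parse_listener("16:5000"))
print(parse_listener("0.0.0.0:443"))
```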

HTTPS inbound flow

The HTTPS inbound flow has the following steps:

In the given solution, https_proxy acts like a TCP proxy that translates between AF_INET and AF_VSOCK. It accepts incoming internet socket requests from the validator client on 0.0.0.0:443.
https_proxy establishes a vsock connection with a separate vsock_proxy process running inside the enclave listening on 16:5000. It then forwards all received TCP packets via this connection.
For each incoming vsock connection, vsock_proxy establishes an internet TCP connection with the Web3Signer HTTPS API, which listens on 127.0.0.1:9000.
Web3Signer responds via the established TCP connection.
After receiving the Web3Signer response, vsock_proxy forwards the TCP packets via the established vsock connection to https_proxy running on the parent instance.
https_proxy closes the vsock connection and forwards the TCP packets back to the requesting client.
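The translation between AF_INET and AF_VSOCK described in these steps can be sketched as a byte-level relay. This is a simplified illustration of the pattern, not the https_proxy implementation of the solution. The function names are hypothetical and error handling is reduced to a minimum. Because the relay only copies bytes, the TLS session stays end to end between the validator client and Web3Signer.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end-of-stream."""
    while data := src.recv(4096):
        dst.sendall(data)
    try:
        # Propagate the half-close so the other side sees end-of-stream too.
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the destination is already closed


def relay(client: socket.socket, upstream: socket.socket) -> None:
    """Forward traffic in both directions and return when both are finished."""
    back = threading.Thread(target=pump, args=(upstream, client), daemon=True)
    back.start()
    pump(client, upstream)
    back.join()
    client.close()
    upstream.close()


def serve(listen_port: int, enclave_cid: int, enclave_port: int) -> None:
    """Accept TCP clients and relay each one to a vsock listener (Linux only)."""
    with socket.create_server(("0.0.0.0", listen_port)) as server:
        while True:
            client, _ = server.accept()
            upstream = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
            upstream.connect((enclave_cid, enclave_port))
            threading.Thread(
                target=relay, args=(client, upstream), daemon=True
            ).start()
```

With the listener configuration from the diagram, the parent side would correspond to a call like `serve(443, 16, 5000)`. The proxy inside the enclave does the reverse: it accepts on vsock and connects to 127.0.0.1:9000.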

HTTPS outbound flow

Outbound HTTPS connections from the kmstool-enclave-cli binary to AWS KMS use a mechanism similar to the one explained in the previous section. The steps are as follows:

kmstool-enclave-cli establishes a vsock connection with the vsock_proxy process running on the EC2 parent instance listening on 3:8000.
vsock_proxy establishes a TCP connection with AWS KMS.
AWS KMS responds to vsock_proxy via the established TCP connection.
vsock_proxy forwards AWS KMS’s response via the open vsock connection to the enclave. kmstool-enclave-cli closes the vsock connection.

With this mechanism, HTTPS requests can be securely proxied into and out of the enclave.

It’s important to point out that the X.509 certificate used to verify the Web3Signer endpoint needs the hostname of the parent EC2 instance to be set as its common name (CN); otherwise, hostname validation will fail.
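From the perspective of a client, hostname validation is part of the TLS configuration. The following sketch builds a client-side TLS context with the Python standard library that verifies both the certificate chain and the hostname. The function name and the optional CA file parameter are assumptions for illustration; validator clients such as Lighthouse have their own configuration options for this.

```python
import ssl
from typing import Optional


def web3signer_client_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a TLS client context that verifies certificate chain and hostname."""
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Both settings are already the defaults for SERVER_AUTH; they are set
    # explicitly here to make the dependency on hostname validation visible.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context
```

When a socket is wrapped with this context, the `server_hostname` argument of `wrap_socket` has to be the hostname of the parent EC2 instance, because that is the name the certificate is issued for.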

Additionally, using this HTTPS tunnel mechanism and having Web3Signer terminate the TLS session inside the Nitro Enclave can lead to additional CPU load in the enclave due to the cryptographic operations.

Protecting sensitive configuration payloads, such as the validator private keys that Web3Signer requires, inside a Nitro Enclave is only one part of the picture: exposing the Web3Signer HTTPS API introduces new potential attack vectors. Authentication becomes critical because the HTTPS endpoint is exposed to the private subnet. We described the two authentication options supported by Web3Signer earlier in this post. It’s also important to be aware of the downstream dependencies used to build Web3Signer, for example, to avoid exposure to issues like the Heartbleed OpenSSL bug.

Different tools and open-source libraries are available that you can use to tunnel TLS connections over vsock:

traffic-forwarder.py – The script used in the AWS Nitro Enclaves Workshop
viproxy – An open-source Go(lang) library on GitHub
socat – A versatile Linux networking utility

Conclusion

In this post, we provided an in-depth explanation of how to integrate Web3Signer with validator clients. We also dove deep into the secure bootstrapping process and ended with a detailed explanation of how to expose the Web3Signer HTTPS API via vsock.

You can customize the solution and deploy your own Nitro Enclave-secured validator node on AWS!

About the Authors

David-Paul Dornseifer is a Blockchain Architect with the AWS Worldwide Specialist Solutions Architect organization. He focuses on helping customers design, deploy, and scale end-to-end blockchain solutions.

Aldred Halim is a Solutions Architect with the AWS Worldwide Specialist Solutions Architect organization. He works closely with customers in designing architectures and building components to ensure success in running blockchain workloads on AWS.

AWS Database Blog