AWS Database Blog

AWS Nitro Enclaves for secure blockchain key management: Part 3

In Part 1 of this series, we gave a high-level introduction to the AWS Nitro System and explained why Nitro is well suited for flexible and secure blockchain key management workloads.

In Part 2, we guided you through the steps to configure aspects like AWS Key Management Service (AWS KMS) key policies and how to sign your first Ethereum EIP-1559 transaction inside AWS Nitro Enclaves.

In this post, we dive deep into three areas:

  • Nitro Enclaves – How enclaves work and why they’re beneficial for critical blockchain operations.
  • Designing a Nitro Enclaves-based Ethereum signing application – How to architect an application to use Nitro Enclaves for secure processing, and how AWS Cloud Development Kit (AWS CDK) and Docker can support the deployment of its components.
  • Configuring a Nitro Enclaves runtime via Amazon EC2 user data – How to bootstrap the Nitro and Docker environment on an Amazon Elastic Compute Cloud (Amazon EC2) instance using EC2 user data, and deploy and monitor the application.

Solution overview

In this section, we take an in-depth look into Nitro Enclaves and how we need to architect our application so that it can benefit from the secure compute environment that Nitro Enclaves provides.

The goal of using Nitro Enclaves in a blockchain context is to take advantage of the isolated compute environment so that it’s exclusively exposed to the plaintext Ethereum private keys. Also, as depicted in the following figure, only the critical part of the application logic that requires access to the sensitive information (in this case the Ethereum private key) should be running inside the enclave—everything else should reside outside.

From a technical perspective, Nitro Enclaves can be seen as a fully isolated virtual machine running on a separate Nitro Hypervisor inside your EC2 instance.

Nitro Enclaves provides isolation by partitioning the CPU and memory of a single parent EC2 instance, and protects highly sensitive data against other users or applications that are running on the same instance. The environment is provably secure, and isn’t accessible to other applications, users, or processes running on the parent EC2 instance.

Each enclave runs an independent kernel and has exclusive access to memory and CPU resources. Enclaves have no external network connectivity, no persistent storage, and no user access, even with full AWS Identity and Access Management (IAM) permissions.

All data flowing into and out of an enclave moves across a local virtual socket (vsock) connection that terminates on the parent EC2 instance.

As shown in the preceding diagram, you also need to use this local channel if you want your enclave to communicate with external AWS services like AWS KMS. For additional information about Nitro Enclaves networking, refer to Secure local channel.

Even though the communication capabilities of Nitro Enclaves are limited to vsock, it’s important to point out that critical information should never be exposed over that channel, for example in a stack trace that is returned from the enclave via vsock.

Cryptographic attestation, the mechanism that proves the validity of an enclave image file and enables hash-based access control, requires the use of the AWS Nitro Enclaves SDK inside the enclave.

As of this writing, the SDK is only available in the programming language C. If using C is not an option, you can use the KMS Tool binary inside the enclave as well.

Both the SDK and the binary enrich the request to AWS KMS with the attestation document, which contains the public key of the enclave as well as the hash values. AWS KMS can then evaluate this special, enriched request.

Requests to AWS KMS that don’t originate from the SDK or KMS Tool don’t carry the cryptographic attestation document and will therefore be rejected by a key policy that requires attestation.

If the --debug-mode option has been set when starting the enclave, cryptographic attestation is not available either. In this case, the PCR_0 value needs to be set to 000[...]; otherwise, AWS KMS will reject the request.

From a blockchain application point of view, the isolation property is useful because it allows secure key handling inside the application without the risk of revealing the key to a malicious actor, for example one who has obtained a memory dump of the parent instance.

The cryptographic attestation feature allows the blockchain application developer or operator to apply fine-grained permission management on the level of enclave container hash values. That way, the integrity and validity of the blockchain application are ensured as well.

Prerequisites

To follow along with this post, we recommend reading Part 2 and deploying the described AWS CDK-based solution.

If you want to have the source code available on your local system without deploying the solution, you can clone the repository from GitHub:

git clone https://github.com/aws-samples/aws-nitro-enclave-blockchain-wallet.git

Nitro Enclaves-based Ethereum signing application

This section discusses how to design an application to use Nitro Enclaves, and how AWS CDK can support deploying these types of applications. As depicted in the solution diagram earlier, an application that wants to use Nitro Enclaves consists of two pieces: a server process that runs on the EC2 instance (for example, a Docker container), and an application that runs inside the enclave.

In this example, both components are built and deployed as Docker containers. That allows us to use AWS CDK to deploy the application.

In the next few sections, we deep dive into the http_server component, the signing_server (enclave) component, and how to manage Docker containers with AWS CDK.

http_server

As shown in the solution diagram, http_server is run as a Docker container on the EC2 host instance. Its sources are located in the ./application/eth1/server folder.

The http_server component has the simple responsibility of offering a secure interface for external processes like AWS Lambda functions or Amazon API Gateway. The http_server main application file is located at application/eth1/server/app.py. Requests to the http_server are restricted to the VPC by EC2 security groups.

The application first spins up an HTTPS endpoint, which is protected by a self-signed x509 certificate that is passed into the Docker container:

httpd.socket = ssl.wrap_socket(httpd.socket,
                               server_side=True,
                               certfile='/etc/pki/tls/certs/localhost.crt',
                               ssl_version=ssl.PROTOCOL_TLS)
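
Note that ssl.wrap_socket() is deprecated and was removed in Python 3.12. If you adapt the sample to a newer Python runtime, an ssl.SSLContext can be used instead. The following is a minimal sketch, assuming the certificate file contains both the private key and the certificate (as produced by make-dummy-cert later in this post); http.server.BaseHTTPRequestHandler stands in for the sample’s handler class:

import http.server
import ssl

# Minimal sketch: equivalent HTTPS setup with ssl.SSLContext instead of the
# removed ssl.wrap_socket().
httpd = http.server.HTTPServer(("0.0.0.0", 443), http.server.BaseHTTPRequestHandler)
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
# localhost.crt is assumed to contain both the private key and the certificate.
context.load_cert_chain(certfile="/etc/pki/tls/certs/localhost.crt")
httpd.socket = context.wrap_socket(httpd.socket, server_side=True)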

On the HTTPS endpoint, the http_server process waits for incoming POST requests containing the required parameters for an Ethereum asset transfer or smart contract invoke. As shown in the following code example, if the required parameters are present, the call_enclave() function is triggered:

def do_POST(self):
    content_length = int(self.headers['Content-Length'])
    post_data = self.rfile.read(content_length)
    […]
    plaintext_json = call_enclave(16, 5000, payload)
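
To illustrate the interface, the following is a minimal sketch of how a caller inside the VPC might invoke the endpoint. The field names in the payload are illustrative assumptions; the exact schema is defined by the sample application:

import json
import requests  # third-party HTTP client, used here for brevity

# Hypothetical request body; adjust the fields to the sample's actual schema.
payload = {
    "transaction_payload": {
        "to": "0x<recipient address>",
        "value": 10000000000000000,  # amount in wei
        "nonce": 0,
        "gas": 21000,
    }
}

# verify=False only because the endpoint uses a self-signed certificate.
response = requests.post("https://<ec2-private-ip>:443",
                         data=json.dumps(payload),
                         verify=False,
                         timeout=30)
print(response.text)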

call_enclave() consumes the Context Identifier (CID), port, and payload. As explained in the Linux vsock manual, a socket address is defined as a combination of a 32-bit CID and a 32-bit port number. Therefore, the CID can be seen as similar to an IP address to connect to a process that is listening on a certain port, except that it’s limited to virtual machines that are running on the same Linux host.

The CID is incremented with every new enclave that is started if it hasn’t been provided as a start parameter. By default, the first enclave is assigned CID 16.

If the enclave is not capable of processing concurrent requests, scalability can be achieved by starting several enclaves with different CIDs on the same parent instance. http_server can then load balance over these CIDs, for example as sketched in the following snippet.
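
The sample itself targets a single enclave, but a simple round-robin over several CIDs could look like the following sketch. The CID list is an assumption, and call_enclave() is the function described earlier:

import itertools

# Hypothetical CIDs of several enclaves started on the same parent instance.
ENCLAVE_CIDS = [16, 17, 18]
ENCLAVE_PORT = 5000

_cid_cycle = itertools.cycle(ENCLAVE_CIDS)

def call_next_enclave(payload):
    # Round-robin over the available enclaves; add locking if the server is multithreaded.
    cid = next(_cid_cycle)
    return call_enclave(cid, ENCLAVE_PORT, payload)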

As depicted in the following code example, the encrypted Ethereum key is downloaded from AWS Secrets Manager, and the temporary security credentials of the EC2 instance role are retrieved from the EC2 instance metadata service:

payload["credential"] = get_aws_session_token()
payload["transaction_payload"] = enclave_payload["transaction_payload"]
payload["encrypted_key"] = encrypted_key

Then a new vsock connection is opened using the provided CID and port parameters. The gathered parameters and the external payload are combined and passed to the enclave:

s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
s.connect((cid, port))
s.send(str.encode(json.dumps(payload)))

The client socket then waits on recv() and returns the result as soon as the enclave sends it back:

payload_processed = s.recv(1024).decode()
s.close()

return payload_processed
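
Note that recv(1024) returns at most 1,024 bytes per call. If a signed payload can grow beyond that, a sketch like the following reads until the enclave closes the connection (assuming the enclave closes the socket after sending its response):

def recv_all(sock, buffer_size=1024):
    # Read from the vsock until the peer closes the connection.
    chunks = []
    while True:
        chunk = sock.recv(buffer_size)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks).decode()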

If successful, the signed transaction payload returned by the enclave is sent back as the result of the sign transaction POST request:

plaintext_json = call_enclave(16, 5000, payload)

self._set_response()
self.wfile.write(plaintext_json.encode("utf-8"))

signing_server

As depicted in the solution diagram, the signing_server component is run as a Nitro enclave on the EC2 parent instance. Its sources are located in the /application/eth1/enclave folder.

It’s important to point out that signing_server is a Dockerized Python application. In the Dockerfile, located in application/eth1/enclave/Dockerfile, the previously generated kmstool_enclave_cli and associated libnsm.so are copied into the Docker container. Furthermore, a few additional dependencies must be installed, like gcc or python3-devel. These are required for the compilation of the low-level crypto portion of the web3py library used in this example.

To avoid installing build-only dependencies in the final enclave Docker image, which also results in smaller image sizes, you can use Docker multi-stage builds.

The main application file of the signing_server (enclave) Docker image is application/eth1/enclave/server.py. After it starts, the application listens for incoming requests on the vsocket:

s.listen()

while True:
    c, addr = s.accept()
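
The listener setup itself is not shown above. A minimal sketch of creating and binding the vsock server socket inside the enclave, before the s.listen() call shown above, could look like the following; the port is the one used by call_enclave() on the parent instance:

import socket

ENCLAVE_PORT = 5000  # must match the port that http_server connects to

s = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
# VMADDR_CID_ANY accepts connections addressed to the enclave's own CID.
s.bind((socket.VMADDR_CID_ANY, ENCLAVE_PORT))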

After a request has been sent on the vsocket connection defined by the CID, the payload is decoded. The enclave then uses the kms_call(credential, key_encrypted) method to decrypt the Ethereum key ciphertext, which has been passed by the http_server instance along with all the Ethereum transaction parameters.

Inside the kms_call() method, the kmstool_enclave_cli binary is called:

subprocess_args = [
    "/app/kmstool_enclave_cli",
    "--region", os.getenv("REGION", "us-east-1"),
    "--proxy-port", "8000",
    "--aws-access-key-id", aws_access_key_id,
    "--aws-secret-access-key", aws_secret_access_key,
    "--aws-session-token", aws_session_token,
    "--ciphertext", ciphertext,
[…]
proc = subprocess.Popen(
    subprocess_args,
    stdout=subprocess.PIPE
)

kmstool_enclave_cli consumes the Ethereum key ciphertext and turns it into an AWS KMS decrypt request augmented with the cryptographic attestation document from inside the enclave.

It’s important to point out that the KMS key ID doesn’t need to be specified for the decrypt operation because it’s already contained in the ciphertext metadata.
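
Reading the result back from the subprocess might look like the following sketch. The exact stdout format of kmstool_enclave_cli depends on the version you build, so treat the parsing and the base64 decoding as assumptions to verify against your binary:

import base64

# proc is the subprocess.Popen object created above with stdout=subprocess.PIPE.
stdout, _ = proc.communicate()
output = stdout.decode().strip()
# Some versions prefix the result with a label such as "PLAINTEXT:"; keep only the value.
plaintext_b64 = output.split(":")[-1].strip()
key_plaintext = base64.standard_b64decode(plaintext_b64).decode()  # assumed to be a hex-encoded key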

If successfully decrypted, the plaintext Ethereum key is passed to the web3 transaction signing method along with the transaction parameters:

transaction_signed = w3.eth.account.sign_transaction(transaction_dict, key_plaintext)
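
For context, a minimal sketch of what transaction_dict might contain for an EIP-1559 transfer is shown below. The values are placeholders; the real parameters arrive in the request payload. The presence of maxFeePerGas and maxPriorityFeePerGas makes web3py treat this as a type-2 (EIP-1559) transaction:

from web3 import Web3

w3 = Web3()  # no provider needed; signing happens offline inside the enclave

# Placeholder EIP-1559 transaction parameters for illustration only.
transaction_dict = {
    "chainId": 11155111,                 # example: Sepolia testnet
    "nonce": 0,
    "to": "0x0000000000000000000000000000000000000000",
    "value": 10000000000000000,          # 0.01 ETH in wei
    "gas": 21000,
    "maxFeePerGas": 30000000000,         # 30 gwei
    "maxPriorityFeePerGas": 2000000000,  # 2 gwei
}

transaction_signed = w3.eth.account.sign_transaction(transaction_dict, key_plaintext)
print(transaction_signed.hash.hex())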

The signed transaction payload and hash value are then returned via the vsocket connection:

c.send(str.encode(json.dumps(response_plaintext)))

Manage Docker containers with AWS CDK

As shown in the following code example taken from the AWS CDK main stack definition located in ./nitro_wallet/nitro_wallet_stack.py, http_server and signing_server are registered as DockerImageAsset:

signing_server_image = aws_ecr_assets.DockerImageAsset(
    self,
    "EthereumSigningServerImage",
    directory="./application/{}/server".format(application_type),
    build_args={"REGION_ARG": region},
)

In a later step, the pull permission is granted to the EC2 instance role:

signing_enclave_image.repository.grant_pull(role)

As a result of this registration, the Docker build command runs locally on the machine used to deploy the AWS CDK stack via the cdk deploy devNitroWalletEth command.

If successful, the Docker images are then pushed to an Amazon Elastic Container Registry (Amazon ECR) repository located in the specified Region.

The grant_pull(role) permissions that have been added allow the EC2 instance role to request Docker pull credentials later on, as explained in the next section.
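
The exact wiring is described in Part 1, but as a rough sketch, the substitution of the placeholders and the hand-over of the rendered script to the EC2 instance could look like the following. Construct names, instance type, and AMI choice are assumptions, and Nitro Enclaves support must additionally be enabled on the instance (omitted here):

from aws_cdk import aws_ec2

# Read the user data template and substitute the placeholder tokens with the
# ECR image URIs produced by the DockerImageAsset constructs.
with open("./user_data/user_data.sh") as f:
    user_data_raw = f.read()

user_data_rendered = (
    user_data_raw
    .replace("__SIGNING_SERVER_IMAGE_URI__", signing_server_image.image_uri)
    .replace("__SIGNING_ENCLAVE_IMAGE_URI__", signing_enclave_image.image_uri)
)

instance = aws_ec2.Instance(
    self,
    "SigningServerInstance",  # hypothetical construct ID
    instance_type=aws_ec2.InstanceType("m5.xlarge"),  # any Nitro Enclaves-capable type
    machine_image=aws_ec2.AmazonLinuxImage(generation=aws_ec2.AmazonLinuxGeneration.AMAZON_LINUX_2),
    vpc=vpc,    # assumed to be defined elsewhere in the stack
    role=role,  # the instance role that was granted pull permissions above
    user_data=aws_ec2.UserData.custom(user_data_rendered),
)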

Configure the Nitro runtime via EC2 user data

After designing the Nitro Enclaves application, the next step is the Amazon EC2 configuration and deployment.

In this example, the configuration and deployment is handled based on AWS CDK infrastructure and EC2 user data scripts. It’s also possible to automate these steps, for example with AWS OpsWorks, AWS CodeDeploy, or other services.

User data scripts can be passed to an EC2 instance at launch. They’re run during the initial EC2 instance creation and, by default, are not run during subsequent restarts. If you need to run user data with every start, for example to start processes, you can configure the instance to do so. For more information, refer to How can I utilize user data to automatically run a script with every restart of my Amazon EC2 Linux instance.

The user data script is located in /user_data/user_data.sh. It’s also important to point out that user data is limited to 16 KB.

As shown in the following code snippet, user data is treated like a standard bash script, and therefore needs to start with the #! characters and the path to the interpreter. User data scripts are run with root user permissions, so no sudo command is required. For more general information about user data, refer to User data and shell scripts.

#!/bin/bash
exec > >(tee /var/log/user-data.log | logger -t user-data -s 2>/dev/console) 2>&1

The second line in the preceding code example sends user data output to the console logs on the running EC2 instance, and is therefore very helpful for analyzing and debugging user data scripts.

In this section, we walk through the steps in the user data script that are run automatically during the initial launch of an EC2 instance to configure and run the required application components. The high-level steps are as follows:

  1. Install the required packages and service configuration on the EC2 instance.
  2. Configure Nitro Enclaves.
  3. Deploy the application Docker containers and create an enclave.
  4. Provision Nitro Enclaves monitoring and start the signing_server enclave.
  5. Generate a certificate and start http_server.

Install the required packages and service configuration for Nitro Enclaves

As shown in the following code example, the operating system is instructed to install new packages and updates. EC2 instances that are prepared for Nitro Enclaves need docker and aws-nitro-enclaves-cli installed.

If EC2 instances don’t have outgoing access to the internet, you can use an AMI that has these packages preinstalled. You can generate custom AMIs from running EC2 instances (for example, using EC2 Image Builder).

Along with the installation of required packages, systemctl is used to enable and start the Docker service. Enabled systemd services are started automatically after each restart of the EC2 instance.

Also, the user ec2-user is added to the docker and ne (Nitro Enclaves) groups, which grants it permission to access the related services:

amazon-linux-extras install docker
amazon-linux-extras enable aws-nitro-enclaves-cli
yum install -y aws-nitro-enclaves-cli aws-nitro-enclaves-cli-devel htop git mod_ssl

usermod -aG ne ec2-user

Configure Nitro Enclaves

By default, nitro-enclaves-allocator allocates 2048 MB of memory from its EC2 parent instance to run enclaves in. The amount of required memory depends on the size of the enclave .eif file. The example discussed in this post requires 4096 MB of memory. If the enclave doesn’t have sufficient memory available, an error is thrown along with the minimum memory requirements.

ALLOCATOR_YAML=/etc/nitro_enclaves/allocator.yaml
[…]
DEFAULT_MEM=4096
DEFAULT_CPU=2
[…]
systemctl start nitro-enclaves-allocator.service
systemctl enable nitro-enclaves-allocator.service

In addition to nitro-enclaves-allocator, the provided nitro-enclaves-vsock-proxy service is started and enabled. The purpose of nitro-enclaves-vsock-proxy is explained in Part 1.

For more information on the vsock_proxy binary, refer to Vsock Proxy.

Deploy application Docker containers and create an enclave

The next code snippet is deployment related. First, a separate shell script is saved on the EC2 instance file system in /home/ec2-user/app/server/build_signing_server_enclave.sh. The end of the embedded shell script is indicated by the EOF heredoc marker.

cd /home/ec2-user

if [[ ! -d ./app/server ]]; then
  mkdir -p ./app/server

  cd ./app/server
  cat <<'EOF' >>build_signing_server_enclave.sh
[…]
EOF
[…]
  fi

In the build_signing_server_enclave.sh script, the AWS account ID is determined first by using the AWS Command Line Interface (AWS CLI) to run the aws sts get-caller-identity command. The output is then filtered for Account using jq.

The Region inside the script is determined by querying the EC2 instance metadata service and filtered for region via jq.

account_id=$( aws sts get-caller-identity | jq -r '.Account' )
region=$( curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r '.region' )

The information about the account and the Region that the EC2 instance is located in is then used to fetch a Docker authentication token for the ECR repository that is used for the AWS CDK Docker images discussed in the previous section:

aws ecr get-login-password --region $region | docker login --username AWS --password-stdin $account_id.dkr.ecr.$region.amazonaws.com

The authenticated Docker client is then used to pull the server and enclave Docker image. It’s important to point out that the placeholder values __SIGNING_SERVER_IMAGE_URI__ and __SIGNING_ENCLAVE_IMAGE_URI__ have been replaced by the proper Docker image URIs in the entire user-data.sh file already. For more information about this variable substitution, see Part 1.

docker pull ${__SIGNING_SERVER_IMAGE_URI__}
docker pull ${__SIGNING_ENCLAVE_IMAGE_URI__}

After the Docker images are available on the local file system, nitro-cli is instructed to turn the enclave Docker image into a Nitro Enclaves .eif file:

nitro-cli build-enclave --docker-uri ${__SIGNING_ENCLAVE_IMAGE_URI__} --output-file signing_server.eif

After the script is persisted on the Amazon EC2 file system, execute permissions are added and ownership is granted to ec2-user via chown. The script is then run with ec2-user permissions:

sudo -H -u ec2-user bash -c "cd /home/ec2-user/app/server && ./build_signing_server_enclave.sh"

Provision Nitro Enclaves monitoring and start the signing_server enclave

Here, a simple systemd service definition is added to the EC2 instance file system, along with a simple enclave watchdog script. The systemd service definition is stored in /etc/systemd/system/nitro-signing-server.service.

The same provisioning pattern is applied as explained in the previous section.

The following service definition instructs systemd to start and monitor the Python script located in /home/ec2-user/app/watchdog.py:

[Service]
Type=simple
ExecStart=/home/ec2-user/app/watchdog.py
Restart=always

The systemd service starts and constantly monitors the watchdog.py process. The watchdog.py process itself is responsible for starting and monitoring the enclave.

The following code example shows that watchdog.py calls the nitro-cli run-enclave command to initially start the enclave:

def nitro_cli_run_call():
    subprocess_args = [
        "/bin/nitro-cli",
        "run-enclave",
        "--cpu-count", "2",
        "--memory", "3806",
        "--eif-path", "/home/ec2-user/app/server/signing_server.eif",
        "--enclave-cid", "16"
    ]

After the initial start of the enclave, watchdog.py changes into a monitoring loop, where it checks the status of the signing_server enclave every 5 seconds:

nitro_cli_run_call()

while nitro_cli_describe_call("signing_server"):
    time.sleep(5)

In this example, the enclave monitoring step is conducted by checking if an enclave with the name signing_server exists and if its state is running. Because the enclave is running in an isolated environment, you can’t use Docker or systemd directly to keep the enclave up and running.

The status of the enclave is determined by a nitro-cli describe-enclaves call:

def nitro_cli_describe_call(name=None):
    subprocess_args = [
        "/bin/nitro-cli",
        "describe-enclaves"
    ]
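
A possible completion of this check, shown as a sketch only, parses the JSON output and looks for a running enclave with the expected name. The EnclaveName and State fields follow the nitro-cli describe-enclaves JSON output, but verify them against your nitro-cli version:

import json
import subprocess

def nitro_cli_describe_call(name=None):
    # Return True if an enclave with the given name is in the RUNNING state.
    proc = subprocess.Popen(["/bin/nitro-cli", "describe-enclaves"], stdout=subprocess.PIPE)
    stdout, _ = proc.communicate()

    enclaves = json.loads(stdout.decode() or "[]")
    for enclave in enclaves:
        if enclave.get("EnclaveName") == name and enclave.get("State") == "RUNNING":
            return True
    return False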

If the signing_server enclave isn’t available, watchdog.py exits. systemd recognizes that the process has stopped and instantly restarts it, which in turn starts a new enclave.

To make sure that watchdog.py and the associated signing_server enclave are automatically started after the EC2 instance has been restarted, nitro-signing-server.service needs to be registered with systemctl:

systemctl start nitro-signing-server.service
systemctl enable nitro-signing-server.service

Generate a certificate and start http_server

The last code snippet consists of two statements. First, a self-signed x509 certificate is created for the HTTPS endpoint of http_server. Note that for a production setup, a valid x509 certificate from a trusted certificate authority must be used.

Second, the http_server Docker container is started using the docker run command. The local /etc/pki/tls/certs folder is mounted into the container to make the self-signed x509 certificate available. Also, the --restart unless-stopped flag is passed, which means Docker automatically restarts the container after the host EC2 instance restarts, unless the container was explicitly stopped. Docker recommends using this daemon-based approach over external process managers such as systemd.

cd /etc/pki/tls/certs
./make-dummy-cert localhost.crt

docker run -d --restart unless-stopped --name http_server -v /etc/pki/tls/certs/:/etc/pki/tls/certs/ -p 443:443 ${__SIGNING_SERVER_IMAGE_URI__}

After the initial user data script has run successfully, the EC2 instance has been configured for Nitro Enclaves. Also, the enclave and http_server are running.

Clean up

To avoid incurring future charges, delete the resources using the AWS CDK with the following command:

cdk destroy

You can also delete the stacks deployed by the AWS CDK via the AWS CloudFormation console.

Conclusion

In this post, we took a deep dive into Nitro Enclaves, how the cryptographic attestation feature works, and how to use it via the kmstool_enclave_cli binary. We also explained what an application architecture using Nitro Enclaves should look like and how AWS CDK can support the deployment process.

We ended with a deep dive into the Nitro Enclaves requirements and an explanation of how to use EC2 user data for the configuration and deployment of Nitro Enclaves.

You can now deploy a modified version of the Nitro enclave and sign your first customized Nitro Enclaves-backed Ethereum transaction!


About the Author

David Dornseifer is a Blockchain Architect with the Amazon ProServe Blockchain team. He focuses on helping customers design, deploy and scale end-to-end Blockchain solutions.