Round 2 Hybrid Post-Quantum TLS Benchmarks

January 25, 2023: AWS KMS, ACM, and Secrets Manager TLS endpoints have been updated to only support NIST’s Round 3 picked KEM, Kyber. s2n-tls and s2n-quic have also been updated to only support Kyber. BIKE or other KEMs may still be added as the standardization proceeds.

AWS Cryptography has completed benchmarks of the Round 2 versions of the Bit Flipping Key Encapsulation (BIKE) and Supersingular Isogeny Key Encapsulation (SIKE) hybrid post-quantum Transport Layer Security (TLS) algorithms. Both of these algorithms have been submitted to the National Institute of Standards and Technology (NIST) as part of NIST’s Post-Quantum Cryptography standardization process.

In the first hybrid post-quantum TLS blog, we announced that AWS Key Management Service (KMS) had launched support for hybrid post-quantum TLS 1.2 using Round 1 versions of BIKE and SIKE. In this blog, we are announcing AWS Cryptography’s benchmark results of using Round 2 versions of BIKE and SIKE with hybrid post-quantum TLS 1.2 against an HTTP webservice. Round 2 versions of BIKE and SIKE include performance improvements, parameter tuning, and algorithm updates in response to NIST’s comments on Round 1 versions. I’ll give a refresher on hybrid post-quantum TLS 1.2, go over our Round 2 hybrid post-quantum TLS 1.2 benchmark results, and then describe our benchmarking methodology.

This blog post is intended to inform software developers, AWS customers, and cryptographic researchers about the potential upcoming performance differences between classical and hybrid post-quantum TLS.

Refresher on Hybrid Post-Quantum TLS 1.2

Some of this section is repeated from the previous hybrid post-quantum TLS 1.2 launch announcement for KMS. If you are already familiar with hybrid post-quantum TLS, feel free to skip to the Benchmark Results section.

What is Hybrid Post-Quantum TLS 1.2?

Figure 1: Differences in the master secret derivation process between classical and hybrid post-quantum TLS 1.2

Hybrid post-quantum TLS 1.2 is a proposed extension to the TLS 1.2 Protocol implemented by Amazon’s open source TLS library s2n that provides the security protections of both the classical and post-quantum schemes. It does this by performing two independent key exchanges (one classical and one post-quantum), and then cryptographically combining both keys into a single TLS master secret.
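The combining step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not s2n's implementation: it assumes the two shared secrets are concatenated into one premaster secret and expanded with the standard TLS 1.2 PRF from RFC 5246 using SHA-384. The label string, the seed layout, and the function names are hypothetical choices made for the example.

```python
import hashlib
import hmac


def p_hash(secret: bytes, seed: bytes, length: int, hash_name: str = "sha384") -> bytes:
    """P_hash data expansion as defined in RFC 5246, section 5."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hash_name).digest()            # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hash_name).digest()  # HMAC(secret, A(i) + seed)
    return out[:length]


def hybrid_master_secret(
    ecdhe_secret: bytes,
    pq_secret: bytes,
    client_random: bytes,
    server_random: bytes,
    label: bytes = b"hybrid master secret",  # assumed label, for illustration only
) -> bytes:
    """Derive a 48-byte master secret from both key exchange outputs."""
    # Assumption: the premaster secret is the classical secret followed by
    # the post-quantum secret, so the result depends on both of them.
    premaster = ecdhe_secret + pq_secret
    return p_hash(premaster, label + client_random + server_random, 48)
```

Because both secrets feed the same PRF input, an attacker who recovers only one of them (for example, the ECDHE secret) still cannot recompute the master secret.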

Why is Post-Quantum TLS Important?

Hybrid post-quantum TLS allows connections to remain secure even if one of the key exchanges (either classical or post-quantum) performed during the TLS Handshake is compromised in the future. For example, if a sufficiently large-scale quantum computer were to be built, it could break the current classical public-key cryptography that is used for key exchange in every TLS connection today. Encrypted TLS traffic recorded today could be decrypted in the future with a large-scale quantum computer if post-quantum TLS is not used to protect it.

Round 2 Hybrid Post-Quantum TLS Benchmark Results

Figure 2: Latency in relation to HTTP request count for four key exchange algorithms

Key Exchange Algorithm	Server PQ Implementation	TLS Handshake + 1 HTTP Request	TLS Handshake + 2 HTTP Requests	TLS Handshake + 10 HTTP Requests	TLS Handshake + 25 HTTP Requests
ECDHE Only	N/A	10.8 ms	15.1 ms	52.6 ms	124.2 ms
ECDHE + BIKE1‑CCA‑L1‑R2	C	19.9 ms	24.4 ms	61.4 ms	133.2 ms
ECDHE + SIKE‑P434‑R2	C	169.6 ms	180.3 ms	219.1 ms	288.1 ms
ECDHE + SIKE‑P434‑R2	x86-64 Assembly	20.1 ms	24.5 ms	62.0 ms	133.3 ms

Table 1 shows the time (in milliseconds) that a client and server in the same region take to complete a TCP Handshake, a TLS Handshake, and a varying number of HTTP Requests sent to an HTTP web service running on an i3en.12xlarge host.

Key Exchange Algorithm	Client Hello	Server Key Exchange	Client Key Exchange	Other	TLS Handshake Total
ECDHE Only	218	338	75	2430	3061
ECDHE + BIKE1‑CCA‑L1‑R2	220	3288	3023	2430	8961
ECDHE + SIKE‑P434‑R2	214	672	423	2430	3739

Table 2 shows the amount of data (in bytes) used by different messages in the TLS Handshake for each Key Exchange algorithm.

	1 HTTP Request	2 HTTP Requests	10 HTTP Requests	25 HTTP Requests
HTTP Request Bytes	878	1,761	8,825	22,070
HTTP Response Bytes	698	1,377	6,809	16,994
Total HTTP Bytes	1,576	3,138	15,634	39,064

Table 3 shows the amount of data (in bytes) sent and received through each TLS connection for varying numbers of HTTP requests.

Benchmark Results Analysis

In general, we find that the major trade off between BIKE and SIKE is data usage versus processing time, with BIKE needing to send more bytes but requiring less time processing them, and SIKE making the opposite trade off of needing to send fewer bytes but requiring more time processing them. At the time of integration for our benchmarks, an x86-64 assembly optimized implementation of BIKE1-CCA-L1-R2 was not available in s2n.

Our results show that when only a single HTTP request is sent, completing a BIKE1-CCA-L1-R2 hybrid TLS 1.2 handshake takes approximately 84% more time compared to a non-hybrid TLS connection, and completing an x86-64 assembly optimized SIKE-P434-R2 hybrid TLS 1.2 handshake takes approximately 86% more time than non-hybrid. However, at 25 HTTP Requests per TLS connection, when using the fastest available implementation for both BIKE and SIKE, the increased TLS Handshake latency is amortized, and only 7% more total time is needed for both BIKE and SIKE compared to a classical TLS connection.
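The percentages above follow directly from Table 1. The short script below is a sketch that recomputes them for every request count; the dictionary keys and function names are my own, and the numbers are copied from the table.

```python
# Median latencies in milliseconds, copied from Table 1.
# Columns correspond to 1, 2, 10, and 25 HTTP requests per connection.
LATENCY_MS = {
    "ECDHE Only": [10.8, 15.1, 52.6, 124.2],
    "ECDHE + BIKE1-CCA-L1-R2 (C)": [19.9, 24.4, 61.4, 133.2],
    "ECDHE + SIKE-P434-R2 (x86-64 assembly)": [20.1, 24.5, 62.0, 133.3],
}


def overhead_percent(hybrid_ms: float, classical_ms: float) -> float:
    """Extra time of a hybrid connection relative to the classical one."""
    return (hybrid_ms / classical_ms - 1.0) * 100.0


def overhead_table() -> dict:
    """Relative overhead of each hybrid algorithm at each request count."""
    baseline = LATENCY_MS["ECDHE Only"]
    return {
        name: [overhead_percent(h, c) for h, c in zip(values, baseline)]
        for name, values in LATENCY_MS.items()
        if name != "ECDHE Only"
    }


for name, row in overhead_table().items():
    print(name, [f"{p:.0f}%" for p in row])
```

The absolute gap stays at roughly 9 ms in every column, so the relative overhead shrinks only because the total connection time grows with the number of requests.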

Our results also show that BIKE1-CCA-L1-R2 hybrid TLS Handshakes used 5900 more bytes than a classical TLS Handshake, while SIKE-P434-R2 hybrid TLS Handshakes used 678 more bytes than classical TLS.
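The byte differences can be checked against Table 2 in the same way. This is a small sketch with names of my own choosing; the per-message sizes are copied from the table.

```python
# Handshake message sizes in bytes, copied from Table 2.
HANDSHAKE_BYTES = {
    "ECDHE Only": {
        "Client Hello": 218, "Server Key Exchange": 338,
        "Client Key Exchange": 75, "Other": 2430,
    },
    "ECDHE + BIKE1-CCA-L1-R2": {
        "Client Hello": 220, "Server Key Exchange": 3288,
        "Client Key Exchange": 3023, "Other": 2430,
    },
    "ECDHE + SIKE-P434-R2": {
        "Client Hello": 214, "Server Key Exchange": 672,
        "Client Key Exchange": 423, "Other": 2430,
    },
}


def handshake_total(name: str) -> int:
    """Total bytes exchanged during the TLS handshake."""
    return sum(HANDSHAKE_BYTES[name].values())


def extra_bytes(name: str) -> int:
    """Additional handshake bytes compared with the classical handshake."""
    return handshake_total(name) - handshake_total("ECDHE Only")
```

Nearly all of the extra data sits in the Server Key Exchange and Client Key Exchange messages, which carry the post-quantum public key and ciphertext.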

In the AWS EC2 network, on modern x86-64 CPUs with the fastest available algorithm implementations, BIKE and SIKE performed similarly: their latencies differed by at most 0.6 milliseconds, with BIKE the faster of the two in every benchmark. However, when compared to SIKE’s C implementation, which would be used on hosts without the ADX and MULX x86-64 instructions required by SIKE’s assembly implementation, BIKE performed significantly better, with a maximum improvement of 157.7 milliseconds over SIKE (at 10 HTTP requests).

Hybrid Post-Quantum TLS Benchmark Details and Methodology

Hybrid Post-Quantum TLS Client

Figure 3: Architecture diagram of the AWS SDK Java Client using Java Native Interface (JNI) to communicate with the native AWS Common Runtime (CRT)

Our post-quantum TLS Client uses the aws-crt-dev-preview branch of the AWS SDK Java v2 Client, which has Java Native Interface bindings to the AWS Common Runtime (AWS CRT) written in C. The AWS Common Runtime uses s2n for TLS negotiation on Linux platforms.

Our client was a single EC2 i3en.6xlarge host running v0.5.1 of the AWS Common Runtime (AWS CRT) Java Bindings with commit f3abfaba of s2n. It used the x86-64 Assembly implementation for all SIKE-P434-R2 benchmarks.

Hybrid Post-Quantum TLS Server

Our server was a single EC2 i3en.12xlarge host running a RESTful HTTP web service which used s2n to terminate TLS connections. To measure the latency of the SIKE-P434-R2 C implementation on these hosts, we used an s2n compile-time flag to build a second version of s2n with SIKE’s x86-64 assembly optimization disabled, and reran our benchmarks with that version.

We chose i3en.12xlarge as our host type because it is optimized for high IO usage, provides high levels of network bandwidth, has a high number of vCPUs, as is typical for many web service endpoints, and has a modern x86-64 CPU with the ADX and MULX instructions necessary to use the high performance Round 2 SIKE x86-64 assembly implementation. Additional TLS Handshake benchmarks performed on other modern types of EC2 hosts, such as the C5 and M5 families of EC2 instances, showed latency results similar to those generated on the i3en family of EC2 instances.

Post-Quantum Algorithm Implementation Details

The implementations of the post-quantum algorithms used in these benchmarks can be found in the pq-crypto directory of the s2n GitHub Repository. Our Round 2 BIKE implementation uses portable optimized C code, and our Round 2 SIKE implementation uses an optimized implementation in x86-64 assembly when available, and falls back to a portable optimized C implementation otherwise.
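To check whether a Linux host would qualify for the assembly path, you can look for the relevant CPU feature flags. The sketch below is illustrative only and does not reproduce s2n's own detection logic, which this post does not describe. It assumes the flag names that Linux reports in /proc/cpuinfo: "adx" for ADX and "bmi2" for the BMI2 extension that contains MULX. The function names are hypothetical.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line, for example on a non-x86 host


def pick_sike_implementation(flags) -> str:
    """Choose the assembly path only when both ADX and MULX are available."""
    # MULX is part of the BMI2 extension, which Linux reports as "bmi2".
    required = {"adx", "bmi2"}
    return "x86-64 assembly" if required <= set(flags) else "portable C"
```

On a Linux host, calling pick_sike_implementation(cpu_flags(open("/proc/cpuinfo").read())) reports which of the two SIKE latency rows in Table 1 is the more relevant comparison for that machine.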

Key Exchange Algorithm	s2n Client Cipher Preference	s2n Server Cipher Preference	Negotiated Cipher
ECDHE Only	ELBSecurityPolicy-TLS-1-1-2017-01	KMS-PQ-TLS-1-0-2020-02	ECDHE-RSA-AES256-GCM-SHA384
ECDHE + BIKE1‑CCA‑L1‑R2	KMS-PQ-TLS-1-0-2020-02	KMS-PQ-TLS-1-0-2020-02	ECDHE-BIKE-RSA-AES256-GCM-SHA384
ECDHE + SIKE‑P434‑R2	PQ-SIKE-TEST-TLS-1-0-2020-02	KMS-PQ-TLS-1-0-2020-02	ECDHE-SIKE-RSA-AES256-GCM-SHA384

Table 4 shows the client’s and server’s TLS cipher preference name used to negotiate each Key Exchange Algorithm.

Hybrid Post-Quantum TLS Benchmark Methodology

Figure 4: Benchmarking Methodology Client/Server Architecture Diagram

Our benchmarks were run with a single client host connecting to a single host running an HTTP web service in a different availability zone within the same AWS Region (us-east-1), through a TCP Load Balancer.

We chose to include varying numbers of HTTP requests in our latency benchmarks, rather than TLS Handshakes alone, because customers are unlikely to establish a secure TLS connection and let the connection sit idle performing no work. Customers use TLS connections in order to send and receive data securely, and HTTP web services are one of the most common types of data being secured by TLS. We also chose to place our EC2 server behind a TCP Load Balancer to more closely approximate how an HTTP web service would be deployed in a typical setup.

Latency was measured at the client in Java, starting from before a TCP connection was established until after the final HTTP Response was received, and includes all network transfer time. All connections used RSA Certificate Authentication with a 2048-bit key, and ECDHE Key Exchange used the secp256r1 curve. All latency values listed in Table 1 above were calculated from the median value (50th percentile) from 60 minutes of continuous single-threaded measurements between the EC2 Client and Server.
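The measurement loop has a simple shape, sketched below in Python. This is not the benchmark code we ran, which was written in Java against the AWS CRT; it uses only the Python standard library, so it negotiates classical TLS only. The function names, the timeout, and the sample-count parameter are assumptions made for the example.

```python
import http.client
import statistics
import time


def timed_connection(host: str, path: str, n_requests: int) -> float:
    """Time one TCP connect + TLS handshake + n_requests HTTP requests, in ms."""
    start = time.perf_counter()  # the clock starts before any connection exists
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        for _ in range(n_requests):
            # The first request opens the TCP connection and runs the handshake.
            # If the server closes the connection between requests, http.client
            # reconnects silently, which would add a second handshake.
            conn.request("GET", path)
            conn.getresponse().read()  # drain the body so the connection is reusable
    finally:
        conn.close()
    return (time.perf_counter() - start) * 1000.0


def median_latency_ms(measure, samples: int) -> float:
    """Run measure() sequentially (single-threaded) and return the median."""
    return statistics.median(measure() for _ in range(samples))
```

For example, median_latency_ms(lambda: timed_connection("example.com", "/", 10), 100) would report the median over 100 sequential connections of 10 requests each. The median is used rather than the mean so that a few slow outliers do not skew the result.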

More Info

If you’re interested in learning more about post-quantum cryptography, check out the following links:

Conclusion

In this blog post, I gave a refresher on hybrid post-quantum TLS, went over our hybrid post-quantum TLS 1.2 benchmark results, and described our hybrid post-quantum benchmarking methodology. Our benchmark results found that BIKE and SIKE performed similarly when using s2n’s fastest available implementation on modern CPUs, but that BIKE performed better than SIKE when both were using their generic C implementation.
