How AWS KMS and AWS Encryption SDK overcome symmetric encryption bounds

If you run high-scale applications that encrypt large volumes of data, you might be concerned about tracking encryption limits and rotating keys. This post explains how AWS Key Management Service (AWS KMS) and the AWS Encryption SDK automatically handle the encryption limits, or bounds, of the Advanced Encryption Standard in Galois/Counter Mode (AES-GCM) by using derived key methods, so you don’t have to. These methods generate a new derived key K_d from the main key K by using a random nonce. That way, each encryption uses a unique key, and K can be used for much longer. Similar derived key modes have recently been proposed in schemes such as (KC-)XAES, DNDK v2, and ia.cr/2020/1153.

Symmetric encryption bounds

Symmetric encryption algorithms encrypt large amounts of data in transit and at rest. Modern ciphers also authenticate data by using an authentication tag; these are called Authenticated Encryption with Associated Data (AEAD) ciphers. Examples of AEAD ciphers include AES-GCM and ChaCha20/Poly1305.

AES-GCM is the most widely used encryption algorithm and was standardized by NIST in SP 800-38D. AES-GCM uses a 128- or 256-bit key K and a (usually 96-bit) initialization vector (IV) to encrypt and authenticate a plaintext P. It also authenticates additional authenticated data (AAD). The output is a ciphertext C and an authentication tag T:
(C, T) = AES-GCM(K, IV, AAD, P)

At decryption, the recipient decrypts C and verifies the tag T by using K, IV and AAD and produces the original plaintext P (assuming the tag was authenticated successfully).

Encryption invocation limits

When encrypting data, it’s critical that the (K, IV) tuple does not repeat for the life of the key K. Otherwise, the security properties of AES-GCM are lost. SP 800-38D requires an implementation to keep the probability of key and IV reuse below one in 4.29 billion (<2^-32). This can be achieved by using a deterministic IV that doesn’t repeat or a random IV. If a random IV is used, then it is necessary to rekey after 2³² encryptions. For example, common protocols like TLS or IKEv2/IPsec prevent (K, IV) collisions by using deterministic IVs per connection (that is, IVs that start from a random value and increment).
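As a rough illustration of how the number of encryptions relates to the 2^-32 target, the birthday approximation p ≈ n²/2^(b+1) estimates the chance that n random b-bit IVs contain a repeat. The sketch below is a back-of-the-envelope estimate, not the exact analysis behind SP 800-38D, and the function name is illustrative.

```python
def log2_iv_collision_probability(log2_encryptions: float, iv_bits: int = 96) -> float:
    """Approximate log2 of the probability that random IVs repeat under one key.

    Uses the birthday approximation p ~= n^2 / 2^(iv_bits + 1), which is
    only meaningful while p is small.
    """
    return 2 * log2_encryptions - (iv_bits + 1)


# 2^32 encryptions with random 96-bit IVs: roughly 2^-33, within the 2^-32 target.
print(log2_iv_collision_probability(32))  # -33
# 2^33 encryptions: roughly 2^-31, which exceeds the target.
print(log2_iv_collision_probability(33))  # -31
```

Under this approximation, doubling the number of encryptions raises the collision probability by a factor of four, which is why the budget per key is exhausted quickly at high request rates.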

Data bounds

Assuming the probability of a (K, IV) collision is statistically insignificant (<2^-32), there are still data bounds when encrypting large amounts of plaintext with the same key K. The block counter in AES-GCM is 32 bits, which leads to a limit of 2³²-2 blocks of 16 bytes each, or 2³⁶-32 bytes (68.72 GB), per encryption operation (that is, per (K, IV) pair). Additionally, failing to restrict the total amount of data weakens the guarantee that an adversary cannot distinguish between two different plaintexts, that is, cannot tell which of two messages is encrypted in a ciphertext. The stronger the indistinguishability protection you require, the lower the total number of bytes you can encrypt. NIST’s specification, SP 800-38D, suggests a limit of 2⁶⁸ bytes protected under a single key K, which provides an indistinguishability probability of 50%. More conservative security margins are sometimes used, based on different analyses (ia.cr/2024/051, 10.1145/3243734.3243816). AWS sets a more conservative margin too, enforcing a negligible indistinguishability probability (<2^-32) by default.
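The per-operation limit follows from simple arithmetic: a 32-bit block counter leaves 2³²-2 usable blocks for plaintext, and each AES block is 16 bytes. A short check of the 68.72 GB figure (the constant names are illustrative):

```python
AES_BLOCK_BYTES = 16        # AES block size in bytes
MAX_BLOCKS = 2**32 - 2      # plaintext blocks available with a 32-bit block counter

max_bytes_per_invocation = MAX_BLOCKS * AES_BLOCK_BYTES

print(max_bytes_per_invocation)                   # 68719476704
print(max_bytes_per_invocation == 2**36 - 32)     # True
print(round(max_bytes_per_invocation / 1e9, 2))   # 68.72 (decimal gigabytes)
```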

Once you reach the AES-GCM data bounds for a given security margin, you need to rotate the symmetric key. Such limits (for example, 2³² encryptions per key with random IVs, or the maximum total data per key) could be reached in modern, high-scale encryption use cases. Tracking these limits across distributed systems with many concurrent sessions adds operational complexity. We shared the challenges of using AES-GCM at the scale of AWS in a writeup and a presentation at NIST’s Third Workshop on Block Cipher Modes of Operation in 2023.

How AWS KMS uses derived keys

AWS KMS is a managed service that you can use to create and control the keys used to encrypt and sign data. The AWS KMS Encrypt API supports symmetric and asymmetric encryption. For symmetric key encryption, AWS KMS uses AES-GCM with 256-bit keys to encrypt a plaintext up to 4 KB in size. The AWS KMS request includes the plaintext, and the symmetric key identifier (KeyId) of the symmetric customer managed key (CMK) stored in KMS.

A symmetric key Encrypt API call to AWS KMS uses the CMK to derive a symmetric encryption key before encrypting the plaintext. AWS KMS generates a random 128-bit nonce N and produces a 256-bit symmetric key from the main key K identified by the KeyId by using a key derivation function (KDF). A KDF takes in a key, a label and context, an invocation-specific nonce N, and an output length L_Km in bytes, and produces key material of that length as K_mat = KDF(K, <label>, <context>, N, L_Km). <label> is usually an application- or invocation-specific value. <context> includes invocation-specific input. For AWS KMS, the KDF is a NIST SP 800-108r1 Counter Mode KDF producing 256 bits of keying material with HMAC-SHA256 as the pseudorandom function. K_d is essentially produced with one call to HMAC-SHA256 with key K as:
K_d = HMAC-SHA256(K, <ctx>),
where <ctx> consists of a counter value concatenated with constants and N.
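To make the derivation concrete, here is a minimal sketch of a counter-mode KDF in the style of NIST SP 800-108r1 with HMAC-SHA256 as the pseudorandom function. It is not the AWS KMS implementation: the label value, the field order, and the 32-bit encodings are assumptions chosen for illustration, and the main key is a locally generated stand-in.

```python
import hashlib
import hmac
import secrets


def derive_key_sp800_108_ctr(main_key: bytes, label: bytes, context: bytes,
                             out_len: int = 32) -> bytes:
    """Counter-mode KDF sketch with HMAC-SHA256 as the PRF.

    Each block is HMAC(K, [i] || label || 0x00 || context || [L]), where [i] is
    a 32-bit big-endian counter and [L] is the output length in bits.
    """
    length_bits = (out_len * 8).to_bytes(4, "big")
    output = b""
    counter = 1
    while len(output) < out_len:
        block_input = counter.to_bytes(4, "big") + label + b"\x00" + context + length_bits
        output += hmac.new(main_key, block_input, hashlib.sha256).digest()
        counter += 1
    return output[:out_len]


main_key = secrets.token_bytes(32)   # stand-in for the main key K
nonce = secrets.token_bytes(16)      # 128-bit random nonce N
LABEL = b"example-label"             # hypothetical constant, not the value used by AWS KMS

derived_key = derive_key_sp800_108_ctr(main_key, LABEL, nonce)
print(len(derived_key))  # 32
```

Because HMAC-SHA256 outputs 256 bits, a 256-bit derived key needs only the first counter block, which matches the description of K_d as the result of a single HMAC-SHA256 call.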

Subsequently, AWS KMS generates a 96-bit random IV and encrypts the input plaintext P with AES-GCM as (C, T) = AES-GCM(K_d, IV, AAD, P).

AWS KMS returns a CiphertextBlob that includes the IV, nonce N, ciphertext and tag (C,T) so that the CiphertextBlob can be decrypted on subsequent calls to the Decrypt API.

Intuitively, the 128-bit random nonce used to derive a per-encryption key under a CMK ensures that a caller can far exceed the 2³² limit on the number of encryptions they can make under the CMK. Furthermore, the limit of 4 KB on the payload size for an AWS KMS Encrypt call ensures the total amount of data encrypted under an encryption key stays well below NIST or other more conservative total encryption bounds. For more details and the mathematics of the security underpinnings of this scheme, see Key Management Systems at the Cloud Scale.
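A simplified way to see the intuition: a (K_d, IV) pair can only repeat if both the 128-bit nonce and the 96-bit IV repeat, so the birthday estimate applies to 224 random bits rather than 96. The sketch below makes that simplifying assumption and ignores everything else the formal analysis in the referenced paper accounts for; the function name is illustrative.

```python
def log2_repeat_probability(log2_encryptions: float, random_bits: int) -> float:
    """Birthday approximation: log2 of n^2 / 2^(random_bits + 1)."""
    return 2 * log2_encryptions - (random_bits + 1)


NONCE_BITS = 128   # random nonce N used to derive K_d
IV_BITS = 96       # random AES-GCM IV

for log2_n in (32, 48, 64):
    iv_only = log2_repeat_probability(log2_n, IV_BITS)
    nonce_and_iv = log2_repeat_probability(log2_n, NONCE_BITS + IV_BITS)
    print(f"2^{log2_n} encryptions: IV only 2^{iv_only}, nonce and IV 2^{nonce_and_iv}")
```

With a random IV alone, the estimate reaches about 2^-33 at 2³² encryptions. With the nonce and IV combined, it stays near 2^-97 even at 2⁶⁴ encryptions.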

How AWS Encryption SDK applies derived key modes per invocation

The AWS Encryption SDK is a client-side encryption library used for encrypting and decrypting data. It can be configured to use data key caching to reduce API calls when encrypting multiple payloads. Using a nonce-based derived key for each AES-GCM encryption invocation eliminates the need for customers to keep track of the total amount of data they encrypt under a single data key.

Although the AWS Encryption SDK provides a lot of flexibility to accommodate many encryption scenarios, the default configuration handles key derivation and frame sizing automatically, so you don’t need to tune these settings for most use cases. To derive a different key per invocation, like AWS KMS, it uses a randomly generated value, N, the main key K, and some invocation-specific context in the KDF. N is 256 bits in the default configuration. The underlying KDF is the HMAC-based Extract-and-Expand Key Derivation Function (HKDF) with SHA512 as the default hash. K_d is essentially produced with one HKDF call with key K as:
K_d = HKDF(K, salt=<lbl>, info=<ctx>, 32),
where <lbl> is a constant and <ctx> consists of constants concatenated with a random 256-bit value in the default configuration.
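For readers who want to see what the HKDF call involves, here is a minimal standard-library sketch of HKDF as specified in RFC 5869, with SHA-512 as the default hash. The label, the context prefix, and the variable names are placeholders for illustration and are not the constants that the AWS Encryption SDK uses.

```python
import hashlib
import hmac
import secrets


def hkdf(key: bytes, salt: bytes, info: bytes, length: int,
         hash_name: str = "sha512") -> bytes:
    """HKDF (RFC 5869): extract a pseudorandom key, then expand it to `length` bytes."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    # Extract: PRK = HMAC(salt, key); an empty salt defaults to hash_len zero bytes.
    prk = hmac.new(salt or b"\x00" * hash_len, key, hash_name).digest()
    # Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated.
    okm, block = b"", b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


data_key = secrets.token_bytes(32)       # stand-in for the main key K
random_value = secrets.token_bytes(32)   # 256-bit random value N
LABEL = b"example-label"                 # hypothetical constant <lbl>
CONTEXT_PREFIX = b"example-constants"    # hypothetical constants in <ctx>

derived_key = hkdf(data_key, LABEL, CONTEXT_PREFIX + random_value, 32)
print(len(derived_key))  # 32
```

SHA-512 produces 64 bytes per block, so a 32-byte derived key comes from a single expand step.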

Subsequently, the AWS Encryption SDK uses the derived key K_d to encrypt user content, broken into 4-KB frames by default. Each frame plaintext P_f is encrypted with AES-GCM with a deterministic IV as (C, T) = AES-GCM(K_d, IV, AAD, P_f).

The 96-bit deterministic IV consists of the frame counter frameID, where frameID<2³². The additional authenticated data AAD is specific to the Encryption SDK data frame. At decryption, the recipient derives K_d from K in the same way, decrypts the ciphertext C to produce the frame plaintext P_f, and validates the authentication tag T.
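A deterministic IV built from a frame counter can be sketched as follows. The big-endian, zero-padded byte layout and the function name are assumptions for illustration; consult the AWS Encryption SDK message format reference for the exact encoding.

```python
MAX_FRAMES = 2**32   # frameID must stay below 2^32
IV_BYTES = 12        # 96-bit IV


def frame_iv(frame_id: int) -> bytes:
    """Encode a frame counter as a 96-bit IV (layout assumed for illustration)."""
    if not 0 <= frame_id < MAX_FRAMES:
        raise ValueError("frame counter is outside the allowed range")
    return frame_id.to_bytes(IV_BYTES, "big")


print(frame_iv(1).hex())  # 000000000000000000000001
print(frame_iv(2).hex())  # 000000000000000000000002
```

Because each frame under a given derived key has a distinct counter, no two frames share a (K_d, IV) pair, and no randomness is needed for the IV.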

The 4 KB frame size ensures that by default no more than 2⁴⁴ bytes (2³² frames of 4 KB each) of data can be encrypted under a single encryption key. This is well below the NIST suggested bound (2⁶⁸ bytes), even with data key caching. It is also well below the amount of data allowed by our conservative requirement of <2^-32 indistinguishability probability. The limit of invocations per key, even with data key caching, exceeds the encryption counts in most high-scale applications.

Note: While the AWS Encryption SDK makes conservative choices in its default configuration, if you’re using legacy version 1.0 or making configuration changes, you might have lower security guarantees. For example, a custom maximized frame size of 2³²-1 bytes would lead to a larger total plaintext size, which is still below the NIST suggested limit of 2⁶⁸ bytes but not below other, more conservative bounds.
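The byte totals for the default and the maximized frame size can be checked with a few lines of arithmetic. The constant and function names are illustrative, and the 2⁶⁸ figure is the NIST suggested bound cited earlier in this post.

```python
import math

MAX_FRAMES = 2**32              # frame counter limit per derived key
NIST_SUGGESTED_BOUND = 2**68    # bytes per key, as cited from SP 800-38D


def max_bytes_per_key(frame_bytes: int) -> int:
    """Upper limit on plaintext bytes encrypted under one derived key."""
    return MAX_FRAMES * frame_bytes


default_total = max_bytes_per_key(4 * 1024)    # default 4 KB frames
largest_total = max_bytes_per_key(2**32 - 1)   # maximized custom frame size

print(math.log2(default_total))                # 44.0
print(math.log2(largest_total))                # just under 64
print(default_total < NIST_SUGGESTED_BOUND)    # True
print(largest_total < NIST_SUGGESTED_BOUND)    # True
```

The maximized frame size raises the per-key total from 2⁴⁴ bytes to almost 2⁶⁴ bytes, a factor of roughly a million, which is why it can fall outside more conservative bounds while remaining under 2⁶⁸.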

Note that the default AWS Encryption SDK configuration also provides lesser-known security properties, like key commitment. The commitment string is produced similarly to the derived key, by using K and HKDF.

Conclusion

By deriving a unique key per encryption call, AWS KMS and the AWS Encryption SDK eliminate the need to manually track AES-GCM limits.

For the academic basis for AES-GCM’s bounds, see SP 800-38D and draft-irtf-cfrg-aead-limits. To read more on the cryptographic analysis of the key derivation scheme used in KMS, see Key Management Systems at the Cloud Scale. For more details on the Encryption SDK AES-GCM key derivation, see the AWS Encryption SDK algorithms reference.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

AWS Security Blog
