AWS Public Sector Blog

How to build an Aadhaar Data Vault on AWS

An Aadhaar number is a 12-digit unique identification number issued by the Unique Identification Authority of India (UIDAI) to every individual in India. An Aadhaar number can be used to access various government subsidies and acts as a vital proof of identity and proof of address for opening a fixed deposit account, applying for a passport, investing in mutual funds, and more. For example, the Government of India requires linking a permanent account number (PAN) with an Aadhaar number to file an income tax return.

Considering the sensitivity of the Aadhaar number and the potential implications of having one’s Aadhaar number compromised, UIDAI mandated that all Aadhaar numbers and Aadhaar-related data be encrypted and stored separately in a secure, access-controlled data repository known as an Aadhaar Data Vault. Encryption is the process of using an algorithm to transform plaintext into ciphertext. The algorithm and the corresponding encryption key are required to decrypt the ciphertext back into the original plaintext. UIDAI also mandated that the hardware security module (HSM) used to store the encryption keys for the Aadhaar Data Vault cannot be shared with any other agency or legal entity. An HSM is a physical computing device that safeguards and manages digital keys and performs encryption and decryption for digital signatures, strong authentication, and other cryptographic functions.

This blog post explains how government and private entities that collect, process, and store Aadhaar data for various use cases can use AWS CloudHSM from Amazon Web Services (AWS) to create a secure Aadhaar data storage solution that can meet guidelines provided by UIDAI.

Note: This is only a high-level architecture with the recommendation to segregate the Aadhaar Data Vault and consumer applications in separate AWS accounts. Customers are advised to review the complete security requirements and implement appropriate controls as per the security guidelines.

What is an Aadhaar Data Vault?

An Aadhaar Data Vault is a secure, access-controlled centralized storage repository for all the Aadhaar numbers collected by requesting entities, like an Authentication User Agency (AUA), Know-Your-Customer User Agency (KUA), or any other agency for specific purposes under the Aadhaar Act and Regulations published in 2016. It is a secure system inside the respective agency’s infrastructure accessible only on a need-to-know basis. As mandated by UIDAI, an organization must store Aadhaar numbers in an encrypted database within an Aadhaar Data Vault.

How to set up an Aadhaar Data Vault using AWS CloudHSM

Prerequisites:

  • Basic understanding of the following AWS services: Amazon Elastic Compute Cloud (Amazon EC2), Amazon Virtual Private Cloud (Amazon VPC), AWS CloudFormation, AWS CloudHSM, AWS Key Management Service (AWS KMS), Amazon API Gateway, AWS Identity and Access Management (IAM), Amazon Simple Storage Service (Amazon S3), and Amazon Relational Database Service (Amazon RDS).

  • Two separate AWS accounts with administrator access for each.

AWS architecture for implementing an Aadhaar Data Vault using CloudHSM

Now that we’ve covered the basics of Aadhaar Data Vault and prerequisites, let’s examine the architecture for the solution on AWS.

Figure 1. Architecture diagram of the solution described in this blog. The major components are an AWS CloudHSM cluster, Amazon RDS instance, Amazon EC2 instance, Amazon API Gateway, and AWS KMS.

AWS CloudHSM is a cloud-based HSM that enables you to generate and use your own encryption keys on the AWS Cloud. With CloudHSM, you can manage your own encryption keys using HSMs validated to Federal Information Processing Standard (FIPS) Publication 140-2 Level 3. FIPS 140-2 is a U.S. government security standard that specifies security requirements for cryptographic modules that protect sensitive information.

This solution makes use of the CloudHSM managed service to create a cluster of two nodes, and enables the creation and maintenance of the encryption keys via the AWS KMS custom key store. Users can use the keys to encrypt the Aadhaar data hosted in Amazon RDS-based databases.

Create a VPC for hosting the Aadhaar Data Vault

1. Log in to the AWS account (‘AUA AWS Account A’ in the architecture diagram) via the AWS Management Console as an administrator.

2. Create a VPC with public subnets, two private subnets, one NAT gateway per Availability Zone (AZ), and an Amazon S3 gateway endpoint.

This will create a VPC in your AWS account for hosting the Aadhaar Data Vault infrastructure.
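
As a rough illustration of the subnet layout in step 2, the following Python sketch carves one public and one private subnet per Availability Zone out of a VPC CIDR block using only the standard library. The 10.0.0.0/16 block, the /24 subnet size, the Availability Zone names, and the function name are assumptions chosen for illustration; they are not prescribed by UIDAI or by this solution.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, az_names: list[str], new_prefix: int = 24) -> dict:
    """Return one public and one private subnet CIDR per Availability Zone."""
    network = ipaddress.ip_network(vpc_cidr)
    # subnets() yields consecutive, non-overlapping blocks of the requested size.
    blocks = network.subnets(new_prefix=new_prefix)
    plan = {}
    for az in az_names:
        plan[az] = {"public": str(next(blocks)), "private": str(next(blocks))}
    return plan


if __name__ == "__main__":
    # Hypothetical inputs: a /16 VPC spread across two AZs in the Mumbai Region.
    for az, subnets in plan_subnets("10.0.0.0/16", ["ap-south-1a", "ap-south-1b"]).items():
        print(az, subnets)
```

Planning the address ranges up front also makes it easier to keep the vault VPC and the consumer VPC from overlapping later on.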

Create HSM cluster with AWS KMS custom key store

1. Navigate to the GitHub repo for the Automated deployment of AWS CloudHSM resources using AWS CloudFormation template.

2. Choose the YAML template file, then choose Raw. Open the context (right-click) menu and choose Save as. Save the file on your local machine as “cloudhsm.yaml”.

3. Open the AWS console, and create a CloudFormation stack using the saved file.

a. Choose the VPC created in the earlier step.

b. Choose a private subnet for the Amazon EC2 client instance configuration.

Figure 2. CloudFormation stack is successfully created. This creates AWS CloudHSM cluster connected to a custom key store.

4. Once the stack creation process is completed, you can check the HSM cluster by opening the AWS CloudHSM service page in the AWS console.

Figure 3. CloudHSM cluster configuration

5. You can check the custom key store by opening the AWS KMS page in the AWS console. Check the HSM cluster ID; the status should say CONNECTED.

Figure 4. AWS KMS custom key store connected with CloudHSM cluster.

6. Open the Amazon EC2 console. You should see the CloudHSM management instance running.

Figure 5. CloudHSM management instance running in private subnet.
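
The console check in step 5 can also be scripted. The sketch below is one possible approach, assuming the boto3 SDK is installed and credentials for the vault account are configured; it inspects the response of the AWS KMS `describe_custom_key_stores` call and reports whether the key store backed by a given cluster is in the CONNECTED state. The cluster and key store IDs are made-up placeholders, and the helper names are hypothetical.

```python
def is_key_store_connected(response: dict, cluster_id: str) -> bool:
    """Return True if the key store backed by cluster_id reports CONNECTED."""
    for store in response.get("CustomKeyStores", []):
        if store.get("CloudHsmClusterId") == cluster_id:
            return store.get("ConnectionState") == "CONNECTED"
    return False  # no key store is associated with this cluster


def fetch_custom_key_stores() -> dict:
    """Call AWS KMS; requires boto3 and valid credentials (not run here)."""
    import boto3  # imported lazily so the helper above works without the SDK

    return boto3.client("kms").describe_custom_key_stores()


if __name__ == "__main__":
    # Placeholder response shaped like the KMS API output, for illustration only.
    sample = {
        "CustomKeyStores": [
            {
                "CustomKeyStoreId": "cks-1234567890abcdef0",
                "CloudHsmClusterId": "cluster-example1234",
                "ConnectionState": "CONNECTED",
            }
        ]
    }
    print(is_key_store_connected(sample, "cluster-example1234"))
```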

Change the crypto officer password

As a security best practice, you should change the crypto officer (CO) password immediately after the stack is created.

1. Open the AWS Secrets Manager page in the AWS console.

2. Choose the secret that holds the initial crypto officer password for the HSM cluster.

3. Choose Retrieve secret value and copy the password.

4. Open the Amazon EC2 console and connect to the Amazon EC2 instance using AWS Systems Manager Session Manager.

5. In the terminal window, run the following command to start the CloudHSM Management Utility (CMU):

/opt/cloudhsm/bin/cloudhsm_mgmt_util /opt/cloudhsm/etc/cloudhsm_mgmt_util.cfg

The CMU connects to both nodes of the HSM cluster identified by two separate IP addresses. This enables changes to be propagated simultaneously across the two nodes.

Figure 6. The CMU connects to both HSM nodes using private IP addresses.

6. Run the following command to list the users on each HSM node:

listUsers

Figure 7. Users on each HSM node are listed.

7. Run the following command to log in as the crypto officer (CO) with the password retrieved from AWS Secrets Manager:

loginHSM CO admin -hpswd

Enter the password at the prompt and press Enter.

Figure 8. Login using existing CO user password is successful.

8. Run the following command to change the password of the CO user:

changePswd CO admin -hpswd

Enter the new password twice at the prompt and press Enter.

Figure 9. CO user password change is successful on both HSM nodes.

9. To disconnect from the CloudHSM Management Utility, type quit and press Enter.

Figure 10. Successful disconnection from CMU.

It is strongly recommended that you store the new password in your standard enterprise password vault. At this stage, you can optionally delete the secret from AWS Secrets Manager given that the initial password is no longer needed for operation of the cluster. Since you have created a custom key store, AWS KMS has already changed the initial password for the ‘kmsuser’ across the HSMs. No action is needed from the user.

Create an encryption key using custom key store

1. Open the Key Management Service page in the AWS Console and choose Customer-managed keys.

2. Choose Create key. For Key Type, choose Symmetric and for Key Usage, choose Encrypt and decrypt.

3. Expand Advanced options and, for Key material origin, choose Custom key store (CloudHSM). Choose Next.

4. Choose the custom key store created in the earlier step and choose Next.

5. Enter a key alias and choose Next.

6. Choose the IAM users and roles who can administer this key through the KMS API. Choose Next.

7. Select the IAM users and roles that can use the KMS key in cryptographic operations. Choose Next.

8. Review and choose Finish. You should see a success notification.

Figure 11. AWS KMS key generated using custom key store and CloudHSM cluster.
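
For teams that prefer to script this step, the following sketch shows roughly equivalent AWS KMS API calls, assuming the boto3 SDK and credentials for the vault account. `Origin="AWS_CLOUDHSM"` together with `CustomKeyStoreId` asks AWS KMS to generate the key material in the CloudHSM cluster behind the custom key store. The key store ID, alias, description, and function names are placeholders for illustration, and the key administrator and key user selections from steps 6 and 7 (the key policy) are not covered here.

```python
def build_create_key_params(custom_key_store_id: str, description: str) -> dict:
    """Parameters for kms.create_key that mirror console steps 2 to 4."""
    return {
        "Description": description,
        "KeySpec": "SYMMETRIC_DEFAULT",    # symmetric key
        "KeyUsage": "ENCRYPT_DECRYPT",     # encrypt and decrypt
        "Origin": "AWS_CLOUDHSM",          # key material lives in the HSM cluster
        "CustomKeyStoreId": custom_key_store_id,
    }


def create_vault_key(custom_key_store_id: str, alias: str) -> str:
    """Create the key and attach an alias; requires boto3 and credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    kms = boto3.client("kms")
    params = build_create_key_params(custom_key_store_id, "Aadhaar Data Vault key")
    key_id = kms.create_key(**params)["KeyMetadata"]["KeyId"]
    kms.create_alias(AliasName=f"alias/{alias}", TargetKeyId=key_id)
    return key_id


if __name__ == "__main__":
    # Placeholder key store ID; prints the request without calling AWS.
    print(build_create_key_params("cks-1234567890abcdef0", "Aadhaar Data Vault key"))
```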

Create encrypted multi-AZ Amazon RDS instance

1. Create a database (DB) subnet group consisting of the private subnets created in the VPC at the beginning.

2. Create a DB instance:

a. Choose the VPC hosting the Aadhaar Data Vault. Choose the DB subnet group created earlier.

b. Choose Multi-AZ

c. Choose the KMS key created in the previous step for encrypting the database.
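
The choices above map onto a handful of parameters of the Amazon RDS `CreateDBInstance` API. The sketch below, which assumes the boto3 SDK, shows one way they might be expressed. The database engine, instance class, storage size, user name, and all identifiers are assumptions for illustration only; the source does not prescribe them.

```python
def build_db_instance_params(
    identifier: str,
    subnet_group: str,
    kms_key_id: str,
    security_group_ids: list[str],
    master_password: str,
) -> dict:
    """Parameters for rds.create_db_instance reflecting steps a to c."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",              # assumption: any RDS engine could be used
        "DBInstanceClass": "db.m5.large",  # assumption: size to your workload
        "AllocatedStorage": 100,           # GiB, assumption
        "MasterUsername": "vaultadmin",    # hypothetical administrator name
        "MasterUserPassword": master_password,
        "DBSubnetGroupName": subnet_group,  # private subnets only
        "VpcSecurityGroupIds": list(security_group_ids),
        "MultiAZ": True,                   # step b
        "StorageEncrypted": True,          # step c
        "KmsKeyId": kms_key_id,            # key from the custom key store
        "PubliclyAccessible": False,
    }


def create_vault_database(params: dict) -> dict:
    """Create the instance; requires boto3 and credentials (not run here)."""
    import boto3

    return boto3.client("rds").create_db_instance(**params)
```

Setting `PubliclyAccessible` to false is an extra precaution consistent with placing the database in private subnets.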

Create master data management Amazon EC2 instance

1. Open the Amazon EC2 console and create a new Amazon EC2 instance (Windows or Linux) in a private subnet of the VPC created earlier. This instance should not have a public IP address. The administrator should set up connectivity through AWS Systems Manager Session Manager, which lets you connect to the instance securely without opening RDP or SSH ports, running bastion hosts, or assigning a public IP address.

2. Configure the security groups so that the Amazon RDS database instance accepts inbound traffic from the Amazon EC2 instance on the specific port used by the database.

3. Once logged in, the administrator can install the database management system (DBMS) client software on the Amazon EC2 instance and connect using the administrator credentials configured during database creation.

4. Only the Aadhaar Vault Administrator user should log in to the management Amazon EC2 instance. The administrator credentials should be stored securely in an enterprise vault. Follow the AWS Shared Responsibility Model when implementing security and compliance controls.

Create tokenization solution to protect Aadhaar Data using reference keys

Tokenization is the process of transforming a piece of data into a random string of characters called a token or reference key. A token has no direct meaningful value in relation to the original data. Tokens serve as a reference to the original data, but cannot be used to derive that data.

The Amazon RDS database created earlier can be used to store the relationship between the sensitive value (the Aadhaar number) and the corresponding token. The real data in the vault is secured with the encryption key created in the earlier step. The token value can be used in various consumer applications as a substitute for the original Aadhaar data.

The tokenization layer can consist of client-side encryption using the AWS Encryption SDK and AWS Lambda, or other application logic that generates a unique random token (reference key) based on a universally unique identifier (UUID) scheme for each Aadhaar number stored in the Aadhaar Vault database. Only the reference keys are stored in the application databases, and their relationship with the real data is stored in the Aadhaar Data Vault. This makes sure that recovering the original Aadhaar number is not computationally feasible knowing only a reference key or a number of reference keys. The AWS Encryption SDK uses the AWS KMS encryption key to encrypt the sensitive data before it is written to the Amazon RDS database.
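
To make the reference-key idea concrete, here is a minimal sketch of UUID-based tokenization. It is not the implementation used in this solution: the in-memory dictionaries are a hypothetical stand-in for the vault table in Amazon RDS, the class and method names are invented, and the client-side encryption that would happen before anything is written to the database is deliberately left out. The sample Aadhaar number is made up.

```python
import re
import uuid

AADHAAR_PATTERN = re.compile(r"[0-9]{12}")


class InMemoryVault:
    """Hypothetical stand-in for the Aadhaar Data Vault mapping table."""

    def __init__(self) -> None:
        self._token_by_number: dict[str, str] = {}
        self._number_by_token: dict[str, str] = {}

    def tokenize(self, aadhaar_number: str) -> str:
        """Return the reference key for a number, creating one if needed."""
        if not AADHAAR_PATTERN.fullmatch(aadhaar_number):
            raise ValueError("expected exactly 12 digits")
        token = self._token_by_number.get(aadhaar_number)
        if token is None:
            # uuid4 is random, so the token reveals nothing about the number.
            token = str(uuid.uuid4())
            self._token_by_number[aadhaar_number] = token
            self._number_by_token[token] = aadhaar_number
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original number; only the vault can do this."""
        return self._number_by_token[token]


if __name__ == "__main__":
    vault = InMemoryVault()
    reference_key = vault.tokenize("123412341234")  # made-up number
    print(reference_key)
```

Because the token is random rather than derived from the Aadhaar number, a consumer application that holds only reference keys has nothing to reverse; the mapping exists solely inside the vault.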

The various consumer applications can communicate with the Aadhaar Data Vault tokenization layer via private REST APIs created using Amazon API Gateway. The APIs can only be accessed by using an interface VPC endpoint. This makes sure that the traffic between consumer applications and Amazon API Gateway does not traverse the internet. Using resource policies, you can allow or deny access to your API from selected VPCs and VPC endpoints, including across AWS accounts. Each endpoint can be used to access multiple private APIs. You can configure throttling and quotas for your APIs to help protect them from being overwhelmed by too many requests.
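
As an illustration of the resource policy mentioned above, the sketch below builds a policy document that allows invocation of a private API but denies any request that does not arrive through the listed interface VPC endpoints, using the `aws:SourceVpce` condition key. The endpoint ID and the function name are placeholders, and a production policy would likely need further tightening.

```python
import json


def build_private_api_policy(allowed_vpc_endpoint_ids: list[str]) -> dict:
    """Resource policy limiting a private API to specific VPC endpoints."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": ["execute-api:/*"],
            },
            {
                # An explicit deny overrides the allow for any other source.
                "Effect": "Deny",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": ["execute-api:/*"],
                "Condition": {
                    "StringNotEquals": {"aws:SourceVpce": list(allowed_vpc_endpoint_ids)}
                },
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder endpoint ID for the interface endpoint used by consumers.
    print(json.dumps(build_private_api_policy(["vpce-0123456789abcdef0"]), indent=2))
```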

For additional information on tokenization, see the AWS blog posts Building a serverless tokenization solution to mask sensitive data and How to use tokenization to improve data security and reduce audit scope.

For additional security, customers can make use of AWS Nitro Enclaves, an Amazon EC2 capability that allows you to create isolated compute environments from Amazon EC2 instances.

This completes creation of the ‘Aadhaar Data Vault’ infrastructure in the AWS account provisioned for the Aadhaar Vault (‘AUA AWS Account A’ in the architecture diagram).

Create consumer VPC in separate AWS account

1. Log in to the AWS account (‘AUA AWS Account B’ in the architecture diagram) as an administrator.

2. Follow the same instructions from the earlier step to create a VPC for hosting consumer applications. The classless inter-domain routing (CIDR) blocks of the two VPCs must not overlap, because VPC peering does not support overlapping address ranges. For example, if the Aadhaar Data Vault VPC CIDR is 10.0.0.0/16, you can choose 172.31.0.0/16 for the consumer VPC.

3. Once the VPC is created, you can create Amazon EC2 instances and deploy the consumer applications and databases as per the requirements.
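
Before creating the consumer VPC, the CIDR requirement from step 2 can be checked with a few lines of standard-library Python. This is a small sketch; the function name is hypothetical and the CIDR values are the examples from the text.

```python
import ipaddress


def cidrs_overlap(first_cidr: str, second_cidr: str) -> bool:
    """Return True if the two CIDR blocks share any addresses."""
    first = ipaddress.ip_network(first_cidr)
    second = ipaddress.ip_network(second_cidr)
    return first.overlaps(second)


if __name__ == "__main__":
    vault_cidr = "10.0.0.0/16"
    consumer_cidr = "172.31.0.0/16"
    if cidrs_overlap(vault_cidr, consumer_cidr):
        print("Choose a different CIDR block for the consumer VPC.")
    else:
        print("The CIDR blocks do not overlap.")
```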

Create VPC peering connection and configure security

1. To create a VPC peering connection, first create a peering request from the consumer VPC in AUA AWS Account B to peer with the Aadhaar Vault VPC in AUA AWS Account A.

2. Accept the peering connection from AUA AWS Account A.

3. Configure the route tables, network access control lists (NACLs), and security groups so that only traffic from specific IP addresses in the consumer VPC can reach the VPC endpoint in the Aadhaar Vault VPC.
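
The three steps above can be sketched with the Amazon EC2 API as follows, assuming the boto3 SDK and one EC2 client per account. All VPC, route table, and account identifiers are placeholders, the function names are hypothetical, and the NACL and security group rules from step 3 are not shown.

```python
def peering_request_params(consumer_vpc_id: str, vault_vpc_id: str, vault_account_id: str) -> dict:
    """Step 1: request sent from the consumer account (Account B)."""
    return {
        "VpcId": consumer_vpc_id,
        "PeerVpcId": vault_vpc_id,
        "PeerOwnerId": vault_account_id,
    }


def peering_route_params(route_table_id: str, destination_cidr: str, peering_id: str) -> dict:
    """Step 3: route that sends traffic for the other VPC over the peering."""
    return {
        "RouteTableId": route_table_id,
        "DestinationCidrBlock": destination_cidr,
        "VpcPeeringConnectionId": peering_id,
    }


def peer_vpcs(consumer_ec2, vault_ec2, request: dict, consumer_route: dict, vault_route: dict) -> str:
    """Run the steps with boto3 EC2 clients for Account B and Account A.

    consumer_route and vault_route hold RouteTableId and DestinationCidrBlock.
    In practice you may need to wait until the request is visible in the
    vault account before accepting it.
    """
    response = consumer_ec2.create_vpc_peering_connection(**request)
    peering_id = response["VpcPeeringConnection"]["VpcPeeringConnectionId"]
    vault_ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=peering_id)  # step 2
    consumer_ec2.create_route(VpcPeeringConnectionId=peering_id, **consumer_route)
    vault_ec2.create_route(VpcPeeringConnectionId=peering_id, **vault_route)
    return peering_id
```

Each side needs its own route: the consumer route table points at the vault CIDR, and the vault route table points back at the consumer CIDR.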

Conclusion

In this blog post, we guided you through the process of setting up an Aadhaar Data Vault on AWS using CloudHSM. Once deployed, various consumer applications can access the Aadhaar data based on fine-grained access control implemented using IAM, tokenization, private APIs, and client-side encryption. AWS CloudTrail monitors and records account activity across your AWS infrastructure, giving you control over storage, analysis, and remediation actions. Agencies can use an Amazon S3 bucket to store scanned documents and images, and enforce access control using VPC gateway endpoints and bucket policies. This enables users to build a highly available, secure, and scalable solution to meet increasing demand for Aadhaar data processing. Using CloudFormation can save time and effort when implementing the solution, and organizations can benefit from cost optimization with the AWS pay-as-you-go model. Plus, organizations can quickly get started with this cloud-based solution without deep expertise in HSM devices.

Learn more about how AWS supports the public sector in India.


Pankaj Patil

Pankaj Patil is a solutions architect working with worldwide public sector in India at Amazon Web Services (AWS). He has over 16 years of experience in the IT industry. He is passionate about geospatial technology, cloud adoption, and building highly available, secure, and scalable solutions on AWS on behalf of partners and customers.

Mandar Patil

Mandar Patil is a solutions architect leader with worldwide public sector in India at Amazon Web Services (AWS). He has over 20 years of experience in the IT industry.

Vikas Tiwari

Vikas Tiwari is a solutions architect leader with worldwide public sector in India at Amazon Web Services (AWS). He has over 20 years of experience in the IT industry.