AWS for Industries

Supporting digital signature and secure message transmissions for Brazilian Instant Payment Systems with AWS CloudHSM

Brazilian financial institutions such as banks, investment brokers, and insurance companies are authorized to use cloud services, as long as they comply with applicable legal and regulatory requirements.

AWS has a Compliance Center that offers financial institutions a central location to research cloud-related regulatory requirements and how they impact their industry. For example, Resolution 4658 establishes the cybersecurity policy and the requirements for contracting data processing, data storage, and cloud computing services that financial institutions and other institutions licensed by the Brazilian Central Bank (BACEN) must observe.

BACEN is the leading monetary and regulatory authority in the banking sector, and Brazilian financial institutions may be subject to several different legal and regulatory requirements when using cloud services. AWS also publishes a framework of five key considerations that financial institutions should focus on to help streamline the approval of cloud services for their most confidential data.

An important point to note is that when financial institutions migrate to or use AWS Cloud services, AWS is responsible for protecting the underlying infrastructure that supports the cloud, and financial institutions are responsible for everything that they deploy or connect to the cloud (Shared Responsibility Model).

Figure 1 – Shared Responsibility Model.

Instant payments are electronic money transfers in which the sending of the payment order and the availability of funds to the receiving user take place in real time. Instant payments can be used for transfers between:

  • B2B – business to business.
  • P2P – person to person.
  • P2B – person to business.
  • P2G – person to government.
  • B2G – business to government.
  • G2P – government to person.
  • G2B – government to business.

BACEN establishes the rules for the modalities and criteria for participation in the Instant Payment System (SPI) and the criteria for direct access to the Directory of Transactional Account Identifiers (DICT). The main security requirements are establishing a TLS tunnel with mutual authentication and digitally signing XML messages. AWS provides the necessary certifications and compliance standards for financial institutions to comply with the requirements of the SPI.

This blog post (part one of two) describes, step by step, how financial institutions connecting to the SPI can comply with the Brazilian Central Bank's security guidelines by building an architecture based on the AWS CloudHSM cryptographic service.

Advantages for financial institutions

Here are some advantages for financial institutions when they create and configure an architecture to connect to the SPI, using the AWS CloudHSM service:

  • Keep data encrypted at rest and in transit.
  • Digitally sign messages and establish a TLS tunnel with mutual authentication.
  • Scale throughput: each HSM supports 1,100 TPS (transactions per second), and a single AWS CloudHSM cluster can hold up to 28 HSMs.
  • Gain scalability, availability, and native integration with other AWS services.
  • Use a service managed by AWS for the storage and secure use of cryptographic keys, validated at FIPS 140-2 Level 3.
  • Reduce the operational load of server management and security, since most of the architecture is serverless.

SPI technical security specifications

Security requirements are implemented to guarantee the integrity, confidentiality, and availability of the information being transferred. The defined security structure comprises all the protection mechanisms necessary to strengthen the defense systems against undesirable actions. The requirements are divided into mandatory and recommended:

  1. All messages sent to BACEN must be authenticated using a cryptogram by the issuing financial institution (asymmetric keys required).
  2. Communication with BACEN must be encrypted.
  3. The financial institutions are responsible for the physical and logical security of access to their private key.
  4. It is recommended that the financial institution store the private key in a specialized device for the management of cryptographic keys, in order to reduce the system’s exposure to failures and other types of vulnerabilities in the environment.
  5. The financial institution must connect to the API operations available on the SPI exclusively through the HTTP protocol version 1.1 using TLS encryption version 1.2 or higher, with mandatory mutual authentication in establishing the connection.
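
As a rough illustration of requirement 5, the sketch below configures a client-side TLS context in Python that refuses anything older than TLS 1.2, offers only HTTP/1.1, and can present a client certificate for mutual authentication. The function name and the file-path parameters are hypothetical. Loading the key from a file is also an assumption made only to keep the example self-contained; in the architecture described in this post, the private key stays inside AWS CloudHSM and is used through the HSM client software.

```python
import ssl
from typing import Optional


def build_mtls_context(
    client_cert: Optional[str] = None,  # hypothetical path to the client certificate (PEM)
    client_key: Optional[str] = None,   # hypothetical path to the key; with an HSM the key is not a file
    ca_bundle: Optional[str] = None,    # hypothetical path to the CA that signed the server certificate
) -> ssl.SSLContext:
    """Return a client TLS context that enforces TLS 1.2 or higher."""
    # create_default_context enables certificate and hostname verification.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_bundle)
    # Reject TLS 1.0 and 1.1 during the handshake.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Offer only HTTP/1.1 through ALPN.
    context.set_alpn_protocols(["http/1.1"])
    if client_cert is not None:
        # The client certificate is what makes the authentication mutual.
        context.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return context


context = build_mtls_context()
print(context.minimum_version.name)
```

The context can then be passed to an HTTP client such as `http.client.HTTPSConnection(host, context=context)`.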

BACEN also defines other recommended or mandatory guidelines, covering, for example, availability, redundancy, backup, and recovery of the environment. AWS can help financial institutions meet these guidelines in order to run critical workloads, programs, solutions, and services.

Transaction flow between direct participants

The service level agreement established by BACEN for the paying user experience is 10 seconds for 99% of transactions. The figure below shows an example of the transaction flow between direct participants. There are two message formats: SPI (which uses the ISO 20022 standard) and DICT (which does not).

Figure 2 – Example of transaction flow between direct participants.
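
One way to track the 10-second target for 99% of transactions is to compare the 99th percentile of measured end-to-end latencies against the limit. The sketch below does this with a nearest-rank percentile. Reading the SLA as a 99th-percentile bound is an assumption, and the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("at least one sample is required")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_sla(latencies_s: Sequence[float], limit_s: float = 10.0, pct: float = 99.0) -> bool:
    """True when pct% of the transactions completed within limit_s seconds."""
    return percentile(latencies_s, pct) <= limit_s


# Hypothetical samples: 99 fast transactions and 1 slow one still satisfy the target.
within_target = meets_sla([0.4] * 99 + [12.0])
# Two slow transactions out of 100 do not.
outside_target = meets_sla([0.4] * 98 + [12.0] * 2)
print(within_target, outside_target)
```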

Main AWS services used in the proposed architecture

Application

AWS Fargate is a serverless computing engine for containers that works with Amazon Elastic Container Service (Amazon ECS) and with Amazon Elastic Kubernetes Service (Amazon EKS). The use of a serverless technology approach aims to focus on application development agility, reducing the operational overhead of server management, allowing the financial institution to focus on delivering value to the business.

AWS offers several tools to monitor Amazon ECS resources and respond to potential incidents, such as Amazon CloudWatch, Amazon CloudWatch Logs, AWS CloudTrail, and AWS Trusted Advisor. AWS also provides guidance on implementing logging strategies. Collect monitoring data from all parts of your AWS solution so that a multi-point failure, if one occurs, is easier to debug.

Log registration (audit)

One of the requirements of BACEN is that all Brazilian financial institutions must store the record of the request and response for each transaction. Consistent recording of transactions is essential for auditing, in addition to solving and identifying problems.

The architecture demonstrated in this blog post is event-driven. We therefore use Amazon Kinesis Data Firehose, a fully managed service that delivers streaming data in near real time, to capture the logs as events and persist them on Amazon S3.
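
To make the logging requirement concrete, the sketch below builds one audit record per transaction as a newline-terminated JSON document, a common shape for records sent to a delivery stream. The field names and the function name are hypothetical; the actual record layout depends on the table schema the financial institution defines.

```python
import json
from datetime import datetime, timezone


def build_audit_record(request_xml: str, response_xml: str,
                       status_code: int, sent_at: datetime) -> bytes:
    """Serialize one request/response pair as a single JSON line."""
    record = {
        # Timestamps are normalized to UTC so that records sort consistently.
        "sent_at": sent_at.astimezone(timezone.utc).isoformat(),
        "status_code": status_code,
        "request": request_xml,
        "response": response_xml,
    }
    # The trailing newline keeps records separable once they are batched into one object.
    return (json.dumps(record, ensure_ascii=False) + "\n").encode("utf-8")


record = build_audit_record(
    "<req/>", "<resp/>", 200,
    datetime(2020, 11, 16, 12, 0, tzinfo=timezone.utc),
)
print(record.decode("utf-8"), end="")
```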

Amazon Athena can be used to run interactive queries, with standard SQL, on the log files centralized on Amazon S3. The AWS Glue Data Catalog provides an index to the location, schema, and runtime metrics of your data. You use the information in the Data Catalog to create and monitor your ETL jobs. The information in the Data Catalog is stored as metadata tables, where each table specifies a single data store. Typically, you run a crawler to inventory the data, but there are other ways to add metadata tables to your Data Catalog.

Figure 3 – Example of ingestion pipeline: Data lake serverless with AWS Glue and Amazon Kinesis (streaming).

Cryptography and digital signature

AWS CloudHSM is a cryptography service based on hardware security modules (HSMs) that financial institutions can use to store and use their keys. It provides a cluster of single-tenant HSMs validated at FIPS 140-2 Level 3, under the exclusive control of the financial institution and attached directly to the customer's Amazon Virtual Private Cloud (Amazon VPC). This service allows you to easily use HSMs with applications that run on Amazon EC2 instances or in containers. With AWS CloudHSM, you can use standard VPC security controls to manage access to HSMs. Applications connect to HSMs using mutually authenticated SSL channels established by the HSM client software.

AWS CloudHSM can be used to support a variety of use cases, such as digital rights management (DRM), public key infrastructure (PKI), document signing, and cryptographic functions using the PKCS #11 interface, Java Cryptography Extensions (JCE), Microsoft CNG, and OpenSSL. It also provides automated availability, replication, and backup of dedicated HSMs across Availability Zones.

Features of AWS CloudHSM:

  1. Dedicated access.
  2. Generation and use of encryption keys in HSMs validated by FIPS 140-2 level 3.
  3. BYOK (bring your own key), which preserves familiarity with the customer’s current key lifecycle.
  4. Configurable option to never extract the keys generated within the HSM boundary.
  5. High availability and durability with minimal configuration.
  6. Hourly based pay-as-you-go pricing for each HSM in use.
  7. You can quickly scale the HSM capacity by adding and removing HSMs from the cluster, on demand.
  8. Integrated with AWS CloudTrail for logging AWS API requests and with Amazon CloudWatch for logging user and key management actions.
  9. Managed service, where AWS manages the hardware maintenance aspects of the service and customers fully control the cryptographic aspects of the service.

Architecture

The architecture presented in this blog post can be part of a more complete, event-based solution that covers the entire payment message transmission flow from the banking core. For example, the complete solution of the financial institution (paying or receiving) could contain other complementary architectures, such as Authorization, Undo (based on the SAGA model), Effectiveness, and Communication with an on-premises environment (hybrid environment), using other AWS services.

In the diagram of our architecture, the green box represents the communication proxy with BACEN, either in synchronous or asynchronous mode, considering the services AWS CloudHSM, AWS Fargate, and Elastic Load Balancing (ELB).

The idea of the proxy is to be a direct and mandatory path for every transaction, with the following objectives:

  1. Signature of XML messages.
  2. Establishment of the TLS tunnel with mutual authentication (mTLS).
  3. Sending the request log to the data stream.

Optionally, the financial institution can view the transactions that this solution processes using Amazon QuickSight. The serverless architecture of Amazon QuickSight allows you to provide insights to everyone in your organization. You can share interactive and sophisticated dashboards with all your users, allowing them to do detailed searches and explore data to answer questions and gain relevant insights.

If you have a separate security account, you can share the AWS CloudHSM cluster with the other AWS account.

Proxy

Refer to the diagram of the example architecture and follow, step by step, the secure message transmission flow to BACEN and the respective response:

Figure 4 – Example architecture.

  1. Store the login and password used to communicate with AWS CloudHSM in AWS Secrets Manager.
  2. Create or import the private key on AWS CloudHSM.
  3. Store the three certificates in the AWS Systems Manager Parameter Store: (1) generated key certificate for signature, (2) certificate generated for mTLS and (3) certificate for AWS CloudHSM (customer’s CA).
  4. The Service/Application sends a transaction request in XML format.
  5. Elastic Load Balancing balances load across AWS Fargate containers.
  6. Application running on AWS Fargate uses AWS CloudHSM for digital signature of XML.
  7. Application running on AWS Fargate uses AWS CloudHSM to establish mutual TLS authentication and transmit signed XML messages to BACEN.
  8. Application running on AWS Fargate receives the response from BACEN and, if necessary, validates the digital signature of the received XML.
  9. Application (AWS Fargate) records the request log by sending it directly to Amazon Kinesis Data Firehose.
  10. The reply message is sent back via the Elastic Load Balancing.
  11. The reply message is received by the Service/Application.
  12. Amazon Kinesis Data Firehose uses the AWS Glue Data Catalog to convert the logs to Apache Parquet format.
  13. Amazon Kinesis Data Firehose sends the logs to Amazon S3, already partitioned into “folders” (/year/month/day/hour/).
  14. Amazon Athena uses the AWS Glue Data Catalog as a central location to store and retrieve table metadata.
  15. AWS Glue crawlers automatically update new partitions in the metadata repository, every hour.
  16. You can immediately query the data directly on Amazon S3 using serverless analysis services, such as Amazon Athena (ad hoc with standard SQL) and Amazon QuickSight.
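
The hourly "folder" layout mentioned in step 13 can be reproduced in a few lines, which is handy when locating the objects for a given transaction time. The sketch below assumes UTC timestamps and a hypothetical base prefix named `logs`; the function names are also illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def partition_prefix(ts: datetime, base: str = "logs") -> str:
    """Return the year/month/day/hour key prefix for a timestamp, in UTC."""
    ts = ts.astimezone(timezone.utc)
    return f"{base}/{ts:%Y/%m/%d/%H}/"


def prefixes_between(start: datetime, end: datetime, base: str = "logs") -> List[str]:
    """List the hourly prefixes that cover the interval from start to end, inclusive."""
    current = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    result = []
    while current <= end:
        result.append(partition_prefix(current, base))
        current += timedelta(hours=1)
    return result


when = datetime(2020, 11, 16, 9, 30, tzinfo=timezone.utc)
prefix = partition_prefix(when)
window = prefixes_between(when, when + timedelta(hours=1))
print(prefix, window)
```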

AWS CloudHSM costs and performance

The following items should be considered for production workloads in order to optimize cost:

  • Leverage elasticity: scale the cluster up or down as the workload varies. See the AWS CloudHSM documentation for more information on monitoring and performance.
  • Maximize utilization: share a rarely used cluster of HSMs between accounts to increase cluster utilization.
  • Optimize key storage: group keys where possible.

Now, let’s use the AWS calculator to estimate the monthly cost of AWS CloudHSM in the São Paulo Region for a given TPS (transactions per second). Remember that there are no upfront costs to use AWS CloudHSM: you pay an hourly fee for each HSM from the time you start it until you terminate it. Each HSM supports 1,100 TPS (2048-bit RSA signature/verification operations with asymmetric keys).

Total hourly rate per HSM = US$ 2.72.

Initially, we can consider one AWS CloudHSM cluster with two HSMs, one in each Availability Zone. Remember that it is possible to instantiate up to 28 HSMs per AWS CloudHSM cluster.

2 x 1,100 = 2,200 TPS for operations with asymmetric keys.
2 HSMs x 730 hours per month x US$ 2.72 = US$ 3,971.20 per month.

Figure 5 – Simulation with AWS CloudHSM pricing calculator (2x HSMs).
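
The same arithmetic can be turned into a small sizing helper. The sketch below uses only the figures quoted above (1,100 TPS per HSM, US$ 2.72 per hour, 730 hours per month, up to 28 HSMs per cluster). It assumes throughput scales linearly with the number of HSMs, as the 2 x 1,100 calculation does, and the function names are hypothetical.

```python
import math

TPS_PER_HSM = 1100          # RSA 2048-bit sign/verify operations per second, per HSM
MAX_HSMS_PER_CLUSTER = 28   # cluster limit quoted in this post
HOURLY_RATE_USD = 2.72      # São Paulo Region rate quoted in this post
HOURS_PER_MONTH = 730


def hsms_needed(target_tps: int, minimum: int = 2) -> int:
    """HSMs required for target_tps, never fewer than one per Availability Zone."""
    count = max(minimum, math.ceil(target_tps / TPS_PER_HSM))
    if count > MAX_HSMS_PER_CLUSTER:
        raise ValueError("target exceeds the capacity of a single cluster")
    return count


def monthly_cost_usd(hsm_count: int) -> float:
    """Monthly cost of running hsm_count HSMs for the whole month."""
    return round(hsm_count * HOURS_PER_MONTH * HOURLY_RATE_USD, 2)


baseline = hsms_needed(2200)
baseline_cost = monthly_cost_usd(baseline)
print(baseline, baseline_cost)
```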

Conclusion

Using AWS services, financial institutions have the control and confidence necessary to run their business safely in the most flexible and secure cloud computing environment available.

BACEN’s security requirements regulate the Brazilian banking system. Thus, with AWS services, including the AWS CloudHSM encryption service, the financial institution can improve its ability to meet key security and compliance requirements, such as location, data protection, and confidentiality. AWS allows you to automate manual security tasks so that you can focus on scaling and innovating your business. In addition, financial institutions pay only for the services they use.

The example architecture presented includes the security part for digital signature and transmission (mTLS) of messages, and also includes the mandatory logging of requests.

Artifacts

You can find many code and implementation examples in the official AWS repository (aws-samples). To facilitate understanding and implementation of our solution, we share the source code.

  • All the features of the proxy implementation with AWS CloudHSM mentioned in this blog post are available in the official AWS repository (pix-proxy-samples). You can clone, change, and run it, but it should not be used as the basis for building the financial institution's final integration with BACEN (SPI and DICT).
  • All the information needed to deploy and test the example can be found directly in the README of the repository.

For the development and testing of the solution, we developed a simulator to represent the BACEN environment. The simulator requires client authentication from the proxy to satisfy the mTLS requirement. In addition, it validates the digital signature of the requests it receives and digitally signs the response sent back to the proxy.

On average, Get Claims requests get a response time of 65 ms (depending on the amount of computational resources in the container and the size of the response message), measured from the Service/Application. The flow is:

  • Service/Application → proxy (mTLS) → simulator (signs the response) (mTLS) → proxy (validates the signature of the response) (stores the request log) → Service/Application.

On average, Insert Claim requests get a response time of 109 ms (depending on the amount of computational resources in the container and the size of the request and response messages), measured from the Service/Application. The flow is:

  • Service/Application → proxy (signs the request) (mTLS) → simulator (validates the signature of the request) (signs the response) (mTLS) → proxy (validates the signature of the response) (stores the request log) → Service/Application.

Future resources

Like financial institutions, AWS solutions architects have the DNA of a builder. So we will soon publish another blog post (part two of two) with an architecture that uses another encryption service, AWS KMS.

Further Reading

  1. How to run AWS CloudHSM workloads on AWS Lambda
  2. How to deploy AWS CloudHSM to securely share your keys with your SaaS provider
  3. Liveness detection for authorization of payments
  4. How to approve AWS services for highly confidential data in financial institutions