Enforcing immutability, traceability and transparency in clinical trials using Amazon Managed Blockchain

Amazon Managed Blockchain is a fully managed service that makes it easy to create and manage scalable blockchain networks using popular open source frameworks, such as Hyperledger Fabric and Ethereum.

General availability of Hyperledger Fabric was announced in April 2019, with support for Ethereum coming soon. For additional details about Managed Blockchain, see What Is Amazon Managed Blockchain?

In this post, you will learn how to use Amazon Managed Blockchain in a life sciences clinical trial setting. We will briefly describe the problem it solves, explain why blockchain is a good solution, provide a reference architecture, and describe its use by providing example use cases.

What problem are we solving?

The clinical trial ecosystem is highly complex because it involves multiple parties and stakeholders using highly siloed applications. Example stakeholders include sponsors, investigators, patients, Institutional Review Boards (IRBs), regulatory agencies, registries, statisticians, drug suppliers, data managers, and trial monitors.

These stakeholders touch various parts of the clinical trial process, from trial protocol setup, patient enrollment, data collection, trial monitoring, data management, and data analysis to study report generation and submission. The following are major potential issues with this siloed system that may result in higher costs and, at worst, failed clinical trials:

Patient privacy, safety, and consent management.
Error-prone or incomplete transactions and records.
Reproducibility, integrity, and trustworthiness of data during data sharing.

Why blockchain?

Blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted central authority. The core principle of blockchain is that any service relying on trusted third parties can be built in a transparent, decentralized, secure “trustless” manner on top of the blockchain.

Moreover, trust is encoded in the blockchain protocol through cryptographic algorithms. This technology has the potential to provide immutability and data integrity while removing the need for a central intermediary to manage the transactions and the infrastructure.

However, a scalable blockchain network built with existing technologies is complex to set up and hard to manage. To create a blockchain network, each network member must manually provision hardware, install software, create and manage certificates for access control, and configure networking components. Once the blockchain network is running, you must continuously monitor the infrastructure and adapt to changes, such as an increase in transaction requests, or new members joining or leaving the network.

Why Amazon Managed Blockchain?

Amazon Managed Blockchain is a fully managed service that allows you to set up and manage a scalable blockchain network with just a few clicks.

Amazon Managed Blockchain eliminates the overhead required to create the network, and automatically scales to meet the demands of thousands of applications running millions of transactions. Once your network is up and running, Managed Blockchain makes it easy to manage and maintain your blockchain network. It manages your certificates and lets you easily invite new members to join the network.

A typical workflow in Amazon Managed Blockchain is pictured below:

More details on the Amazon Managed Blockchain-based Hyperledger Fabric framework are outlined below.

Managed Blockchain is fully managed, which means that all of the Hyperledger Fabric components are managed for you. A blockchain network has multiple types of nodes, a smart contract or chaincode, and a ledger containing a state database and a log of transactions. The nodes can be categorized into:

Peer nodes that maintain and update the ledger.
Orderer nodes that maintain the order of transactions, create a block, and update the peer nodes.
Client nodes that invoke transactions.
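
The division of labor between the three node types can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: the Peer, Orderer, and client_invoke names are hypothetical, signatures and endorsement policies are omitted, and nothing here uses the Hyperledger Fabric SDK.

```python
class Peer:
    """Hypothetical peer node: holds its own copy of the ledger."""

    def __init__(self, name):
        self.name = name
        self.state = {}  # this peer's copy of the state database
        self.log = []    # this peer's copy of the transaction log

    def simulate(self, tx):
        # Run the transaction without committing; return the proposed write.
        return (tx["key"], tx["value"])

    def commit(self, block):
        # Apply every transaction in the block to the local ledger copy.
        for tx in block:
            self.state[tx["key"]] = tx["value"]
            self.log.append(tx)


class Orderer:
    """Hypothetical orderer node: orders transactions and delivers blocks."""

    def __init__(self, peers):
        self.peers = peers
        self.pending = []

    def submit(self, tx):
        self.pending.append(tx)

    def cut_block(self):
        # Package pending transactions, in arrival order, and update the peers.
        block, self.pending = self.pending, []
        for peer in self.peers:
            peer.commit(block)
        return block


def client_invoke(tx, peers, orderer):
    # Client node: ask the peers to simulate, then hand the transaction over.
    results = {peer.simulate(tx) for peer in peers}
    if len(results) != 1:
        raise RuntimeError("peers disagree on the transaction result")
    orderer.submit(tx)


peers = [Peer("sponsor"), Peer("cro"), Peer("site")]
orderer = Orderer(peers)
client_invoke({"key": "patient-001/consent", "value": "signed"}, peers, orderer)
orderer.cut_block()
print(all(p.state == peers[0].state for p in peers))
```

After the block is delivered, every peer holds the same state and the same log, which is the property the rest of this post relies on.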

General clinical trials workflow

Clinical trials involve many different stakeholders. A clinical trial is initiated by a sponsor – a pharmaceutical company, biotech or med-device, academic institution, a research principal investigator, or a government body that aims to introduce a new medical treatment to patients.

Clinical research organizations (CROs) are sometimes chosen to conduct trial-related tasks and duties, such as data monitoring, site selection, patient recruitment, regulatory affairs, and overall project management of the trial. The clinical trial can be initiated only after explicit approval from an independent ethics committee and the national competent authority (CA) in the country where the trial is to be conducted.

After approval, the bulk of the actual trial activities take place at the trial sites, and consist of patient visits and treatment administration (for interventional clinical trials). Throughout a trial, all of these stakeholders must interact and collaborate in a broad range of trial-related activities, such as study approval, site monitoring, data management, safety reporting, regulatory filing, medical writing, and research dissemination.

Patient recruitment is a critical step for the success of any clinical trial. In general, a clinical study may only be conducted if the study participants have given their written informed consent to participate. The study participant has the right to withdraw their consent at any time with immediate effect. Many patients are wary of signing consent because of concerns about data misuse, unauthorized data sharing, and how their data is stored and secured. Additionally, the patient identity must be established for use during the clinical trial process.

This has important implications for the execution of the study as well as for the data analysis used for submission. When a patient is being recruited for a trial, we must track the who, what, where, and why of the recruitment. We must also track patient identity across the entire trial execution.

All of this could be managed in a distributed ledger – the goal is to make sure we replicate the data in a reliable, safe, and usable manner.

The solution consists of various stakeholders (such as a sponsor, CRO, and hospital/site) participating in a network so that patient recruitment takes place in a distributed manner, with programmable smart contracts based on the clinical trial protocol. The patient consent is encoded as a smart contract. All stakeholders run the smart contract, which updates their local version of the ledger, so the organizations’ nodes remain in sync.

Figure 1: Typical Amazon Managed Blockchain architecture for patient consent

Architecture description

In this diagram (Figure 1), the top section is a Hyperledger Fabric Network managed by Amazon Managed Blockchain.

Although the blockchain network has many components, like orderer, ledger (world-state + transaction-log), and smart contracts, you will not see these components in your AWS account as they are managed by AWS. The Hyperledger Fabric CA is a Certificate Authority (CA) for Hyperledger Fabric.

Once you are set up, what you will see in your account is the bottom section of this diagram. You will see your own virtual private cloud (VPC) and an Amazon Virtual Private Cloud (Amazon VPC) endpoint that points to the components managed by Amazon Managed Blockchain. This VPC endpoint uses AWS PrivateLink, which means that all traffic over this network connection travels over a private connection on the Amazon backbone. The traffic is protected and encrypted, and it does not go over the public internet.

Also in the VPC, you must have a client node, which is an Amazon EC2 instance configured to communicate with your blockchain network. All communication with the network goes through the client node.

As an example, one member of your blockchain network could be a clinical trial sponsor institution collecting the patient consent and data. Another member could be a CRO that monitors and manages the data. The research organization could get the data according to the consent information, which can be pulled from the ledger database via the smart contract. The smart contract runs on the peer node and can be invoked from the client node. It enables fine-grained access checks to verify the authenticity of proposed transactions. Unlike a regular database, the ledger can only be accessed or updated through the functions defined in the smart contract.

Key components of our solution

A blockchain network is a peer-to-peer network running a decentralized blockchain framework.

Members: These are the participants on the network, such as sponsors, CROs, and clinical trial sites. The creator (here, the sponsor) is the first member of the network. All other members must be added to the Managed Blockchain network via a proposal and voting process. Note that a “member” in Managed Blockchain corresponds to a Hyperledger Fabric organization.
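
The proposal and voting process can be pictured as a tally of member votes against an approval threshold. The sketch below is illustrative only: the Proposal class, the is_approved function, the member names, and the 50 percent default threshold are assumptions made for this example, not the Managed Blockchain API.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    """Hypothetical proposal to invite a new member to the network."""
    invitee: str
    votes: dict = field(default_factory=dict)  # member name -> True (yes) / False (no)

    def vote(self, member, approve):
        self.votes[member] = approve


def is_approved(proposal, members, threshold_pct=50.0):
    # Approve when the share of YES votes among all members exceeds the threshold.
    yes = sum(1 for m in members if proposal.votes.get(m) is True)
    return 100.0 * yes / len(members) > threshold_pct


members = ["sponsor", "cro", "site"]
proposal = Proposal(invitee="research-org")
proposal.vote("sponsor", True)
proposal.vote("cro", True)
proposal.vote("site", False)
print(is_approved(proposal, members))  # 2 of 3 members voted yes
```

Members that have not voted count as neither yes nor no in this sketch, so a proposal cannot pass on a single early vote in a three-member network.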

Ledger: Blockchain networks contain a distributed, cryptographically secure ledger that maintains an immutable history of the transactions in the network; it can’t be changed after the fact. Two types of data are stored in the ledger: the chained blocks of transactions, and a key-value store database that houses the world state.
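
To make the two parts of the ledger concrete, here is a minimal sketch of hash-chained blocks next to a key-value world state. It assumes SHA-256 over a JSON encoding of each block; the Ledger class and the key names are hypothetical and do not reflect Hyperledger Fabric's actual block format.

```python
import hashlib
import json


def block_hash(index, prev_hash, transactions):
    # Hash the block contents together with the previous block's hash.
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class Ledger:
    def __init__(self):
        self.blocks = []       # chained blocks of transactions
        self.world_state = {}  # key-value store holding the current values

    def append_block(self, transactions):
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        index = len(self.blocks)
        self.blocks.append({
            "index": index,
            "prev_hash": prev_hash,
            "transactions": transactions,
            "hash": block_hash(index, prev_hash, transactions),
        })
        for tx in transactions:
            self.world_state[tx["key"]] = tx["value"]

    def is_valid(self):
        # Recompute every hash and check that each block points at its predecessor.
        prev_hash = "0" * 64
        for block in self.blocks:
            expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
            if block["prev_hash"] != prev_hash or block["hash"] != expected:
                return False
            prev_hash = block["hash"]
        return True


ledger = Ledger()
ledger.append_block([{"key": "patient-001/consent", "value": "signed"}])
ledger.append_block([{"key": "patient-001/enrolled", "value": "yes"}])
print(ledger.is_valid())

# Editing a recorded transaction after the fact breaks the chain.
ledger.blocks[0]["transactions"][0]["value"] = "unsigned"
print(ledger.is_valid())
```

The world state answers "what is the current value?" quickly, while the chained blocks preserve how that value came to be.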

Peer nodes: When a member joins the network, one of the first things they must do is create at least one peer node in the membership. Each peer node hosts ledgers and smart contracts. The peer nodes also interact to create and endorse the transactions that are proposed on the network.

Client nodes: These are Amazon EC2 instances configured with open source tools, such as a CLI or SDK, in the member’s AWS environment. They are needed to configure blockchain applications on peer nodes and to identify, connect to, and interact with other network resources.

Smart contracts: Smart contracts in Hyperledger Fabric are self-executing logic, written in a machine-readable and executable language, that represents agreements or a set of rules governing transactions in a blockchain network. Smart contracts define the rules or logic of a clinical trial. For example, a clinical trial may require that a subject can be recruited only if they have signed informed consent. A smart contract function can then check that the informed consent is signed before a new patient is enrolled.
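
The informed consent rule might look like the following in outline. This is a plain Python sketch rather than Fabric chaincode: the state dictionary stands in for the world state, and the function names and key layout (consent:<id>, enrolled:<id>) are assumptions made for illustration.

```python
class ConsentError(Exception):
    """Raised when enrollment is attempted without signed informed consent."""


def record_consent(state, patient_id):
    # Record that the patient has signed the informed consent.
    state[f"consent:{patient_id}"] = "signed"


def enroll_patient(state, patient_id):
    # Enforce the trial rule: no signed consent, no enrollment.
    if state.get(f"consent:{patient_id}") != "signed":
        raise ConsentError(f"{patient_id} has not signed informed consent")
    state[f"enrolled:{patient_id}"] = True


state = {}
try:
    enroll_patient(state, "patient-001")
except ConsentError as err:
    print("rejected:", err)

record_consent(state, "patient-001")
enroll_patient(state, "patient-001")
print(state["enrolled:patient-001"])
```

Because the check lives in the contract itself, no participant can enroll a patient by writing to the ledger directly and skipping the rule.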

Amazon Virtual Private Cloud (Amazon VPC) and AWS PrivateLink: Within the blockchain network, access and authorization for each resource is governed by processes defined within the network. Outside the confines of the network – that is, from a member’s client applications and tools – Managed Blockchain uses AWS PrivateLink to ensure that only network members can access required resources. In this way, each member has a private connection from a client in their VPC to the Managed Blockchain network.

Use case: Patient consent workflow

When a sponsor creates a Managed Blockchain network, it can invite the CRO and the site to join the network. These become members of the blockchain network with voting rights to accept or reject transactions. The site then starts recruiting patients. Once a patient is recommended for the clinical trial, the transactions for that recruitment are recorded in the blockchain. The transactions may be as follows:

Patient is provided with the informed consent document.
Patient is then provided with an e-consent form for signing the consent.
Patient gets a unique identifier for signing into the e-consent.
Patient signs into e-consent and accepts the consent terms.
Individual consent from the e-consent form (e.g., one consent for clinical trials, one for collection of data, and one for storing the biosample) gets recorded as transactions.
1. Once the transaction is requested, it is broadcast to the network.
2. The network validates the transaction via known algorithms. Validation may include smart contracts and/or other records.
3. The transaction is unified with other transactions as a block of data.
4. The new block is added to the blockchain in a transparent and unalterable way.
5. The transaction is complete. Once the transaction is entered, the data cannot be deleted, and all the changes are tracked.
Every transaction (including read-only requests) is tracked in the blockchain.
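
The consent steps above can be sketched as an append-only log in which each individual consent is its own transaction and the current consent status is derived by replaying the log. The ConsentTx and ConsentLog names and the purpose labels are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsentTx:
    """One immutable consent transaction (hypothetical record layout)."""
    patient_id: str
    purpose: str   # e.g. "trial", "data_collection", "biosample_storage"
    granted: bool
    sequence: int  # position in the log


class ConsentLog:
    def __init__(self):
        self._entries = []

    def record(self, patient_id, purpose, granted):
        # Entries are only ever appended; nothing is updated or deleted.
        tx = ConsentTx(patient_id, purpose, granted, len(self._entries))
        self._entries.append(tx)
        return tx

    def history(self, patient_id):
        return [tx for tx in self._entries if tx.patient_id == patient_id]

    def current(self, patient_id):
        # Replay the history in order; later entries override earlier ones.
        status = {}
        for tx in self.history(patient_id):
            status[tx.purpose] = tx.granted
        return status


log = ConsentLog()
log.record("patient-001", "trial", True)
log.record("patient-001", "data_collection", True)
log.record("patient-001", "biosample_storage", True)
log.record("patient-001", "biosample_storage", False)  # consent withdrawn later
print(log.current("patient-001"))
print(len(log.history("patient-001")))
```

The withdrawal does not erase the earlier grant. Both entries stay in the history, which is what makes every change traceable.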

Use case: Database lock

Database lock is a necessary, industry-standard step and one of the final steps taken during a clinical trial. Locking the database prevents further editing of the data after data collection and validation. The database lock is generally frustrating for sponsors and CROs because it often leads to costly delays. This can be mitigated by using Managed Blockchain, as it provides a clear audit trail for transactions and, combined with checksums, makes data validation easier. The transactions may be as follows:

The CRO requests a database lock.
The transaction proposal request is sent to the member peer nodes of the network.
The transaction is simulated at each peer, then signed and endorsed by the endorsing peers.
These endorsements are sent to the orderer, and the transaction is checked against the agreed-upon endorsement policy.
Once the policy is satisfied, the transaction is complete. The orderer sends updates to all of the members of the network to update the ledger.
A flag is added to the ledger confirming the database lock. The lock is enforced by smart contract functions that prevent any further editing of the database and that validate the data by verifying the checksum when the data is read.
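
The lock flag and checksum check can be illustrated with a short sketch. It assumes a SHA-256 checksum computed over the sorted records at lock time; the TrialDatabase class and its method names are hypothetical stand-ins for the smart contract functions.

```python
import hashlib


class LockedError(Exception):
    """Raised when a write is attempted after the database lock."""


class TrialDatabase:
    def __init__(self):
        self.records = {}
        self.locked = False   # the lock flag
        self.checksum = None  # checksum recorded at lock time

    def _digest(self):
        # Hash the records in a stable (sorted) order.
        h = hashlib.sha256()
        for key in sorted(self.records):
            h.update(f"{key}={self.records[key]}\n".encode("utf-8"))
        return h.hexdigest()

    def write(self, key, value):
        if self.locked:
            raise LockedError("database is locked; no further edits allowed")
        self.records[key] = value

    def lock(self):
        self.checksum = self._digest()
        self.locked = True

    def read(self, key):
        # After the lock, verify the data against the stored checksum on every read.
        if self.locked and self._digest() != self.checksum:
            raise ValueError("checksum mismatch: data changed after lock")
        return self.records[key]


db = TrialDatabase()
db.write("patient-001/visit-1", "bp=120/80")
db.lock()
print(db.read("patient-001/visit-1"))
try:
    db.write("patient-001/visit-1", "bp=110/70")
except LockedError as err:
    print("rejected:", err)
```

The write guard stops edits made through the contract, and the checksum catches any change that bypasses it.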

Use case: Data analysis

After the data collection and storage, the data can be shared with research organizations and data analysts for analysis. The access control of the data is determined by the consent stored in the ledger and enforced by the smart contract functions. The transactions may be as below:

The research organization requests to be a part of the blockchain network.
The existing voting members vote, and the new member is accepted or rejected based on the voting policy.
The data is requested for analysis. The smart contract verifies the access rights stored on the ledger for requester authentication. The resulting transaction is simulated in the endorsing peers, and verified by the orderer.
Once the request is validated, the function itself queries the database to retrieve data of consented subjects, thereby sharing the data only with intended actors while maintaining a consent system.
If a subject withdraws consent for data sharing, the updated access rights on the ledger will restrict further data access.
Once the analysis is completed, the report is validated by the CRO and updated in the reports database through a smart contract function. It is also flagged in the ledger. Every request is recorded as a transaction in the ledger.
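
A consent-aware query of this kind might be sketched as follows. The function name, the shape of the access_rights and consents dictionaries, and the audit_log list are all assumptions for illustration. In a real network these values would live on the ledger and be read by the smart contract.

```python
def query_consented_data(requester, access_rights, consents, data, audit_log):
    # Check the requester's access rights before touching any data.
    if "read" not in access_rights.get(requester, set()):
        audit_log.append((requester, "denied"))
        raise PermissionError(f"{requester} has no read access")
    # Return only the records of subjects whose consent is currently granted.
    result = {pid: rec for pid, rec in data.items() if consents.get(pid, False)}
    audit_log.append((requester, "granted", sorted(result)))
    return result


access_rights = {"research-org": {"read"}}
consents = {"patient-001": True, "patient-002": True}
data = {"patient-001": {"hb": 13.9}, "patient-002": {"hb": 12.4}}
audit_log = []

print(sorted(query_consented_data("research-org", access_rights, consents, data, audit_log)))

# A subject withdraws consent; later queries no longer return their data.
consents["patient-002"] = False
print(sorted(query_consented_data("research-org", access_rights, consents, data, audit_log)))
```

Both successful and denied requests are appended to the audit log, mirroring the idea that every request is recorded as a transaction.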

Final thoughts

We have shown only a few use cases for building trust and transparency in clinical data transactions. These use cases can be extended to cover the entire clinical trial operation with Amazon Managed Blockchain by adding new channels, nodes, and users (such as IRBs, regulatory agencies, data monitors, drug supply, statisticians, physicians, and others).

Blockchain technology has the potential to help solve many challenges facing the clinical trials process, such as accurately reproducing and sharing data, privacy concerns, and patient enrollment strategies. More importantly, because the transactions are tracked chronologically and with full transparency, the data integrity, trust, and traceability are high throughout the whole clinical trials process. Amazon Managed Blockchain provides an easy way to create the infrastructure needed to use blockchain in clinical trials.
