Implement a CI/CD pipeline for Ethereum smart contract development on AWS – Part 1

Continuous integration and continuous delivery (CI/CD) automates software development workflows so that teams can ship higher-quality software with fewer bugs and failed releases. It removes the manual intervention traditionally needed to move code changes from development environments to production servers: with a CI/CD pipeline, code changes are automatically detected, built, tested, and pushed to production environments. CI/CD is an essential part of DevOps in any organization that practices modern software development. It helps development, security, and operations teams work together closely and efficiently, and it reduces tedious manual work and legacy approval processes, freeing DevOps teams to be more innovative.

Many modern applications are complex, and that’s certainly true of decentralized applications built on blockchain technologies. Although many blockchain development teams understand the benefits of CI/CD, a lack of guidance and usable artifacts prevents many of them from incorporating it into their development workflow. This blog series addresses that gap by providing guidance and ready-to-use artifacts for implementing a CI/CD pipeline for blockchain application development.

This is the first post of a two-part series. In this post, we give an overview of the architecture and workflow of the smart contract CI/CD pipeline. The second post walks you through a complete AWS Cloud Development Kit (AWS CDK) implementation of the pipeline. The AWS CDK is a framework for defining cloud infrastructure in code.

Blockchain and smart contracts

Blockchain is a distributed immutable ledger technology that facilitates the process of recording transactions and tracking assets amongst a distributed set of participants. The distributed ledger is shared among many computer nodes that are in sync with each other using a consensus protocol.

Ethereum was the first blockchain network to introduce the concept of smart contracts. A smart contract is executable code that is distributed across all the computer nodes participating in the blockchain network. In a public blockchain network like Ethereum, a smart contract makes business logic and workflows visible to any participant in the network, which helps establish trust among the participants.

Although Ethereum was the first network to introduce smart contracts, most blockchains have since incorporated the concept. This post focuses on developing smart contracts for Ethereum and other Ethereum Virtual Machine (EVM) compatible networks. The EVM is the execution environment for smart contracts in Ethereum, and several other blockchain networks implement the same execution environment.

Smart contracts for EVM are written in purpose-built languages, of which Solidity is the most popular. Even though all the sample code accompanying this post uses Solidity as the language for the smart contract development, the techniques presented in this post can be applied to other languages.

Smart contract development

Smart contract development involves writing the code in a language like Solidity, compiling it using the appropriate tools, and then deploying it on a blockchain network for testing and debugging. There are three main types of blockchain networks where the smart contract code can be deployed:

Ethereum Mainnet – This is the production Ethereum network and requires real Ether, the Ethereum digital currency, to deploy the smart contract. In every blockchain network, its underlying digital currency is used to pay for the compute and storage resources needed to operate the blockchain network. Amazon Managed Blockchain (AMB), a fully managed blockchain service, offers dedicated nodes to connect to the Ethereum Mainnet.
Ethereum Testnets – These test networks don’t require real Ether. Instead, deploying a smart contract requires test Ether, which can be obtained for free from each network’s faucet. Test Ether has no monetary value, but faucets dispense only limited amounts. Developers can use Testnets to test and integrate their smart contracts with third-party smart contracts. Amazon Managed Blockchain supports many popular test networks, including Goerli. We discuss deploying to the Goerli network in the context of CI/CD in this post.
Development networks – Because both Testnets and Mainnet require Ether to deploy contracts and write to the blockchain, they are impractical for day-to-day development and debugging. For this reason, developers widely use a development network, which usually runs on a single computer. A development network can provide an unlimited amount of test Ether and is therefore ideal for debugging and testing. Ganache is one such development network. For a complete step-by-step guide to setting up Ganache and other blockchain development tools, refer to Develop a Full Stack Serverless NFT Application with Amazon Managed Blockchain – Part 1.

Smart contract development requires developers to learn a new programming language as well as specialized tools and IDE extensions designed for smart contract development. On a large team building a decentralized application (dApp), it’s common for one group of developers to work on the smart contract while another group works on the front-end or middle-tier components. Because most blockchain development networks run on a single computer at a localhost endpoint, it is challenging to support a multi-developer environment in which a front-end or middle-tier application needs to connect to the network running the smart contract. Another challenge for multi-developer teams is agreeing on a single framework for smart contract development. Truffle and Hardhat are two very popular development frameworks, but most developers choose one or the other, which makes it difficult to combine their work in an integrated build and test environment. A CI/CD pipeline for smart contract development addresses both of these challenges.

Solution overview

The following diagram shows the AWS services that work together to support the CI/CD pipeline for smart contract development.

In the following sections, we discuss the key components in more detail.

Compute

The first component in the CI/CD infrastructure is a blockchain development network that can support many developers connecting to it for integration testing. In this reference architecture, we use Hyperledger Besu as a blockchain development network. It’s configured to run on Amazon Elastic Container Service (Amazon ECS). Hyperledger Besu is an Ethereum client designed to be enterprise-friendly for both public and private network use cases, with an EVM implementation. Amazon ECS is a fully managed container orchestration service that simplifies deployment, management, and scaling of containerized applications. Amazon ECS comes in two offerings: one that requires configuring Amazon Elastic Compute Cloud (Amazon EC2) instances and the other, AWS Fargate, which is a serverless offering.
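One way a developer or a build step might confirm that the shared Besu node is reachable is to call the standard Ethereum JSON-RPC method eth_blockNumber. The following is a minimal sketch of building that request and decoding the reply. The helper names are illustrative, the node endpoint is deployment-specific and not shown, and a canned response is used so that the sketch runs without a node.

```python
import json


def build_rpc_request(method, params=None, request_id=1):
    """Return the JSON body for an Ethereum JSON-RPC 2.0 call."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    })


def parse_block_number(response_body):
    """Convert the hex string result of eth_blockNumber into an int."""
    response = json.loads(response_body)
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    return int(response["result"], 16)


# Canned response for illustration; a real reply would come from the node.
request_body = build_rpc_request("eth_blockNumber")
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x1b4"}'
print(parse_block_number(sample_response))  # 0x1b4 is 436 in decimal
```

In practice, the request body would be sent as an HTTP POST to the node’s RPC endpoint with a JSON content type.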

In the above solution architecture, functions in the smart contract are invoked by code running in AWS Lambda, a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. The CI/CD pipeline uses AMB to connect to both the Ethereum Mainnet and Goerli test network.

Storage

Like any other blockchain network, Hyperledger Besu has an underlying ledger database. In the reference implementation, Amazon Elastic File System (Amazon EFS) stores the ledger database and all the configuration files needed to run Hyperledger Besu. Amazon EFS is a simple, serverless, elastic, set-and-forget file system that automatically grows and shrinks as you add and remove files, with no need for management or provisioning. You can use Amazon EFS with Amazon EC2, Lambda, Amazon ECS, Amazon Elastic Kubernetes Service (Amazon EKS), and other AWS compute instances, or with on-premises servers. Separating compute from storage when running Hyperledger Besu offers a high level of scalability and availability. Before the Besu network starts, AWS DataSync copies the Besu configuration files from Amazon Simple Storage Service (Amazon S3) to Amazon EFS. DataSync is an online data movement and discovery service that simplifies data migration and helps you quickly, easily, and securely transfer your file or object data to, from, and between AWS storage services.

DevOps

An integral part of any CI/CD implementation is a code repository that stores the code changes submitted by the developers working on an application. AWS CodeCommit is a version control service that allows you to store and manage your Git repository. When developers finish testing their code locally, they push the unit-tested code to CodeCommit, which starts the CI/CD pipeline.

Much of the automation of building, testing, and deploying smart contract code to different blockchain networks is performed by AWS CodeBuild. CodeBuild is a fully managed build service that compiles your source code, runs tests, and produces artifacts that are ready to deploy.

Whenever a smart contract is deployed on the blockchain network, a new contract address is created, and that address is used to connect to the deployed smart contract. Any code that invokes functions in the smart contract needs access to the smart contract address as well as the smart contract’s application binary interface (ABI). An ABI defines the functionality that one binary program exposes to another. In the context of a smart contract, it is the functionality that the smart contract implements and that code outside the smart contract can invoke. In the reference implementation, the REST API layer implemented in Lambda invokes the smart contract and therefore needs access to the contract address and ABI. As part of the CI/CD process, when CodeBuild deploys a smart contract to a blockchain network, it also updates the Lambda code with the latest contract ABI and address.
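To make the ABI and address pairing concrete, the following sketch reads both values from a compiled build artifact. It assumes a Truffle-style artifact layout (a top-level abi list and a networks object keyed by network ID); other frameworks record the deployed address differently. The sample artifact, the network ID 1337, and the address are hypothetical placeholders.

```python
import json


def extract_abi_and_address(artifact_json, network_id):
    """Return (abi, address) from a Truffle-style build artifact.

    Assumes a top-level "abi" list and a "networks" object keyed by
    network ID, where each entry holds the deployed "address".
    """
    artifact = json.loads(artifact_json)
    abi = artifact["abi"]
    deployment = artifact.get("networks", {}).get(str(network_id))
    if deployment is None:
        raise KeyError(f"no deployment recorded for network {network_id}")
    return abi, deployment["address"]


# Hypothetical artifact for illustration only.
sample_artifact = json.dumps({
    "contractName": "AssetToken",
    "abi": [{"type": "function", "name": "totalSupply", "inputs": [],
             "outputs": [{"name": "", "type": "uint256"}],
             "stateMutability": "view"}],
    "networks": {"1337": {"address": "0x" + "ab" * 20}},
})

abi, address = extract_abi_and_address(sample_artifact, 1337)
print(abi[0]["name"], address)
```

The ABI entry lists the function name, inputs, outputs, and mutability, which is the information a caller needs to encode a function call against the contract address.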

Lastly, AWS CodePipeline defines the CI/CD automation workflow by integrating CodeCommit and CodeBuild to automate the entire release pipeline.

Security

Several AWS Identity and Access Management (IAM) policies and roles grant the AWS services in the CI/CD pipeline the access they need to perform their tasks. These roles and policies, and the specific access they define, are discussed in detail in the second part of this series.

Deploying a smart contract to a blockchain network always requires a wallet account with a sufficient Ether balance to pay for the deployment. A development network like Ganache or Besu creates test accounts with an arbitrary amount of Ether for testing and debugging purposes. Regardless of which blockchain network a smart contract is deployed to, CodeBuild needs access to the private key of the wallet account that deploys it. In this solution, we use AWS Secrets Manager to store the mnemonic of a hierarchical deterministic (HD) wallet, from which CodeBuild derives the private key of the deploying account. Secrets Manager helps you manage, retrieve, and rotate database credentials, API keys, and other secrets. Anyone with access to the mnemonic can derive all the private keys associated with it, so take extreme care to protect the mnemonic string.
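As a rough sketch of the retrieval step, the following function reads the mnemonic through a client that behaves like the boto3 Secrets Manager client, using its get_secret_value call. The secret name, the JSON key mnemonic, and the fake client are assumptions made for illustration; the fake client returns a widely published development-only phrase so that the sketch runs without AWS access. Deriving the private key from the mnemonic requires a BIP-39/BIP-32 library and is not shown.

```python
import json


def load_mnemonic(secrets_client, secret_id, key="mnemonic"):
    """Fetch the HD wallet mnemonic from Secrets Manager.

    secrets_client is expected to behave like boto3.client("secretsmanager").
    The secret is assumed to be a JSON string with the phrase under `key`.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    mnemonic = secret[key]
    # BIP-39 phrases have 12, 15, 18, 21, or 24 words.
    if len(mnemonic.split()) not in (12, 15, 18, 21, 24):
        raise ValueError("unexpected mnemonic word count")  # never log the phrase
    return mnemonic


class FakeSecretsClient:
    """Stand-in for the boto3 client so the sketch runs without AWS access."""

    def get_secret_value(self, SecretId):
        words = " ".join(["test"] * 11 + ["junk"])  # public development phrase
        return {"Name": SecretId, "SecretString": json.dumps({"mnemonic": words})}


# "besu/deployer-mnemonic" is a hypothetical secret name.
phrase = load_mnemonic(FakeSecretsClient(), "besu/deployer-mnemonic")
print(len(phrase.split()), "words loaded")
```

Printing only the word count, rather than the phrase, keeps the mnemonic out of build logs.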

CI/CD pipeline workflow

In this section, we discuss the logical flow of how the smart contract CI/CD pipeline is implemented. Part 2 of this series walks you through the AWS CDK code that implements this entire pipeline.

The following diagram shows the complete CI/CD pipeline.

CI-CD Flow

Step 1: Developer commits code

Each developer works in their own development environment, using their choice of IDE and development framework (such as Truffle or Hardhat). When they have finished unit testing their code, they push it to the Git repository on CodeCommit. The repository contains the smart contract code (.sol files) in a contract folder and a package.json file that lists the smart contract code’s dependencies. Any test scripts go in the test folder. Figure 1 shows what the code repository may look like.

Figure 1: Smart contract code
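A build step could check that a pushed repository has the expected layout before compiling. The following sketch looks for at least one .sol file in the contract folder and for a package.json file. The function name and the default folder name contracts are assumptions; pass whatever names your repository actually uses.

```python
import tempfile
from pathlib import Path


def missing_repo_items(root, contracts_dir="contracts", manifest="package.json"):
    """Return the expected items that are absent under root."""
    root = Path(root)
    missing = []
    contracts = root / contracts_dir
    if not contracts.is_dir() or not any(contracts.glob("*.sol")):
        missing.append(f"{contracts_dir}/*.sol")
    if not (root / manifest).is_file():
        missing.append(manifest)
    return missing


# Build a throwaway repository to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "contracts").mkdir()
    (repo / "contracts" / "AssetToken.sol").write_text("// SPDX-License-Identifier: MIT\n")
    before = missing_repo_items(repo)   # package.json not created yet
    (repo / "package.json").write_text("{}")
    after = missing_repo_items(repo)    # layout is now complete

print(before, after)
```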

Step 2: CodePipeline triggers CodeBuild to compile the smart contract

For the CI/CD pipeline to start the release process, the code must be in the master/main branch. If an approval process governs pushes to the master/main branch, incorporate it into the pipeline workflow. CodePipeline monitors the Git repository, and a push to the main branch triggers CodeBuild to start compiling the smart contract. Figure 2 shows this part of the pipeline workflow.

Figure 2: CI/CD Step 1

Step 3: CodeBuild compiles and deploys the smart contract to Hyperledger Besu

CodeBuild compiles the smart contract, which generates a new smart contract ABI. If the smart contract compiles without errors, CodeBuild deploys it to the Besu blockchain network. Deployment requires the private key of the account that deploys the code; CodeBuild derives this key from the HD wallet mnemonic stored in Secrets Manager. When the deployment completes, CodeBuild obtains the address of the deployed contract. Figure 3 shows this step.

Figure 3: CI/CD Step 3
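The control flow of this step can be sketched as a small function: compile first, deploy only if compilation succeeds, and keep the ABI and contract address for later steps. The function name and the two callables are hypothetical stand-ins for whatever framework commands (for example, Truffle or Hardhat) the build actually runs; the stubs exist only so that the sketch is runnable.

```python
def run_build_stage(compile_contract, deploy_contract):
    """Compile, then deploy only if compilation succeeded.

    compile_contract() returns (ok, abi); deploy_contract(abi) returns the
    deployed contract address. Both are placeholders for framework calls.
    """
    ok, abi = compile_contract()
    if not ok:
        # Stop here so a broken contract never reaches the network.
        return {"status": "compile_failed", "abi": None, "address": None}
    address = deploy_contract(abi)
    return {"status": "deployed", "abi": abi, "address": address}


# Stub callables standing in for the real compiler and deployer.
def fake_compile():
    return True, [{"type": "function", "name": "totalSupply"}]


def fake_deploy(abi):
    return "0x" + "12" * 20


result = run_build_stage(fake_compile, fake_deploy)
failed = run_build_stage(lambda: (False, None), fake_deploy)
print(result["status"], failed["status"])
```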

Step 4: CodeBuild runs test scripts

After the smart contract is deployed to Hyperledger Besu, CodeBuild runs any tests found in the test folder of the smart contract’s Git repository. The sample implementation in the second part of the series uses the Chai and Mocha test frameworks to define test scripts.

Step 5: CodeBuild updates the Lambda function

CodeBuild updates the Lambda function that is part of the REST API layer. In addition to the smart contract, the Git repository contains the code that forms the REST API layer. The AWS CDK code sample that Part 2 will discuss has the code for the REST API layer.

As shown in Figure 4, the index.mjs file contains the Lambda handler, and the AssetToken.json file is the contract ABI file that index.mjs needs to invoke functions in the smart contract.

Figure 4: CI/CD Lambda code

In this step, CodeBuild updates the Lambda function with the correct ABI stored in AssetToken.json. It also updates the Lambda function’s environment variable that contains the smart contract address.
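The environment variable update can be sketched as follows, using a client that behaves like the boto3 Lambda client and its get_function_configuration and update_function_configuration calls. Because the Environment parameter is supplied as a whole map of variables, the sketch reads the current variables first and merges the new address into them. The variable name CONTRACT_ADDRESS, the function name, and the fake client are assumptions for illustration. Updating the ABI file itself means repackaging the function code, which is not shown.

```python
def set_contract_address(lambda_client, function_name, address,
                         variable="CONTRACT_ADDRESS"):
    """Update one environment variable while preserving the others."""
    current = lambda_client.get_function_configuration(FunctionName=function_name)
    variables = dict(current.get("Environment", {}).get("Variables", {}))
    variables[variable] = address
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": variables},
    )
    return variables


class FakeLambdaClient:
    """Stand-in for boto3.client("lambda") so the sketch runs offline."""

    def __init__(self):
        self.env = {"RPC_ENDPOINT": "http://besu.internal:8545"}  # hypothetical

    def get_function_configuration(self, FunctionName):
        return {"FunctionName": FunctionName,
                "Environment": {"Variables": dict(self.env)}}

    def update_function_configuration(self, FunctionName, Environment):
        self.env = dict(Environment["Variables"])
        return {"FunctionName": FunctionName, "Environment": Environment}


client = FakeLambdaClient()
updated = set_contract_address(client, "asset-token-api", "0x" + "12" * 20)
print(sorted(updated))
```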

Step 6: A manual approval process triggers the next stage of the pipeline

The sample implementation in the second part of the series requires a manual approval before the pipeline moves from the development stage to the Goerli deployment stage, but organizations can choose to make this transition without a manual approval.

Step 7: Smart contract is deployed to Testnet or Mainnet using AMB

The Besu network provides the infrastructure for integration testing in the development stage. The next stage in the pipeline could deploy the smart contract to a test network like Goerli or to Mainnet. In the sample implementation shown in Part 2 of this series, the smart contract is deployed to the Goerli network after a manual approval.
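Because the same deployment code can target different networks, a deployment script may want to confirm which network a node endpoint belongs to before sending a transaction. The following sketch compares the result of the standard eth_chainId JSON-RPC call with the chain ID expected for the stage. The chain IDs 1 (Mainnet) and 5 (Goerli) are the published values; the function name and the hard-coded sample result are illustrative.

```python
CHAIN_IDS = {"mainnet": 1, "goerli": 5}


def assert_expected_network(chain_id_hex, expected_network):
    """Raise if the node's eth_chainId result does not match the target stage."""
    actual = int(chain_id_hex, 16)
    expected = CHAIN_IDS[expected_network]
    if actual != expected:
        raise RuntimeError(
            f"connected to chain {actual}, expected {expected} ({expected_network})"
        )
    return actual


# "0x5" stands in for the result field of a real eth_chainId response.
print(assert_expected_network("0x5", "goerli"))
```

A guard like this makes it harder to spend real Ether on Mainnet when the pipeline stage was meant to target a test network.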

Conclusion

This concludes Part 1 of this series. In this post, we gave an overview of the AWS infrastructure needed to implement a CI/CD pipeline for an EVM-compatible smart contract. We also walked through, step by step, how the CI/CD pipeline works.

In Part 2 of this series, we show an end-to-end implementation of this CI/CD pipeline. We demonstrate how to create the entire infrastructure using the AWS CDK and explain each component of the AWS CDK code in detail.

About the Author

Rafia Tapia is a Blockchain Specialist Solutions Architect. With more than 27 years of experience in software development and architecture, she has a keen interest in developing design patterns and best practices for smart contracts and blockchain technologies.