Blockchain node deployment on AWS: A comprehensive guide

In the evolving landscape of blockchain technology, you need access to a node to interact with a blockchain, so understanding how to deploy nodes on AWS is essential. In this post, we provide an overview of the role nodes serve in blockchain networks, cover the spectrum of available node types, discuss use cases, and present a systematic guide to deploying these nodes on AWS.

Blockchains, often described as Distributed Ledger Technology, function across peer-to-peer networks. Essentially, they act as databases where the data is cryptographically signed and chronologically stored in ‘blocks’ that are linked together. Nodes, typically computers or devices, participate in these networks. As we cover later, there are different node types that serve different roles. Some facilitate access to decentralized applications (DApps), and some (typically validator or miner nodes) help secure the network and process transactions, keeping the ledger synchronized through consensus. Consensus mechanisms achieve state synchronization between nodes by combining transaction data into blocks in a particular order and in a deterministic way. Only consensus nodes create and distribute blocks; other node types relay the blocks that consensus nodes create. The more nodes that are present in a network, the more resilient it is.
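
To make the idea of linked blocks and deterministic ordering concrete, the following is a minimal Python sketch, not the format of any real blockchain. The block layout, the function names, and the sample transactions are all assumptions made for illustration. Each block stores the hash of the previous block, so changing any earlier transaction breaks the link that follows it.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block deterministically: sorted keys and fixed separators."""
    encoded = json.dumps(block, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash: str, transactions: list) -> dict:
    """Build a block that points at its predecessor by hash."""
    return {"prev_hash": prev_hash, "transactions": list(transactions)}


def verify_chain(chain: list) -> bool:
    """Check that every block references the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical transactions; the order inside a block is part of its hash.
genesis = make_block("0" * 64, ["alice->bob:5"])
second = make_block(block_hash(genesis), ["bob->carol:2", "carol->dave:1"])
chain = [genesis, second]
```

Two nodes that apply the same transactions in the same order compute the same block hash, which is what lets them agree on state. Reordering the transactions, or editing an earlier block, produces a different hash and `verify_chain` fails.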

Whether you are an individual seeking to engage with validator nodes in Proof of Stake frameworks, an enterprise client or DApp developer harnessing the power of Layer 1 (L1 – the base blockchain used by developers to build applications on) or Layer 2 (L2 – scaling solutions that handle activities off L1 blockchains to ease their transactional loads) blockchain protocols, or an organization overseeing L1/L2 networks and ensuring their robust blockchain infrastructure, this post serves as an essential reference to navigate your endeavors.

Different types of nodes in blockchain networks

Different L1 and L2 blockchain protocols might employ distinct terminologies for their node types, making it imperative to consult their specific documentation for accurate details.

For instance, in the Ethereum network, nodes can be categorized into three primary types: full nodes, archive nodes, and light nodes. Each of these Ethereum node types has specific functionalities and requirements. Additionally, Ethereum supports multiple clients, such as Geth and Erigon, developed and maintained independently, enhancing the network’s resilience against potential vulnerabilities.

Similarly, Bitcoin’s network classifies nodes into four main categories: full nodes, listening nodes (also known as super nodes), miner nodes, and lightweight or SPV clients. Each of these node types serves a unique purpose within the Bitcoin network.

The following figure summarizes the blockchain node types.

Illustration by Ian Holtz, AWS

To understand the various blockchain node types and their applications, let’s examine them in more detail:

Full nodes – Full nodes are more common for general use cases that require validation of the blockchain and storing recent state. Full nodes generally cannot vote, unless they are participating in a network that relies on proof of authority.
Archive nodes – These nodes store the entire blockchain history from the genesis block, including all transactions ever made and every state of the blockchain at every block. Archive nodes serve specific use cases that require unlimited historical visibility.
Validator (staking) nodes – These participate in Proof of Stake blockchains, like Ethereum. They’re a special type of full node that participates in consensus by verifying, voting on, and maintaining a record of transactions. For most blockchains, staking is the process of locking up tokens on a blockchain network to earn token rewards (or yield) in exchange for securing the network. The reward for running a validator varies depending on the L1 or L2 blockchain, so check the expected yield before launching a validator and staking your tokens.
Mining nodes – These participate in Proof of Work blockchains like Bitcoin.
Authority nodes – These nodes are used by consensus algorithms for networks that aren’t fully decentralized, including Delegated Proof of Stake and Proof of Authority. In these networks, either the development team will decide how many authority nodes are needed and who will run them, or the community votes on the decision. The task of these nodes is the same as full nodes in other networks.
Master nodes – These maintain and validate transactions but don’t have the authority to add new blocks to the ledger.
Pruned nodes – These nodes only have the latest dataset or block and a summarization of everything before.
Lightweight nodes – These store and provide the necessary data to accommodate daily activities or faster transactions.
Special nodes – There are two additional types of special nodes:
- Super nodes – These are configured to carry out specific functions such as running software updates or maintaining the rules.
- Lightning nodes – These are used when creating a separate network from the main blockchain. They’re used for faster and cost-effective transactions.
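
The storage difference between archive, full, and pruned nodes can be pictured with a toy model. The following Python sketch is an illustration only: the `StateStore` class and the retention values are hypothetical, and real clients store much more than one snapshot per block (for example, pruned nodes also keep a summary of earlier history).

```python
from collections import OrderedDict
from typing import Optional


class StateStore:
    """Toy per-block state store; keep_last=None keeps every snapshot."""

    def __init__(self, keep_last: Optional[int] = None):
        self.keep_last = keep_last
        self.states = OrderedDict()  # block height -> state snapshot

    def add_block(self, height: int, state: dict) -> None:
        self.states[height] = state
        if self.keep_last is not None:
            while len(self.states) > self.keep_last:
                self.states.popitem(last=False)  # drop the oldest snapshot

    def state_at(self, height: int) -> Optional[dict]:
        return self.states.get(height)


archive = StateStore()            # every state since the genesis block
full = StateStore(keep_last=128)  # recent state only; 128 is an arbitrary value
pruned = StateStore(keep_last=1)  # latest state only

for height in range(1000):
    snapshot = {"height": height}
    for store in (archive, full, pruned):
        store.add_block(height, snapshot)
```

In this model only `archive` can answer a query about the state at block 0, which mirrors why use cases that need unlimited historical visibility call for an archive node.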

Potential use cases for various node types

In this section, we discuss potential use cases for different node types.

Note: in this section on use cases, we refer to RPC nodes. These are nodes, such as full, archive or light nodes, that provide access to Remote Procedure Calls (RPCs).
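
As an example of what calling an RPC node looks like, the following Python sketch builds a JSON-RPC 2.0 request for the Ethereum `eth_blockNumber` method and decodes the hex-encoded result. It assumes an Ethereum-compatible node; other protocols expose different methods. The helper names are hypothetical, and the sketch deliberately leaves out the HTTP call, because the endpoint URL and authentication depend on how your node is deployed.

```python
import json
from itertools import count

_request_ids = count(1)  # each request carries an id that the response echoes


def build_rpc_request(method: str, params=None) -> str:
    """Serialize a JSON-RPC 2.0 request body for an RPC node."""
    payload = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params or [],
    }
    return json.dumps(payload)


def parse_block_number(response_body: str) -> int:
    """Decode the hex quantity that eth_blockNumber returns."""
    response = json.loads(response_body)
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    return int(response["result"], 16)


request_body = build_rpc_request("eth_blockNumber")
# A node would answer an HTTP POST of request_body with a body like this one.
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x10"}'
latest_block = parse_block_number(sample_response)
```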

Yield generation through staking

This typically involves the operation of validator and Remote Procedure Call (RPC) nodes. Although running a personal full node isn’t mandatory, it’s a widely accepted best practice among validators for several reasons:

Validators handle block production. To do this, they require access to the chain’s current state, which they get by querying the full node. Although it’s possible to query a remotely managed full node, many validators opt to manage their own for enhanced security and reliability, and to promote network health.
Validators receive transactions from full nodes, validate these transactions, produce the block, and return it to the chain. This exchange occurs via a full node, further underscoring the preference many validators have for managing their own full nodes.

AppChains: Running applications on private blockchains

If the objective is to run applications on a dedicated enterprise-grade blockchain, without vying for resources on a public blockchain, AppChains could be the ideal solution. For instance, the Cosmos AppChain is an exemplary solution for crafting AppChains equipped with advanced features. This usually necessitates operating both validator and RPC nodes. To learn more, refer to: Use Cosmos technology to deploy an enterprise consortium chain on AWS.

Data solutions built on top of blockchains

For those constructing specialized data solutions, such as indexers or data warehouses, which necessitate continuous polling of RPC nodes to retrieve and archive data, operating an independent node can lead to cost savings, lower latency (especially when your polling stack is co-located with the RPC node), and possibly better performance.
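
The polling pattern itself is simple. The following Python sketch is one possible shape for it, under the assumption that you supply a `get_latest_height` callable that asks your RPC node for the current block height. The function name, its parameters, and the simulated heights are hypothetical.

```python
import time
from typing import Callable, Iterator, Optional


def poll_new_heights(
    get_latest_height: Callable[[], int],
    start_height: int,
    interval_seconds: float = 2.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[int]:
    """Yield each block height exactly once, in order, as the chain grows."""
    next_height = start_height
    polls = 0
    while max_polls is None or polls < max_polls:
        latest = get_latest_height()
        while next_height <= latest:
            yield next_height  # the caller fetches and archives this block
            next_height += 1
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)


# Simulated node responses so the sketch runs without a network connection.
simulated_heights = iter([3, 3, 5])
indexed = list(
    poll_new_heights(
        lambda: next(simulated_heights),
        start_height=2,
        max_polls=3,
        sleep=lambda _seconds: None,
    )
)
```

Every poll is a round trip to the node, so when the indexer runs next to its own RPC node, each iteration of this loop completes faster and does not count against a provider's request quota.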

Smart contracts: Deploying applications on public blockchains

If the aim is to create and implement smart contracts on a public blockchain, an RPC node becomes essential for transmitting transactions to the chain. The decision boils down to whether you should run your own RPC node, either by self-managing it on Amazon Elastic Compute Cloud (Amazon EC2) or by using AWS native offerings such as Amazon Managed Blockchain, or outsource it to third-party node providers like Alchemy, Blockdaemon, Infura, Chainstack, Quicknode, or others (refer to the next section for more information on these deployment options). Although third-party services might offer a cost-effective and swift setup for startups and developers, as operations scale, managing your own node could become more advantageous. This approach offers increased transaction throughput, removes the need to share resources with other users, and avoids potential rate limiting. Additionally, controlling your own node gives you more control over transactions, particularly over the mempool. Furthermore, with a dedicated node, integration into downstream blockchain data analytics pipelines becomes feasible, using AWS services such as Amazon Kinesis Data Streams, Amazon Managed Service for Apache Flink, Amazon Managed Streaming for Apache Kafka (Amazon MSK), and Amazon Simple Storage Service (Amazon S3). In all cases, you can use the Amazon Managed Blockchain (AMB) Query service to access commonly requested blockchain data, such as full historical balances and transactions, with sub-second latency.
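
As one possible first step of such an analytics pipeline, the following Python sketch forwards a block read from your node to Amazon Kinesis Data Streams. It calls `put_record` with the `StreamName`, `Data`, and `PartitionKey` parameters of the boto3 Kinesis client. The stream name, the block fields, and the `RecordingClient` stand-in are assumptions made so that the sketch runs without AWS credentials.

```python
import json


class RecordingClient:
    """Stand-in for boto3.client("kinesis") that records calls locally."""

    def __init__(self):
        self.records = []

    def put_record(self, **kwargs):
        self.records.append(kwargs)
        return {"SequenceNumber": str(len(self.records))}


def publish_block(kinesis_client, stream_name: str, block: dict) -> dict:
    """Send one block to a data stream, partitioned by its block hash."""
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(block).encode("utf-8"),
        PartitionKey=block["hash"],
    )


# Hypothetical stream name and block; with AWS access you would pass
# boto3.client("kinesis") instead of the stand-in.
client = RecordingClient()
publish_block(
    client,
    "example-block-stream",
    {"number": 16, "hash": "0xabc", "transactions": []},
)
```

Partitioning by block hash is just one choice; it spreads records across shards, whereas a constant key would keep strict ordering at the cost of using a single shard.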

Deployment options on AWS for various node types

We can split the deployment options into three main categories:

AWS native offerings, such as Amazon Managed Blockchain, AWS Solutions Library, AWS Blockchain Node Runners, and AWS Marketplace
Self-managed deployment, such as constructing the node independently on Amazon Elastic Compute Cloud (Amazon EC2)
Managed services, such as third-party node service providers

AWS native offerings

In this section, we discuss various AWS native offerings.

Amazon Managed Blockchain

Managed Blockchain offers a seamlessly managed, resilient, and scalable blockchain node infrastructure. This facilitates the development of blockchain applications without the hassle of handling foundational compute, storage, and networking aspects. Options include:

Ethereum – For Ethereum-based projects, dedicated, single-tenant Ethereum nodes through Managed Blockchain allow seamless interaction with Ethereum’s mainnet and select testnets. As of this writing, the only supported node type is the Full node (Geth), integrating the Geth execution client with the Lighthouse consensus client.
Bitcoin and Polygon – For Bitcoin and Polygon integrations, users can tap into a serverless, multi-tenant service with instant access to Bitcoin and Polygon RPCs, with the cost model tied to API utilization.
Hyperledger Fabric – For those aiming to construct a private blockchain, Managed Blockchain offers private Hyperledger Fabric solutions.

For further insights, refer to the catalog of Managed Blockchain blog posts.

Managed Blockchain also offers Amazon Managed Blockchain (AMB) Query. This provides serverless access to standardized, multi-blockchain datasets with developer-friendly APIs.

AWS Solutions Library and AWS Blockchain Node Runners

AWS Solutions Library contains reference architectures. With AWS Solutions Library, you can learn best practices for running nodes on AWS and study architectural guidance on how to construct highly available architectures, add monitoring, observability, and logging to deployed nodes, upgrade nodes, and more.

Node Runners is an open-source initiative to develop infrastructure as code (IaC) applications to run various chains on AWS. With AWS Blockchain Node Runners, anyone can access a range of IaC applications and deployment examples for different blockchain nodes, with infrastructure configurations that fit different scenarios. Templates and recommendations are now available for popular blockchains such as Ethereum, with more coming soon, and everyone is welcome to join the effort and build with us. In addition to being ready to deploy, the AWS Cloud Development Kit (AWS CDK) applications let you customize the infrastructure to suit your needs when you set up and run the nodes. You are also welcome to publish your own results and recommendations for the benefit of the wider community.

AWS Marketplace

AWS Marketplace is a curated digital catalog that makes it easy for organizations to discover, procure, entitle, provision, and govern third-party software.

Many Web3 customers, including L1 and L2 customers, have created AWS Marketplace listings for different node types and professional services offerings. The following is a filtered list of the Marketplace listings for blockchain customers. Some examples of node deployments include:

Ethereum Nodes (this is a filtered list of providers with Ethereum node types listed in AWS Marketplace)
Avalanche
Zilliqa
CasperLabs
Algorand Blockchain
Celo Blockchain Full Node
XRP Ledger Node Server
Fantom Blockchain node (this includes fulfilment options for both Read Only nodes and API nodes)
Bitcoin nodes; the following are example listings of Bitcoin full nodes:
- Syntactic Engineering
- Techlatest.net

Note that you can use Scale3 Autopilot for logging, monitoring, and observability for any nodes spun up on AWS. Scale3 is a Web3 infrastructure company that builds developer tools. Their Autopilot platform is an end-to-end observability platform for nodes and validators, including nodes deployed through AWS Marketplace or any other deployment option. As of this writing, they support over 12 networks, including Sui, Ethereum, Avalanche, Polygon, Base, NEAR, Solana, Flow, Harmony, Aptos, Cosmos, and Filecoin. Scale3 is also continuing to expand their support for additional blockchain networks. Sign up through the Scale3 AWS Marketplace offering today to get started.

If you’re an L1 or L2 customer wishing to create an AWS Marketplace listing, then you’ll first need to sign up and register on the AWS Partner Network (APN) website, then select an AWS Partner Path and pay the annual fee. Then you can sign up for AWS Marketplace in order to create your listing. Refer to the APN and AWS Marketplace portals for detailed instructions on the signup and registration process. Connect with your AWS Account Manager to learn more about the processes and benefits.

Self-managed deployment

If you can’t find a node variant in the AWS offerings or prefer a more hands-on, tailored approach, Amazon EC2 provides the flexibility to configure and oversee your node.

This requires the following:

Consulting the official documentation provided by the respective L1 or L2 organization, often available on their official website or community channels like Discord.
Assessing and determining the computational and storage resources necessary for your nodes. For insights into this process, refer to the blog post: Choose AWS Graviton and cloud storage for your Ethereum nodes infrastructure on AWS.
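
For the storage part of that assessment, a back-of-the-envelope projection is often a reasonable starting point. The following Python sketch shows the arithmetic only. The function name and every number in it are placeholders rather than measurements of any real chain, so take the current size and growth rate from the protocol's documentation or from your own monitoring.

```python
def required_volume_gib(
    current_gib: int,
    monthly_growth_gib: int,
    months: int,
    headroom_percent: int = 20,
) -> int:
    """Project chain data size and add headroom, rounded up to whole GiB."""
    projected = current_gib + monthly_growth_gib * months
    total_hundredths = projected * (100 + headroom_percent)
    # Integer ceiling division keeps the result exact for integer inputs.
    return (total_hundredths + 99) // 100


# Placeholder inputs: 1,000 GiB today, growing 50 GiB per month, sized for a year.
volume_size = required_volume_gib(current_gib=1000, monthly_growth_gib=50, months=12)
```

Here the projection is 1,000 + 50 × 12 = 1,600 GiB, and the default 20 percent headroom brings the volume to 1,920 GiB. Compute requirements such as CPU, memory, and disk throughput still need to come from the protocol's own guidance.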

As an alternative to a fully self-managed approach, third-party solutions like Scale3’s NodePilot can facilitate and streamline node setup and upkeep. Scale3 NodePilot is a tool that simplifies node hosting on your own infrastructure. As of this writing, NodePilot supports six different blockchain networks, including Sui, Ethereum, Avalanche, Polygon, Flow, and Aptos. With NodePilot, you can do the following:

Run a multi-cloud or on-premises compute strategy
Run nodes on protocols outside of what’s available via a managed service powered by a cloud provider
Own and manage your own nodes without hitting any rate limits or data availability problems
Have more control over mempool data, on-chain data, and customization capabilities of the node
Co-locate nodes with your data indexing instances

Sign up through the Scale3 AWS Marketplace offering today to get started. To learn more about NodePilot, refer to Managing Blockchain Nodes using NodePilot and reach out to Scale3 to request access.

Managed services

Blockchain-as-a-service enterprises, known as third-party node providers, handle the intricate operations essential for maintaining the blockchain network. They allocate core resources and employ cutting-edge technologies to establish and sustain blockchain nodes. By utilizing these services, you can route your requests to a provider’s online node rather than a local setup. This ensures access to continuously synchronized, current nodes anytime, anywhere.

Such services are particularly beneficial for DApp developers, especially during their initial launch phase.

Case study: Ethereum

If you are working with the Ethereum Layer 1 protocol, your node deployment options depend on your persona. The following is a breakdown of the deployment options for each persona:

- DApp Builder: Options include AMB Ethereum, third-party node providers, or EC2-based self-managed nodes.
- Enterprise Customer (such as Analyst): Choices range between AMB public Ethereum and EC2-based self-managed nodes.
- Ethereum Validator (Solo-Staking): Options span marketplace offerings like Launchnodes or EC2-based self-managed nodes.
- Ethereum Foundation: Typically prefers to run EC2-based self-managed nodes.

Conclusion

In this post, we explored blockchain node deployment on AWS. We started with the essential role of nodes in blockchain networks and their significance in enhancing decentralization and robustness. We then described the range of stakeholders, from individuals to enterprise clients, and emphasized that terminology differs across L1 and L2 blockchain protocols, which makes consulting protocol-specific documentation important. Next, we looked at the different node types, their potential use cases, and how to deploy each on AWS. Finally, we showcased the node deployment options available to different personas working with Ethereum.

Now, with this knowledge in hand, you are primed to undertake your blockchain node deployment on AWS.

About the Authors

Ian Holtz is a certified AWS Solutions Architect who leads the Web3 & AI AWS startups team for the Asia Pacific & Japan regions. He co-creates technical go-to-market strategies and supports the construction of well-architected, scalable, and cost-effective technical deployments.

James Burdon is a Senior Blockchain Specialist Solutions Architect at AWS, focused on helping Web3 startups. James has over 25 years of IT consultancy experience and has been helping startups running on AWS for over 6 years.

Yue Ning is a Senior Enterprise Solutions Architect with Amazon Web Services in Southern California. She has over ten years of experience in both application development and cloud platform development. She is also a passionate advocate for women in the blockchain space.

Kristian Chartier is a Lab Developer based in Canada. He helps customers understand how generative AI and DevOps can transform the ways in which they create, test, and deploy applications.