Guidance for Cross-Chain Analytics using Bitcoin and Ethereum Open Data on AWS
Overview
How it works
These technical details feature an architecture diagram to illustrate how to effectively use this solution. The architecture diagram shows the key components and their interactions, providing an overview of the architecture's structure and functionality step-by-step.
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.
Operational Excellence
With Amazon Managed Blockchain, you can deploy Ethereum full nodes that connect to public testnets and the Ethereum mainnet in a matter of minutes. In contrast, self-hosted Ethereum nodes can take 24-36 hours to deploy and sync. This Guidance builds observability into the architecture with process-level metrics, logs, and dashboards. Extend these mechanisms to fit your needs, and create alarms in Amazon CloudWatch to inform your on-call team of any issues. Finally, you can automate the deployment of this Guidance with infrastructure as code frameworks such as AWS Cloud Development Kit (AWS CDK) or AWS CloudFormation.
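As a minimal sketch of the alarm step, the helper below builds the keyword arguments that the CloudWatch `put_metric_alarm` API accepts. The namespace `CrossChainAnalytics`, the metric `BlocksBehindHead`, the `NodeId` dimension, and the threshold are hypothetical names and values chosen for illustration; this Guidance does not specify them, so substitute the metrics your deployment actually publishes.

```python
from typing import Any


def build_sync_lag_alarm(
    node_id: str, sns_topic_arn: str, max_lag_blocks: int = 50
) -> dict[str, Any]:
    """Return keyword arguments for CloudWatch put_metric_alarm.

    The namespace, metric name, and dimension below are hypothetical;
    replace them with the metrics your own deployment emits.
    """
    return {
        "AlarmName": f"ethereum-node-{node_id}-sync-lag",
        "Namespace": "CrossChainAnalytics",  # hypothetical custom namespace
        "MetricName": "BlocksBehindHead",    # hypothetical custom metric
        "Dimensions": [{"Name": "NodeId", "Value": node_id}],
        "Statistic": "Maximum",
        "Period": 300,                       # evaluate in 5-minute windows
        "EvaluationPeriods": 3,              # alarm after 3 consecutive breaches
        "Threshold": float(max_lag_blocks),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "breaching",     # missing data counts as a problem
        "AlarmActions": [sns_topic_arn],     # SNS topic that notifies on-call
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`. Keeping the parameters in a pure function makes the alarm definition easy to review and unit test before anything is created in the account.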
Security
This Guidance uses role-based access with AWS Identity and Access Management (IAM). The Amazon S3 bucket has encryption enabled, is private, and blocks public access. All roles are defined with least-privilege access, and all communications between services stay within the customer account. Administrators can control access to the Jupyter notebook, SageMaker, Amazon Redshift, Athena, and QuickSight through IAM roles.
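The bucket settings and least-privilege access described above could look roughly like the following sketch. It only builds the configuration documents; the bucket name, the prefix, and the choice of SSE-S3 (`AES256`) as the encryption algorithm are assumptions for illustration, since this Guidance does not state which encryption type or paths it uses.

```python
from typing import Any


def public_access_block() -> dict[str, bool]:
    """Settings that block every form of public access to a bucket."""
    return {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }


def default_encryption() -> dict[str, Any]:
    """Default server-side encryption rule (SSE-S3 is an assumption here)."""
    return {
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    }


def read_only_policy(bucket: str, prefix: str) -> dict[str, Any]:
    """IAM policy granting read-only access to a single prefix in a bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Restrict listing to keys under the given prefix
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }
```

With boto3, these documents would typically be applied through `put_public_access_block(Bucket=..., PublicAccessBlockConfiguration=...)` and `put_bucket_encryption(Bucket=..., ServerSideEncryptionConfiguration=...)` on the S3 client, and the policy attached to an analyst role after serializing it with `json.dumps`.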
Reliability
Several components in the architecture, such as the Managed Blockchain Ethereum nodes, are deployed across multiple Availability Zones. The serverless components, such as Fargate, are highly available by design and scale automatically to accommodate demand.
Performance Efficiency
This Guidance uses serverless technologies, which provide built-in fault tolerance and continuous scaling. Serverless services also allow for comparative testing against varying load levels and minimize undifferentiated tasks like capacity provisioning and patching, so you can focus on business needs rather than server management. Further, you can enable auto scaling for AWS Glue, which automatically adds and removes workers from the cluster depending on the parallelism at each stage of the job run. Similarly, Amazon S3 automatically scales to meet high request rates. There are no limits to the number of prefixes in a bucket, and you can increase read or write performance by parallelizing requests across prefixes.
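To illustrate parallelizing requests across prefixes, here is a minimal sketch that fans work out over a thread pool, one task per prefix. The `fetch` callable is an assumption: in practice it might wrap an S3 list or get call, while `fake_list_keys` and the prefix layout shown are hypothetical stand-ins so the example runs without AWS access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def read_prefixes_in_parallel(
    prefixes: Iterable[str],
    fetch: Callable[[str], list[str]],
    max_workers: int = 8,
) -> dict[str, list[str]]:
    """Call fetch(prefix) for each prefix concurrently and collect the results."""
    prefix_list = list(prefixes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input prefixes
        results = pool.map(fetch, prefix_list)
        return dict(zip(prefix_list, results))


def fake_list_keys(prefix: str) -> list[str]:
    """Hypothetical stand-in for an S3 listing call; returns two fake keys."""
    return [f"{prefix}part-0000.parquet", f"{prefix}part-0001.parquet"]


# Hypothetical date-partitioned prefixes, one request stream per prefix
listing = read_prefixes_in_parallel(
    ["eth/blocks/date=2023-01-01/", "eth/blocks/date=2023-01-02/"],
    fake_list_keys,
)
```

Threads suit this workload because the tasks are I/O bound. Spreading requests over distinct prefixes is what lets the aggregate request rate grow, so the partitioning scheme matters as much as the worker count.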
Cost Optimization
By using the AWS Glue serverless computing platform for ETL and Athena for serverless queries, you pay only for the resources you use. To further optimize cost, you can use the Amazon S3 Intelligent-Tiering storage class, which automatically moves your content to the most cost-effective access tier based on how frequently it is accessed.
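One way to adopt Intelligent-Tiering for existing data is a lifecycle rule that transitions objects into that storage class. The sketch below builds such a rule; the rule ID and the choice to transition immediately (`Days: 0`) are assumptions for illustration, not settings prescribed by this Guidance.

```python
from typing import Any


def intelligent_tiering_lifecycle(prefix: str = "") -> dict[str, Any]:
    """Lifecycle configuration that moves objects to S3 Intelligent-Tiering.

    An empty prefix applies the rule to every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "move-to-intelligent-tiering",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    # Days=0 requests the transition as soon as the rule allows
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    }
```

With boto3, this document would be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call. New objects can also be written directly to the storage class by setting `StorageClass="INTELLIGENT_TIERING"` at upload time.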
Sustainability
By using managed services such as Fargate and AWS Glue, we minimize the environmental impact of the backend services. Furthermore, the public Ethereum blockchain shifted from the proof-of-work to the proof-of-stake consensus mechanism in September 2022, reducing Ethereum’s energy consumption by ~99.95 percent.*
*The Merge, Ethereum, April 19, 2023.
Implementation Resources
The sample code is a starting point. It is industry validated, prescriptive but not definitive, and a peek under the hood to help you begin.