Overview
This Guidance demonstrates how to migrate tabular data from Amazon Simple Storage Service (Amazon S3) general purpose buckets to Amazon S3 Tables, purpose-built storage for tabular data. S3 Tables introduces a new bucket type, S3 table bucket, that stores fully managed Apache Iceberg tables to deliver up to three times faster query performance and up to ten times higher transactions per second compared to storing Iceberg tables in Amazon S3 general purpose buckets.
The Guidance sets up an automated migration process, built on AWS Step Functions and Amazon EMR with Apache Spark, that moves Apache Iceberg and Apache Hive tables registered in the AWS Glue Data Catalog from Amazon S3 general purpose buckets to Amazon S3 table buckets. Because S3 table buckets have built-in support for Apache Iceberg, you can query the migrated data with popular query engines including Amazon Athena, Amazon Redshift, and Apache Spark.
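As a rough illustration of the Spark side of such a migration, the sketch below builds the Spark catalog settings for an S3 table bucket and the SQL statements that copy one source table into it. It is not the Guidance's actual implementation. The helper names (`s3_tables_catalog_conf`, `build_migration_sql`), the catalog name, the ARN, and the table names are hypothetical, and the catalog class names follow the Amazon S3 Tables Catalog for Apache Iceberg as commonly documented; verify them against the current AWS documentation before use.

```python
from typing import Dict, List


def s3_tables_catalog_conf(catalog_name: str, table_bucket_arn: str) -> Dict[str, str]:
    """Build Spark settings that register an S3 table bucket as an Iceberg catalog.

    Assumption: the class names below match the S3 Tables catalog library
    available on the cluster; check them against the AWS documentation.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.catalog-impl": "software.amazon.s3tables.iceberg.S3TablesCatalog",
        f"{prefix}.warehouse": table_bucket_arn,
    }


def build_migration_sql(source_table: str, catalog_name: str,
                        namespace: str, table: str) -> List[str]:
    """Return the Spark SQL statements that copy one table into the table bucket.

    source_table is the fully qualified name of the Hive or Iceberg table as
    Spark sees it through the Glue Data Catalog (hypothetical example below).
    """
    target = f"{catalog_name}.{namespace}.{table}"
    return [
        f"CREATE NAMESPACE IF NOT EXISTS {catalog_name}.{namespace}",
        # CREATE TABLE ... AS SELECT writes the source rows as a new Iceberg table.
        f"CREATE TABLE IF NOT EXISTS {target} USING iceberg "
        f"AS SELECT * FROM {source_table}",
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    conf = s3_tables_catalog_conf(
        "s3tables", "arn:aws:s3tables:us-east-1:111122223333:bucket/example-bucket"
    )
    for key, value in conf.items():
        print(f"--conf {key}={value}")
    for statement in build_migration_sql(
        "spark_catalog.sales_db.orders", "s3tables", "sales_db", "orders"
    ):
        print(statement)
```

In a real job, the settings would be passed to the Spark session on Amazon EMR and each statement run with `spark.sql(...)`. A Step Functions state machine could then submit one such job per table.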
How it works
The architecture diagram illustrates how to use this solution effectively. It shows the key components and their interactions, walking through the architecture's structure and functionality step by step.
Get Started
Well-Architected Pillars
The architecture diagram above is an example of a Solution created with Well-Architected best practices in mind. To be fully Well-Architected, you should follow as many Well-Architected best practices as possible.