Event details
Monday, March 3, 2025 - Tuesday, March 4, 2025
21:00 - 03:00 GMT
Startup Data Analytics: Transactional Data Lakes with Apache Iceberg
IN PERSON
English
300 - Advanced
Apache Iceberg is an open table format for storing large data sets on Amazon S3. It is a foundational tool for building transactional data lakes on S3. Iceberg supports ACID transactions, compaction, and schema evolution. Query engines such as Athena, Spark, and Trino can query Iceberg tables natively using SQL.
Apache Iceberg is popular with startups because it takes away the undifferentiated heavy lifting associated with managing data in S3. In particular, it handles partitioning and compaction, a common source of pain (and cost). This becomes especially relevant for event stream data (either CDC from databases or IoT event streams), as Iceberg can add records to a table quickly and automatically while also mitigating the cost impact of ‘lots of small files’ on S3.
In this Immersion Day we will dive deep into Iceberg and how it can be used on AWS to run analytics on your data scalably and cost-effectively. The day will combine presentations and hands-on workshops.
This Immersion Day will be useful if you are running analytics workloads on large columnar-style datasets or streaming events for analytics (e.g. IoT).
Please bring your laptop to participate, and remember to bring photo ID for entry.
Agenda (subject to change)
9:00 PM UTC
Iceberg introduction: transactional data lake concepts and use cases
9:30 PM UTC
Workshop: Creating a transactional Iceberg data lake with AWS Glue and Amazon Athena
10:15 PM UTC
Transformation with AWS Glue, and Iceberg maintenance
11:30 PM UTC
Workshop: Transformation with AWS Glue, and AWS Schema features
1:00 AM UTC
CDC/Eventing to Iceberg with Amazon Kinesis Firehose
1:15 AM UTC
Workshop: Amazon Kinesis Firehose as a source for Iceberg
2:00 AM UTC
Demo and wrap up