Amazon Web Services
In this AWS re:Invent 2023 session, Netflix engineers Ashwin Kayyoor and Rakesh Veeramacheneni describe how they modernized Netflix's data lake using Apache Iceberg. They discuss the challenges of managing an exabyte-scale data warehouse and the transition from a Hive-based system to an Iceberg-only architecture. The presentation covers the custom tooling and ecosystem services built along the way, as well as features such as secure Iceberg tables and the Iceberg REST catalog. The speakers detail the migration process, including strategies to minimize data movement and user friction while ensuring business continuity. They also highlight the benefits of Iceberg, such as ACID transactions, a rich metadata layer, and improved query performance. The talk is useful for organizations looking to scale their data infrastructure and adopt open-source table formats for efficient data management.