Amazon S3 Tables

Optimize query performance and cost as your data lake scales

Store tabular data at scale in S3

Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support, streamlining tabular data storage at scale. Continual table optimization automatically scans and rewrites table data in the background, achieving up to 3x faster query performance compared to unmanaged Iceberg tables, and these performance optimizations will continue to improve over time. Additionally, S3 Tables include optimizations specific to Iceberg workloads that deliver up to 10x higher transactions per second compared to Iceberg tables stored in general purpose S3 buckets. For more details on S3 Tables’ query performance improvements, refer to the blog.

With S3 Tables’ support for the Apache Iceberg standard, your tabular data can be easily queried by popular AWS and third-party query engines. Use S3 Tables to store tabular data, such as daily purchase transactions, streaming sensor data, or ad impressions, as Iceberg tables in S3, and optimize performance and cost as your data evolves using automatic table maintenance. Read the blog to learn more.

Benefits

How it works

S3 Tables provide purpose-built S3 storage for structured data in the Apache Parquet format. Within a table bucket, you can create tables as first-class resources directly in S3. These tables can be secured with table-level permissions defined in either identity- or resource-based policies, and they are accessible by any application or tooling that supports the Apache Iceberg standard. When you create a table in your table bucket, the underlying data is stored in S3 as Parquet, and S3 maintains the metadata necessary to make that data queryable by your applications. Table buckets include a client library that query engines use to navigate and update the Iceberg metadata of tables in your table bucket. This library, in conjunction with updated S3 APIs for table operations, allows multiple clients to safely read and write data to your tables. Over time, S3 automatically optimizes the underlying Parquet data by rewriting, or "compacting," your objects. Compaction improves query performance and minimizes costs. Read the user guide to learn more.
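As a rough sketch of the creation flow described above, the sequence below builds the control-plane requests for a table bucket, a namespace, and an Iceberg table. The bucket, namespace, table names, region, and account ID are all hypothetical, and the operation and parameter names mirror the S3 Tables API as exposed by the AWS SDKs (for example, boto3's `s3tables` client); this sketch only assembles the request payloads rather than calling AWS, so it runs without credentials. Consult the user guide for the authoritative API shapes.

```python
# Sketch of the S3 Tables creation flow: table bucket -> namespace -> table.
# All names and ARNs below are illustrative placeholders.

def table_creation_requests(bucket_name, namespace, table_name):
    """Build the (operation, parameters) sequence for creating an
    Iceberg table inside an S3 table bucket."""
    # Hypothetical ARN; in practice the create_table_bucket response returns it.
    bucket_arn = f"arn:aws:s3tables:us-east-1:111122223333:bucket/{bucket_name}"
    return [
        # 1. Create the table bucket -- the container for tables as
        #    first-class S3 resources.
        ("create_table_bucket", {"name": bucket_name}),
        # 2. Create a namespace to group related tables.
        ("create_namespace", {"tableBucketARN": bucket_arn,
                              "namespace": [namespace]}),
        # 3. Create the table itself in the Apache Iceberg format; S3 then
        #    maintains its metadata and compacts its Parquet data over time.
        ("create_table", {"tableBucketARN": bucket_arn,
                          "namespace": namespace,
                          "name": table_name,
                          "format": "ICEBERG"}),
    ]

for op, params in table_creation_requests("analytics-data", "sales",
                                          "daily_transactions"):
    print(op, params)
```

In a real application you would issue each request through an SDK client (e.g. `boto3.client("s3tables")`) and read the bucket ARN from the first response, then query the resulting table through any Iceberg-compatible engine.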

Customers

  • Genesys

    Genesys is a global cloud leader in AI-Powered Experience Orchestration. Through advanced AI, digital and workforce engagement management capabilities, Genesys helps more than 8,000 organizations in over 100 countries to provide personalized, empathetic customer and employee experiences while benefiting from improved business agility and outcomes.

    Amazon S3 Tables will be a transformative addition to our data architecture, especially with its managed Iceberg support, which effectively creates a materialized view layer for diverse data analysis needs. This offering has the potential to help Genesys simplify complex data workflows by eliminating extra layers of table management, with S3 handling key maintenance tasks like compaction, snapshot management, and unreferenced file cleanup automatically. The ability to read and write Iceberg Tables directly from S3 will help us boost performance and create new possibilities for integrating data seamlessly across our analytics ecosystem. This interoperability, combined with the performance enhancements, positions S3 Tables as a pivotal part of our future strategy to deliver fast, flexible, and reliable data insights.

    Glenn Nethercutt, Chief Technology Officer - Genesys
  • Pendulum

    Pendulum is a Brand Intelligence platform that has the world's most comprehensive coverage across video, audio, and text content to proactively identify risks and opportunities, enabling better decision-making and monitoring analytics across the enterprise.

    At Pendulum Intelligence, we analyze data from hundreds of millions of social channels and sources. Amazon S3 Tables has transformed how we manage our data lake, which processes thousands of hours of analyzed video and audio content while extracting context from images and other media in near real-time using our proprietary machine learning tools. By eliminating the burden of table management, including compaction, snapshots, and file cleanup, our team can focus on what matters most: deriving actionable insights from massive datasets. The seamless integration with our analytics stack—Amazon Athena, AWS Glue, and Amazon EMR—has significantly enhanced our ability to process complex data at scale.

    Abdurrahman Elbuni, Cloud Big Data Architect - Pendulum
  • SnapLogic

    SnapLogic is a pioneer in AI-led integration. The SnapLogic Platform for Generative Integration accelerates digital transformation across the enterprise to design, deploy, and manage AI agents and integrations that automate tasks, make real-time decisions, and integrate effortlessly into existing workflows.

    Amazon S3 Tables, with built-in Apache Iceberg support and AWS Analytics services integration, help companies optimize their data analytics costs while transforming how they use business data for analytics, compliance, and AI initiatives. By automating complex data management tasks and providing complete audit trails of data changes, teams can instantly analyze historical data, maintain regulatory compliance, and accelerate business insights while significantly reducing their technology costs.

    Dominic Wellington, Enterprise Architect - SnapLogic
  • Zus Health

    Zus is a shared health data platform designed to accelerate healthcare data interoperability by providing easy-to-use patient data via API, embedded components, and direct EHR integrations.

    As a healthcare company handling massive amounts of frequently changing patient data, we decided to invest in Apache Iceberg because it solves many pain points with Apache Hive around partitioning and automation, with the added benefit of wider interoperability. One of our biggest challenges with Iceberg has been understanding and managing table optimization. This is why we’re excited about S3 Tables and the managed optimization capabilities. Being able to offload the developer overhead of table maintenance will allow us to focus more on bringing high-quality data and valuable insights to our customers.

    Sonya Huang, Consulting Software Engineer - Zus Health

Partners and integrations

  • Daft

    Daft is a unified engine for data engineering, analytics, and ML/AI, written in Rust, that exposes both SQL and Python DataFrame interfaces as first-class citizens. Daft provides a snappy and delightful local interactive experience, while also seamlessly scaling to petabyte-scale distributed workloads.

    Amazon S3 Tables is the perfect complement to Daft’s support for Apache Iceberg. By leveraging its integrations with AWS Lake Formation and AWS Glue, we were able to effortlessly extend our existing Iceberg read and write capabilities to S3 Tables while taking advantage of its optimized performance. We look forward to the evolution of this new service, and we are excited to provide best-in-class S3 Tables support for the Python Data Engineering & ML/AI ecosystem.

    Sammy Sidhu, CEO & Co-Founder - Daft
  • Dremio

    Dremio is the intelligent lakehouse platform, accelerating AI and analytics by offering a market-leading SQL engine, an open, interoperable data catalog, and a secure, scalable, and simple-to-use platform. Our leadership in the Apache Iceberg, Apache Polaris (incubating), and Apache Arrow communities empowers organizations to build fully open, high-performance lakehouse architectures while maintaining flexibility and control—eliminating vendor lock-in.

    Dremio is pleased to support the general availability of Amazon S3 Tables. By supporting the Apache Iceberg REST Catalog (IRC) specification, S3 Tables ensure seamless interoperability with Dremio, enabling users to benefit from a high-performance SQL engine capable of querying Apache Iceberg tables managed in optimized S3 table buckets. This collaboration reinforces the importance of open standards in the lakehouse ecosystem, eliminating integration complexity and accelerating customer adoption. With Amazon S3 Tables and IRC support, organizations gain the flexibility and choice needed to build a unified lakehouse architecture in the AI era.

    James Rowland-Jones, VP, Product - Dremio
  • DuckDB Labs

    DuckDB Labs is the company founded by the creators of DuckDB, a popular universal data wrangling tool. The company employs the core contributors to the DuckDB system. DuckDB is Free and Open-Source software under the MIT license and is governed by the independent non-profit DuckDB Foundation. The DuckDB project makes fast analytical processing available for a wide audience through its ease-of-use and portability.

    Amazon S3 Tables aligns perfectly with DuckDB's vision for democratizing data analytics using open file formats. The collaboration between AWS and DuckDB Labs allows us to further extend Iceberg support in DuckDB and develop seamless integration with S3 Tables. We believe the shared batteries-included mentality of DuckDB and S3 Tables combines into a powerful analytics stack that can handle a wide range of workloads while maintaining an incredibly low barrier to entry.

    Hannes Mühleisen, Chief Executive Officer - DuckDB Labs
  • HighByte

    HighByte is an industrial software company addressing the data architecture and integration challenges faced by global manufacturers as they digitally transform. HighByte Intelligence Hub, the company’s proven Industrial DataOps software, provides modeled, ready-to-use data to AWS cloud services using a codeless interface to speed integration time and accelerate analytics.

    Amazon S3 Tables is a powerful new feature that optimizes the management, performance, and storage of tabular data for analytics workloads. HighByte Intelligence Hub’s direct integration with Amazon S3 Tables makes it easy for global manufacturers to build an open, transactional data lake for their industrial data. S3 Tables enable instant querying of raw Parquet data, allowing customers to send contextualized information from the edge to the cloud for immediate use without additional processing or transformations. This has a major impact on both performance and cost optimization for our mutual customers.

    Aron Semle, Chief Technology Officer - HighByte
  • PuppyGraph

    PuppyGraph is the first real-time, zero-ETL graph query engine, enabling data teams to query an existing lakehouse as a graph in minutes—without costly migration or maintenance. It scales to petabyte-sized datasets and executes complex multi-hop queries in seconds, powering use cases from fraud detection to cybersecurity and AI-driven insights.

    Amazon S3 has long been the foundation of modern data infrastructure, and the launch of S3 Tables marks a major milestone—bringing Apache Iceberg closer to becoming the universal standard for data and AI. This innovation allows organizations to leverage high-performance, open table formats on S3, enabling multi-engine analytics without data duplication. For PuppyGraph customers, it means they can now run real-time graph queries directly on their S3 data, maintaining fresh, scalable insights without the overhead of complex ETL. We’re excited to be part of this evolution, making graph analytics as seamless as the data itself.

    Weimo Liu, Co-founder & CEO - PuppyGraph
  • Snowflake

    Snowflake makes enterprise AI easy, connected, and trusted. Thousands of companies around the globe, including hundreds of the world’s largest, use Snowflake’s AI Data Cloud to share data, build applications, and power their business with AI.

    We are excited to bring the magic of Snowflake to Amazon S3 Tables. This collaboration enables Snowflake customers to seamlessly read and process data stored in S3 Tables using their existing Snowflake setups, eliminating the need for complex data migrations or duplications. By combining Snowflake’s world-class performance analytics capabilities with Amazon S3 Tables’ efficient storage of Apache Iceberg tables, organizations can easily query and analyze tabular data stored in Amazon S3.

    Rithesh Makkena, Global Director of Partner Solutions Engineering - Snowflake
  • Starburst

    Starburst powers the foundational data architecture needed by analytics, AI, and data applications. It uses a hybrid data lakehouse environment powered by Apache Iceberg to deliver access, collaboration, and governance at scale.

    We’re thrilled to see Amazon S3 introduce built-in support for Apache Iceberg with S3 Tables, advancing the Iceberg Open Data Lakehouse ecosystem. With S3 table buckets, we look forward to collaborating with AWS to help our joint customers bring the power of an Open Lakehouse, powered by optimized Trino, a leading open source MPP SQL engine, to their data in Amazon S3 across diverse analytics and AI use cases.

    Matt Fuller, Vice President, Product - Starburst
  • StreamNative

    StreamNative is a messaging and streaming platform that powers AI and analytics with cost-effective, high-performance data ingestion. StreamNative’s Ursa engine enables enterprises to reduce total cost of ownership (TCO) by 90% with Kafka compatibility, a leaderless architecture, and lakehouse-native storage, making AI-ready data accessible at scale.

    Our integration with Amazon S3 Tables makes real-time, AI-ready data more open and accessible than ever. Ursa’s leaderless architecture on S3 already reduces storage costs, and direct integration with S3 Tables further improves performance and efficiency. In an AI-driven world, data governance is crucial. At StreamNative, we’re committed to helping businesses reduce TCO by 90% while making it effortless and affordable to build AI-powered applications with governed, real-time data.

    Sijie Guo, CEO & Co-Founder - StreamNative