AWS Storage Blog

Author: Aritra Gupta

Aritra Gupta is a Senior Technical Product Manager on the Amazon S3 team at Amazon Web Services. He helps customers build and scale data lakes. Based in Seattle, he likes to play chess and badminton in his spare time.

Connect Snowflake to S3 Tables using the SageMaker Lakehouse Iceberg REST endpoint

Organizations today seek data analytics solutions that provide maximum flexibility and accessibility. Customers need their data to be readily available to their preferred query engines, and they need to break down barriers across different computing environments. At the same time, they want a single copy of data to be used across these solutions, to track lineage, be cost […]

Build a managed transactional data lake with Amazon S3 Tables

UPDATE (12/19/2024): Added guidance for Amazon EMR setup. Customers commonly use Apache Iceberg today to manage ever-growing volumes of data. Apache Iceberg’s relational database transaction capabilities (ACID transactions) help customers deal with frequent updates, deletions, and the need for transactional consistency across datasets. However, getting the most out of Apache Iceberg tables and running it […]

Optimizing performance of Apache Spark workloads on Amazon S3

This blog covers performance metrics, optimizations, and configuration tuning specific to OSS Spark running on Amazon EKS. For customers using or considering Amazon EMR on EKS, refer to the service documentation to get started and this blog post for the latest performance benchmark. Performance is top of mind for customers running streaming, extract transform load […]
