AWS Storage Blog

Category: Storage

lakeFS and Amazon S3 Express One Zone: Highly performant data version control for ML/AI

Machine learning presents a number of new challenges to data teams, calling for technology solutions that can support performance-critical training and fine-tuning workloads. Data version control is one facet of high-performing ML pipelines, as it allows efficient experimentation and full ML pipeline reproducibility at scale. lakeFS by Treeverse, an AWS […]

ClickHouse Cloud & Amazon S3 Express One Zone: Making a blazing fast analytical database even faster

ClickHouse is a columnar database management system (DBMS) designed for blazing-fast real-time analytics. It was built to address the needs of interactive analytical applications requiring up-to-the-second analytics. To do that, it must support real-time data ingestion at the rate of hundreds of millions of events per second and run complex analytical queries, such as filtering, […]

Streamline data sharing and access control with Informatica Cloud Data Marketplace and Amazon S3 Access Grants

Organizations are modernizing their data lakes on Amazon Simple Storage Service (Amazon S3) to handle the ever-growing data volume and speed while meeting the demands of analytics, machine learning (ML), artificial intelligence (AI), and generative AI applications. To enable a data-driven culture and remain innovative, the data platform must allow for data-centric collaboration across business […]

How to develop a user-facing data application with IAM Identity Center and S3 Access Grants (Part 2)

This post is Part 2 of a two-part blog post series that will take you, an application developer, through the process of configuring and developing a data application that authenticates users with Microsoft Entra ID and then uses S3 Access Grants to access data on those users’ behalf. Part 1 of this series gave an […]

How to develop a user-facing data application with IAM Identity Center and S3 Access Grants (Part 1)

This is Part 1 of a two-part blog series: Configuring the application. Here is Part 2: Developing the application. When we at AWS talk to our customers about their data lakes, they usually describe a desired access pattern in which users and groups from a corporate directory are granted access to datasets in Amazon Simple […]

Accelerate Amazon S3 throughput with the AWS Common Runtime

Data is at the center of every machine learning pipeline. Whether pre-training foundation models (FMs), fine-tuning FMs with business-specific data, or serving inference queries, every step of the machine learning lifecycle needs low-cost, high-performance data storage to keep compute resources busy and performing useful work. Customers use Amazon Simple Storage Service (Amazon S3) to store training data […]

How to enforce Amazon S3 Access Grants with Immuta

Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today have evolved to adopt a lake house architecture that combines the scalability and cost effectiveness of data lakes with the performance and ease of use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]

Scaling data access with Amazon S3 Access Grants

To adhere to the principle of least privilege, users define granular access to their Amazon Simple Storage Service (Amazon S3) data based on applications, personas, groups or organization units (OUs). This practice helps customers to mitigate the risk of unauthorized access, limiting potential damage in case of a security breach as employees only have access […]

Simplify querying your archive data in Amazon S3 with Amazon Athena

Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]

Use Amazon FSx for Lustre to share Amazon S3 data across accounts

As enterprises evolve their cloud governance practices, multiple teams working in separate accounts may need to share data. One team may oversee an enterprise data lake in one account, while a data science team develops a high-performance computing (HPC) use case in another account. Customers want to take advantage of low-cost object storage and be […]