AWS Storage Blog

Tag: Amazon S3 Data Lake

How to develop a user-facing data application with IAM Identity Center and S3 Access Grants (Part 2)

This post is Part 2 of a two-part blog post series that will take you, an application developer, through the process of configuring and developing a data application that authenticates users with Microsoft Entra ID and then uses S3 Access Grants to access data on those users’ behalf. Part 1 of this series gave an […]

How to develop a user-facing data application with IAM Identity Center and S3 Access Grants (Part 1)

This is Part 1 of a two-part blog series: Configuring the application. Here is Part 2: Developing the application. When we at AWS talk to our customers about their data lakes, they usually describe a desired access pattern in which users and groups from a corporate directory are granted access to datasets in Amazon Simple […]

How to enforce Amazon S3 Access Grants with Immuta

Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today have evolved to adopt a lake house architecture that combines the scalability and cost-effectiveness of data lakes with the performance and ease of use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]

Scaling data access with Amazon S3 Access Grants

To adhere to the principle of least privilege, users define granular access to their Amazon Simple Storage Service (Amazon S3) data based on applications, personas, groups or organization units (OUs). This practice helps customers to mitigate the risk of unauthorized access, limiting potential damage in case of a security breach as employees only have access […]

Best practices for data lake protection with AWS Backup

Data lakes, powered by Amazon Simple Storage Service (Amazon S3), provide organizations with the availability, agility, and flexibility required for modern analytics approaches to gain deeper insights. Protecting sensitive or business-critical information stored in these S3 buckets is a high priority for organizations. AWS Backup for Amazon S3 makes it easier to centrally automate the […]

How Arc XP lowered data transfer costs by $500k per year with Amazon CloudFront and Lambda@Edge on AWS

The Washington Post, an American daily newspaper company, delivers digital news content using Arc XP’s digital experience platform. Arc XP originated in The Post and has grown into a Software-as-a-Service (SaaS) business used by publishers, broadcasters, and brands to create, host, and monetize engaging content for over 1,500 websites globally. Photo Center is an Arc […]

Automatically archive and restore data with Amazon S3 Intelligent-Tiering

Customers of all sizes, in all industries, are using data lakes to transform data from a cost that must be managed into a business asset. From time to time, data scientists and business analysts need to restore subsets of historical datasets for longitudinal studies, machine learning retraining, and more. However, users commonly write queries that don’t […]

See what’s in store for Amazon S3 at AWS re:Invent 2020-2021

UPDATE 9/8/2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. This time last year, the AWS Storage services and product marketing teams were entrenched in Las Vegas feverishly putting the final touches on content for re:Invent 2019 launches, sessions, workshops, and building makeshift workstations in a hotel ballroom for the biggest […]

How Zalando built its data lake on Amazon S3

Founded in 2008, Zalando is Europe’s leading online platform for fashion and lifestyle with over 32 million active customers. I am a lead data engineer at Zalando and a steady contributor to the company’s cloud journey. In this blog post, I cover how Amazon Simple Storage Service (Amazon S3) became a cornerstone of the data […]

Migrate HDFS files to an Amazon S3 data lake with AWS Snowball Edge

The need to store newly connected data grows as the sources of data increase. Enterprise customers use Hadoop Distributed File System (HDFS) as their data lake storage repository for on-premises Hadoop applications. Customers are migrating their data lakes to AWS for a more secure, scalable, agile, and cost-effective solution. For HDFS migrations where high-speed transfer […]