AWS Storage Blog

Category: Analytics

Derive insights from AWS DataSync task reports using AWS Glue, Amazon Athena, and Amazon QuickSight

Update (10/30/2024): On October 30, 2024, AWS DataSync launched Enhanced mode tasks, prompting updates to this blog. Updates include a new step in the “Step 2: Populate Glue catalog with task reports data using a Glue crawler” section and detailed information on the new capabilities in “Updated steps for working with task reports of new […]

Access a point in time with Amazon S3 Object Lambda

Point-in-time ‘snapshots’ enable administrators, developers, testers, and end users to quickly access a storage volume or share as it was at an earlier point in time. They are a longstanding approach to data protection and recovery, tracking changes within a storage system to reduce both Recovery Point Objective (RPO) and Recovery Time Objective (RTO). However, traditional snapshots […]

How Vivian Health is using Amazon S3 Express One Zone to accelerate healthcare hiring

Vivian Health connects travel nurses with job opportunities across the country. To do that, the platform has innovated not just the job search itself, but also the tooling used by recruiters and hiring managers to get qualified candidates matched to the right job and placed as quickly and as seamlessly as possible. However, the process […]

Use generative AI to query your Amazon S3 data lake for insights

Businesses store large volumes of data in their data lakes and rely on this data to extract insights and make important business decisions. However, business stakeholders sometimes lack the technical skills required to run complex queries against their data lakes. Instead, they rely on data scientists or analysts to build reports and dashboards or to […]

Streamline and automate compliance monitoring and reporting with AWS Backup Audit Manager

Organizations meet business and regulatory requirements by having visibility and control over backup environments. You want a streamlined solution to continuously monitor, detect, and track policy drifts across your backup deployments at scale. This need is driven by the growing complexity of AWS environments, the proliferation of data across diverse AWS services and regions, and […]

Siemens builds Datalake2Go on AWS to analyze disparate data globally

Siemens is a technology company focused on industry, infrastructure, transport, and healthcare. From resource-efficient factories, resilient supply chains, and smart buildings and grids, to cleaner and more comfortable transportation and advanced healthcare, the company creates technology with purpose, adding real value for its customers. Siemens technology is everywhere, supporting the critical infrastructure and vital industries […]

Maintaining object immutability by automatically extending Amazon S3 Object Lock retention periods

Protecting against accidental or malicious deletion is a key element of data protection. Immutability protects data in-place, preventing unintended changes or deletions. However, sometimes it isn’t clear for how long data should be made immutable. Users in this situation are looking for a solution that maintains short-term immutability, indefinitely. They want to make sure their […]

Understand Amazon S3 data transfer costs by classifying requests with Amazon Athena

Cost is top of mind for many enterprises, and building awareness of different cost contributors is the first step toward managing costs and improving efficiency. Data transfer costs may fall into two groups: common but low-cost transfers, and less frequent but higher-cost ones. Data about these two groups is mixed together, and separating them enables […]

Managing duplicate objects in Amazon S3

When managing a large volume of data in a storage system, it is common for data duplication to happen. Data duplication in data management refers to the presence of multiple copies of the same data within your system, leading to additional storage usage as well as extra overhead when handling multiple copies of the same […]

Automatic monitoring of actions taken on objects in Amazon S3

Administrators may need to monitor and audit actions, like uploads, updates, and deletes, taken on files and other data to comply with regulations or company policies. A scalable and reliable method of tracking and saving actions taken on files can reduce manual work and operational overhead while helping to ensure compliance. An event-based fanout architectures […]