AWS Storage Blog
Tag: Amazon Simple Storage Service (Amazon S3)
Analytical processing of millions of cell images using Amazon EFS and Amazon S3
Analytical workloads such as batch processing, high performance computing, or machine learning inference often have high IOPS and low latency requirements but operate at irregular intervals on subsets of large datasets. Typically, data is manually copied between storage tiers in preparation of processing, which can be cumbersome and error-prone. Given this, IT teams want to […]
Allowing external users to securely and directly upload files to Amazon S3
Organizations are often required to store files, images, and other digital assets in a repository. In many cases, the source of these files are partners or individuals who are not connected to internal systems and requires corporate authentication in order to upload the files. Customers traditionally use servers to handle file uploads, which can use […]
How Trend Micro uses Amazon S3 Object Lambda to help keep sensitive data secure
Does your application handle data that is uploaded by hundreds of thousands of end users? Is that same underlying data then shared across the same magnitude of users? Being able to scan data for malware before it’s returned to an application helps keep sensitive data secure, provides protection regardless of when the data was initially […]
Compressing and archiving logs to the Amazon S3 Glacier storage classes
In distributed architectures, there is often a need to preserve application logs, and for AWS customers preservation is often done via an Amazon S3 bucket. The logs may contain information on runtime transactions, error/failure states, or application metrics and statistics. These logs are later used in business intelligence to provide useful insights and generate dashboards, […]
Using AWS DataSync to move data from Hadoop to Amazon S3
You want to leverage cloud scalability, increase cost efficiency by paying only for utilized storage, decouple big data storage from processing, and increase capabilities for data analytics and machine learning using AWS. But how do you move your Hadoop cluster? To accelerate this transition, AWS DataSync recently launched support for moving data between Hadoop Distributed […]
A gene-editing prediction engine with iterative learning cycles built on AWS
NRGene develops cutting-edge genomic analytics products that are reshaping agriculture worldwide. Among our customers are some of the biggest and most sophisticated companies in seed-development, food and beverages, paper, rubber, cannabis, and more. In the middle of 2020, NRGene joined a consortium of companies and academic institutions to build the best-in-class gene-editing prediction platform to […]
MemQ by Pinterest: An efficient, scalable, cloud-native publish/subscribe system
The Logging Platform at Pinterest powers all data ingestion and transportation at Pinterest. At the heart of the Pinterest Logging Platform are distributed pub/sub systems that help our customers transport, buffer, and consume data asynchronously. Pub/sub messaging, is a form of asynchronous service-to-service communication used in serverless and microservices architectures. In a pub/sub model, any […]
How to securely share application log files with third parties
What do we do when our applications fail, and we must provide instance-level log data to external entities for troubleshooting purposes? It’s best to limit direct human interaction with our production resources, so we often see temporary access provided for a fixed period. For highly regulated industries, the approval process for production access can be […]
Modernizing NASCAR’s multi-PB media archive at speed with AWS Storage
The National Association for Stock Car Auto Racing (NASCAR) is the sanctioning body for the No. 1 form of motor sports in the United States, and owns 15 of the nation’s major motorsports entertainment facilities. About 15 years ago NASCAR began to collect all the video, audio, and image assets from over the last 70+ […]
Considering four different replication options for data in Amazon S3
UPDATE (2/10/2022): Amazon S3 Batch Replication launched on 2/8/2022, allowing you to replicate existing S3 objects and synchronize your S3 buckets. See the S3 User Guide for additional details. As your business grows and accumulates more data over time, you may need to replicate data from one system to another, perhaps because of company security […]