AWS Storage Blog

Category: Intermediate (200)

lakeFS and Amazon S3 Express One Zone: Highly performant data version control for ML/AI

Machine learning presents a number of new challenges to data teams, calling for technology solutions that can support performance-critical training and fine-tuning workloads. Data version control is one facet of high-performing ML pipelines, as it allows efficient experimentation and full ML pipeline reproducibility at scale. lakeFS by Treeverse, an AWS […]

ClickHouse Cloud & Amazon S3 Express One Zone: Making a blazing fast analytical database even faster

ClickHouse is a columnar database management system (DBMS) designed for blazing-fast real-time analytics. It was built to address the needs of interactive analytical applications requiring up-to-the-second analytics. To do that, it must support real-time data ingestion at the rate of hundreds of millions of events per second and run complex analytical queries, such as filtering, […]

Streamline data sharing and access control with Informatica Cloud Data Marketplace and Amazon S3 Access Grants

Organizations are modernizing their data lakes on Amazon Simple Storage Service (Amazon S3) to handle the ever-growing data volume and speed while meeting the demands of analytics, machine learning (ML), artificial intelligence (AI), and generative AI applications. To enable a data-driven culture and remain innovative, the data platform must allow for data-centric collaboration across business […]

How to enforce Amazon S3 Access Grants with Immuta

Amazon Simple Storage Service (Amazon S3) is the most popular object storage platform for modern data lakes. Organizations today have evolved to adopt a lake house architecture that combines the scalability and cost effectiveness of data lakes with the performance and ease of use of data warehouses. Likewise, Amazon S3 plays an increasingly important role as the foundational […]

Simplify querying your archive data in Amazon S3 with Amazon Athena

Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]

Migrating Wasabi Object Storage to Amazon S3 using AWS DataSync

Many organizations find themselves faced with the task of transferring a substantial amount of object data between cloud service providers. The scenarios behind such transfers include data consolidation, workload migration, acquisition, and cost optimization efforts. Achieving a successful migration involves several crucial components, including comprehensive data encryption during transfer, the ability […]

How PingCAP transformed TiDB into a serverless DBaaS using Amazon S3 and Amazon EBS

PingCAP, an AWS Partner Network (APN) Partner, is the company behind TiDB, an advanced open-source, distributed SQL database for building modern applications. TiDB is widely used and trusted by technologists around the world. In July 2023, PingCAP released TiDB Serverless, a fully managed, autonomous DBaaS offering of TiDB. However, based on TiDB’s existing architecture, PingCAP […]

Optimizing electronic health care records at scale with Amazon FSx for NetApp ONTAP

Electronic Healthcare Records (EHR) applications are approaching a 40 billion dollar market size with a high compound annual growth rate. While continuing to focus on enabling innovative healthcare, EHR consumers can benefit from adopting cloud-based approaches that reduce operational burden, management overhead, capital outlay, and total cost of ownership. EHR deployments are complex in […]

How to monitor Amazon Elastic File System (EFS) storage costs

Every organization should seek to optimize resource utilization, aiming for the highest efficacy at the lowest possible cost. To make effective data-driven cost optimization decisions, it’s essential to have relevant data, tools for generating reports, and clear documentation on how to use them. Amazon Elastic File System (Amazon EFS) offers serverless, scalable, and fully […]

Transferring data from Google Cloud Filestore to Amazon EFS using AWS DataSync

Organizations may need to transfer large numbers of files from one cloud provider to another for a variety of reasons, such as workload migration, disaster recovery, or a requirement to process data in other clouds. Data transfers typically require end-to-end encryption, the ability to detect changes, object validation, network throttling, monitoring, and cost optimization. Building a […]