AWS Storage Blog
Category: Application Services
How London Stock Exchange Group migrated 30 PB of market data using AWS DataSync
London Stock Exchange Group (LSEG) has 30 PB of Tick History-PCAP data, which is ultra-high-quality global market data that is based on raw exchange data, timestamped to the nanosecond. An additional 60 TB is generated every day. As LSEG sought to migrate its data from Wasabi cloud storage, it was looking for a new solution to […]
Optimizing storage costs and query performance by compacting small objects
Applications produce log files that should be reliably stored for ad-hoc reporting, compliance, or auditing purposes. Over time, these collections of relatively small log files grow in volume, and cost-effective storage and data management become crucial. Querying the data in these files can also yield useful insights. […]
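Before compacting small objects, it helps to plan which objects to combine into each larger object. The sketch below is a hypothetical greedy batching helper, not the post's actual implementation: it groups object keys into batches whose combined size stays at or under a target size, which you could then feed to a copy-and-concatenate step.

```python
def plan_compaction(objects, target_size):
    """Group small objects into compaction batches.

    `objects` is a list of (key, size_in_bytes) tuples; each returned
    batch is a list of keys whose combined size is <= target_size
    (except when a single object alone exceeds it).
    """
    batches, current, current_size = [], [], 0
    # Largest-first ordering tends to pack batches more evenly.
    for key, size in sorted(objects, key=lambda o: o[1], reverse=True):
        if current and current_size + size > target_size:
            batches.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

For example, with four log objects of 40, 30, 50, and 10 bytes and a 60-byte target, the planner emits three batches instead of four separate objects.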
How Visual Layer builds high-quality datasets on Amazon S3
Companies from different industries use data to help their Artificial Intelligence (AI) and Machine Learning (ML) systems make intelligent decisions. For ML systems to work well, it is crucial to make sure that the massive datasets used for training ML models are of the highest quality, minimizing noise that can contribute to less-than-optimal performance. Processing […]
How to restore archived Amazon EC2 backup recovery points from the Amazon S3 Glacier storage classes
This is the second post in a two-part series. In part one, we described a process to automatically archive Amazon EC2 backup recovery points from AWS Backup to an Amazon S3 bucket in one of the Amazon S3 Glacier storage classes. In this post, we describe the process to restore an archived EC2 backup recovery point from […]
How to archive Amazon EC2 backup recovery points to Amazon S3 Glacier storage classes
Centralizing and automating data protection helps you support your business continuity and regulatory compliance goals. Centralized data protection and enhanced visibility across backup operations can reduce the risks of disasters, improve business continuity, and simplify the auditing process. Many organizations have requirements to retain backups of their compute instances for a certain time based on […]
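One common way to move long-retention data into the Amazon S3 Glacier storage classes, once backup recovery points have been exported to a bucket, is an S3 lifecycle rule. The fragment below is an illustrative sketch, not the configuration from the post; the prefix, day counts, and rule ID are placeholders.

```json
{
  "Rules": [
    {
      "ID": "archive-ec2-backup-exports",
      "Filter": { "Prefix": "backup-exports/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
```

Applied with `PutBucketLifecycleConfiguration`, this transitions exported objects to S3 Glacier Flexible Retrieval after 30 days and to S3 Glacier Deep Archive after a year.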
Automating AWS Backup pre- and post-script execution with AWS Step Functions
Customers execute custom scripts before or after a backup job to automate and orchestrate required and repetitive tasks. For example, customers running applications hosted in Amazon Elastic Compute Cloud (Amazon EC2) instances use scripts to complete application transactions, flush the buffers and caches, stop file I/O operations, or ensure that the application is idle, bringing the […]
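A pre-script/backup/post-script sequence like this maps naturally onto an AWS Step Functions state machine. The Amazon States Language sketch below is an assumed outline, not the post's solution; the Lambda function names, vault name, and ARNs are placeholders.

```json
{
  "Comment": "Sketch: pre-script, backup job, then post-script",
  "StartAt": "RunPreScript",
  "States": {
    "RunPreScript": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "pre-backup-script" },
      "Next": "StartBackupJob"
    },
    "StartBackupJob": {
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:backup:startBackupJob",
      "Parameters": {
        "BackupVaultName": "Default",
        "IamRoleArn": "arn:aws:iam::123456789012:role/BackupRole",
        "ResourceArn": "arn:aws:ec2:us-east-1:123456789012:instance/i-0abc1234example"
      },
      "Next": "RunPostScript"
    },
    "RunPostScript": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "post-backup-script" },
      "End": true
    }
  }
}
```

In practice you would also add `Catch` and `Retry` blocks so a failed pre-script does not leave the application quiesced, which is the kind of failure the troubleshooting post below examines.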
Troubleshooting automated pre- and post-scripts for AWS Backup
Customers can use event-driven architectures with decoupled tasks to automate and orchestrate custom scripts for backup jobs. With event-driven architectures, troubleshooting is key to understanding failures at the component levels in order to resolve issues that arise and keep the entire automated workflow running smoothly. In the first post in this two-part blog series, we […]
Authenticating to AWS Transfer Family with Azure Active Directory and AWS Lambda
Note (5/11/2023): The sample solution provided in this blog post does not support Multi-Factor Authentication (MFA) with Azure Active Directory. Managing users at scale across multiple systems can become a time-intensive process, adding undue burden to system administrators. User management is increasingly complex when customers operate file transfer workloads that share data across different internal […]
Automating disaster recovery of Amazon RDS and Amazon EC2 instances
Complex environments can sometimes feel like they require complex disaster recovery (DR) solutions, which usually consist of multiple DR offerings from different vendors that may not interact with each other. There are many ways to build a DR solution in the cloud. Luckily, with AWS, you can easily configure multiple DR services and orchestrate them […]
Analytical processing of millions of cell images using Amazon EFS and Amazon S3
Analytical workloads such as batch processing, high performance computing, or machine learning inference often have high IOPS and low latency requirements but operate at irregular intervals on subsets of large datasets. Typically, data is manually copied between storage tiers in preparation for processing, which can be cumbersome and error-prone. Given this, IT teams want to […]