AWS Storage Blog
Tag: Amazon Athena
Simplify querying your archive data in Amazon S3 with Amazon Athena
Today, customers increasingly choose to store data for longer because they recognize its future value potential. Storing data longer, coupled with exponential data growth, has led to customers placing a greater emphasis on storage cost optimization and using cost-effective storage classes. However, a modern data archiving strategy not only calls for optimizing storage costs, but […]
Getting visibility into storage usage in multi-tenant Amazon S3 buckets
SaaS providers with multi-tenant environments use cloud solutions to dynamically scale their workloads as customer demand increases. As their cloud footprint grows, having visibility into each end-customer’s storage consumption becomes important to distribute resources accordingly. An organization can use storage usage data per customer (tenant) to adjust its pricing model or better plan its budget. […]
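The full post goes deeper, but as a rough sketch of the idea, per-tenant usage can be estimated by summing object sizes under each tenant's key prefix. The bucket name and prefixes below are hypothetical placeholders:

```python
import boto3

# Minimal sketch: sum object sizes per tenant, assuming each tenant's data
# lives under its own key prefix (e.g. "tenant-a/"). Bucket and prefixes
# are hypothetical placeholders.
s3 = boto3.client("s3")

def tenant_storage_bytes(bucket: str, tenant_prefix: str) -> int:
    total = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=tenant_prefix):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total

for tenant in ("tenant-a/", "tenant-b/"):
    print(tenant, tenant_storage_bytes("example-multi-tenant-bucket", tenant))
```

Listing objects like this is fine for smaller buckets; at scale, querying S3 Inventory reports with Amazon Athena avoids repeatedly issuing LIST requests.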
Consolidate and query Amazon S3 Inventory reports for Region-wide object-level visibility
Organizations around the world store billions of objects and files representing terabytes to petabytes of data. Data is often owned by different teams, departments, or business units, spanning multiple locations. As the number of datastores, locations, and owners grows, you need a way to cost-effectively maintain visibility into important characteristics of your data, including based […]
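As a hedged sketch of querying a consolidated inventory table with Athena from code (the database, table, and results location below are hypothetical):

```python
import time
import boto3

# Minimal sketch: query a consolidated S3 Inventory table with Athena.
# Database, table, and output location are hypothetical.
athena = boto3.client("athena")

QUERY = """
SELECT bucket, storage_class, COUNT(*) AS objects, SUM(size) AS total_bytes
FROM s3_inventory_consolidated
GROUP BY bucket, storage_class
"""

execution = athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "s3_inventory_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = execution["QueryExecutionId"]

# Poll until the query finishes, then print the result rows.
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    for row in athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]:
        print([col.get("VarCharValue") for col in row["Data"]])
```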
Identify cold objects for archiving to Amazon S3 Glacier storage classes
Many organizations move cold data to archive storage in the cloud to optimize storage costs for data they want to preserve over a number of years. Archiving data at a very low cost also gives organizations the ability to quickly restore that data and put it to work for their business, such as for historical […]
Derive insights from AWS DataSync task reports using AWS Glue, Amazon Athena, and Amazon QuickSight
Update (9/22/2023): Step 6b updated to automatically detect and update the Amazon Athena table schema when the crawler detects large data transfer values, reported in bytes, that would exceed the table's maximum integer value when stored. As customers scale their migration of large datasets with millions of files across multiple data transfers, they are faced […]
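The update note above concerns integer overflow in the catalog schema. A minimal sketch of the kind of fix it describes, widening byte-count columns from int to bigint in the AWS Glue Data Catalog, might look like this (database, table, and column names are hypothetical):

```python
import boto3

# Minimal sketch: widen byte-count columns from int to bigint in the Glue
# Data Catalog so large transfer sizes don't overflow when queried in Athena.
# Database, table, and column names below are hypothetical.
glue = boto3.client("glue")

DATABASE = "datasync_reports_db"
TABLE = "task_report_transferred"
BYTE_COLUMNS = {"bytestransferred", "bytescompressed"}

table = glue.get_table(DatabaseName=DATABASE, Name=TABLE)["Table"]

for column in table["StorageDescriptor"]["Columns"]:
    if column["Name"] in BYTE_COLUMNS and column["Type"] == "int":
        column["Type"] = "bigint"

# update_table only accepts the writable TableInput fields, so copy just those.
table_input = {
    key: table[key]
    for key in ("Name", "StorageDescriptor", "PartitionKeys", "TableType", "Parameters")
    if key in table
}
glue.update_table(DatabaseName=DATABASE, TableInput=table_input)
```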
Migrate on-premises data to AWS for insightful visualizations
When migrating data from on premises, customers seek a data store that is scalable, durable, and cost effective. Equally important, business intelligence (BI) tools must support modern, interactive, and fast dashboards that can scale to tens of thousands of users seamlessly while providing the ability to create meaningful data visualizations for analysis. Visualization of on-premises business analytics […]
Disabling ACLs for existing Amazon S3 workloads with information in S3 server access logs and AWS CloudTrail
Access control lists (ACLs) are permission sets that define user access and the operations users can take on specific resources. Amazon S3 was launched in 2006 with ACLs as its first authorization mechanism. Since 2011, Amazon S3 has also supported AWS Identity and Access Management (IAM) policies for managing access to S3 buckets, and recommends using […]
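As an illustrative, hedged sketch (the database, table, and output location are hypothetical), you could use Athena over S3 server access logs to look for ACL-specific requests before turning ACLs off. The operation names come from the S3 server access log format:

```python
import boto3

# Minimal sketch: look for ACL-specific requests in an Athena table built over
# S3 server access logs before disabling ACLs. Database, table, and output
# location are hypothetical; REST.PUT.ACL / REST.GET.ACL are operation names
# from the S3 server access log format.
athena = boto3.client("athena")

QUERY = """
SELECT requester, operation, key, requestdatetime
FROM s3_access_logs_db.bucket_access_logs
WHERE operation IN ('REST.PUT.ACL', 'REST.GET.ACL')
"""

athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "s3_access_logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```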
Using presigned URLs to identify per-requester usage of Amazon S3
Many software-as-a-service (SaaS) product offerings have a pay-as-you-go pricing model, charging customers only for the resources consumed. However, pay-as-you-go pricing is only viable when you can accurately track each customer’s use of resources, such as compute capacity, storage, and networking bandwidth. Without this data, SaaS providers do not have visibility into resource consumption of […]
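A minimal sketch of the building block involved: generate a presigned GET URL on behalf of a specific customer and record who it was issued to, so downstream request logs can be attributed back to that customer. The bucket, key, and customer ID below are hypothetical:

```python
import boto3

# Minimal sketch: issue a presigned GET URL for a specific customer and keep a
# record of which customer it was issued to. Bucket, key, and customer ID are
# hypothetical placeholders.
s3 = boto3.client("s3")

def presigned_url_for_customer(bucket: str, key: str, customer_id: str, expires: int = 3600) -> str:
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )
    # In practice you would persist this mapping (e.g. to a database) so that
    # access-log analysis can tie requests back to the customer.
    print(f"issued to {customer_id}: {url}")
    return url

presigned_url_for_customer("example-saas-assets", "reports/usage.csv", "customer-1234")
```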
Restore data from Amazon S3 Glacier storage classes starting with partial object keys
When managing data storage, it is important to optimize for cost by storing data in the most cost-effective manner based on how often data is used or accessed. For many enterprises, this means using some form of cold storage or archiving for data that is less frequently accessed or used while keeping more frequently used […]
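As a hedged sketch of the general idea rather than the post's exact method: list objects under a partial key (prefix) and initiate restores for those in archive storage classes. The bucket and prefix below are hypothetical:

```python
import boto3

# Minimal sketch: initiate Bulk restores for archived objects whose keys start
# with a given partial key (prefix). Bucket and prefix are hypothetical.
s3 = boto3.client("s3")

ARCHIVE_CLASSES = {"GLACIER", "DEEP_ARCHIVE"}

def restore_by_prefix(bucket: str, prefix: str, days: int = 7) -> None:
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj.get("StorageClass") in ARCHIVE_CLASSES:
                s3.restore_object(
                    Bucket=bucket,
                    Key=obj["Key"],
                    RestoreRequest={"Days": days, "GlacierJobParameters": {"Tier": "Bulk"}},
                )

restore_by_prefix("example-archive-bucket", "projects/2021/")
```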
Optimize storage costs by analyzing API operations on Amazon S3
The demand for data storage has increased with the advent of a fast-paced data environment in which data is created, shared, and replicated at large scale. Most organizations are looking for the optimal way to store their data cost-effectively, getting everything they need from their data without breaking the bank. Cloud storage provides flexible […]
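A hedged sketch of the kind of analysis involved: count GET operations per object from an Athena table built over S3 server access logs to spot rarely accessed data. The database, table, and output location are hypothetical:

```python
import boto3

# Minimal sketch: count S3 GET operations per object from an Athena table over
# server access logs to identify rarely accessed objects. Names below are
# hypothetical placeholders.
athena = boto3.client("athena")

QUERY = """
SELECT key, operation, COUNT(*) AS request_count
FROM s3_access_logs_db.bucket_access_logs
WHERE operation LIKE 'REST.GET.%'
GROUP BY key, operation
ORDER BY request_count ASC
"""

athena.start_query_execution(
    QueryString=QUERY,
    QueryExecutionContext={"Database": "s3_access_logs_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```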