AWS Storage Blog
Find out the size of your Amazon S3 buckets
Monitoring key storage metrics is an important part of most enterprise data governance strategies, for customers of any size. Data and analytics leaders also focus keenly on cost-optimized data management as they map out their organizations’ digital futures. A robust data management and monitoring strategy offers clear advantages when laying the foundation for an overall cost optimization strategy: it provides visibility into key metrics that help identify the ideal data-management methods for your organization.
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. An Amazon S3 bucket is a container for objects (that is, files). Customers of all sizes and industries can use Amazon S3 to store and protect any amount of data for a range of use cases, such as data lakes, websites, cloud-native applications, backups, archiving, machine learning, and analytics.
The total volume of data and the number of objects you can store in an Amazon S3 bucket are virtually unlimited. AWS provides various tools that you can use to monitor S3 storage size and other key usage metrics. In this blog, we walk you through six different methods to find the storage size of a single Amazon S3 bucket, or of all S3 buckets spread across different regions in your AWS account:
- Using the Amazon S3 console
- Using S3 Storage Lens
- Using Amazon CloudWatch
- Using Amazon S3 Inventory
- Using AWS Command Line Interface
- Using a custom script (for real-time S3 bucket storage size)
With these methods, monitoring a basic key storage metric, the amount of data stored, is quick work, making your storage-management operation that much easier!
1. Using the Amazon S3 console
To find the size of a single S3 bucket, you can use the S3 console and select the bucket you wish to view. Under Metrics, there’s a graph that shows the total number of bytes stored over time.
2. Using S3 Storage Lens
S3 Storage Lens is a tool that provides single-pane-of-glass visibility into storage size and 29 usage and activity metrics across your Amazon S3 storage. With S3 Storage Lens, you get visibility at the AWS Organizations or AWS account level, with drill-downs by region, storage class, bucket, and prefix.
Amazon S3 Storage Lens has a default dashboard called default-account-dashboard. This default dashboard is free for customers and displays metrics for the most recent 14 days. You can also upgrade to receive advanced metrics and recommendations, including usage metrics with prefix-level aggregation and activity metrics, with data available for analysis for up to 15 months. You can also create S3 Storage Lens custom dashboards, scoped to your AWS organization or to specific regions or buckets within an account.
The following S3 Storage Lens dashboard screenshot shows all the S3 buckets in an AWS organization, sorted by storage size. You can choose to export these metrics to a destination bucket every 24 hours in CSV (comma-separated values) or Apache Parquet format. From there, you can use your preferred analytics tool, such as Amazon Athena or Amazon QuickSight, for detailed analysis.
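You can also enumerate your Storage Lens configurations programmatically through the s3control API. The following is a minimal sketch, assuming the AWS CLI is configured with valid credentials; the function name is ours, and the 12-digit AWS account ID is supplied by the caller.

```shell
# Sketch: list the S3 Storage Lens configurations in an account.
# Requires configured AWS credentials; the account ID argument is
# your 12-digit AWS account ID.
list_storage_lens_configs() {
  local account_id="$1"
  aws s3control list-storage-lens-configurations --account-id "${account_id}"
}
```

Calling the function returns the configuration IDs and ARNs defined in the account, including the default dashboard.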
3. Using Amazon CloudWatch
Amazon CloudWatch metrics for Amazon S3 can help you find the size of an S3 bucket. The CloudWatch metric BucketSizeBytes reports the amount of data, in bytes, stored in an S3 bucket. This value is calculated by summing the size of all objects and metadata in the bucket (both current and noncurrent objects), including the size of all parts of incomplete multipart uploads. This metric is reported to CloudWatch once per day. Amazon S3 sends several other storage metrics to CloudWatch, and you can find the entire list in the Amazon S3 documentation.
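As a sketch of how you might query this metric from the command line, the following wraps an aws cloudwatch get-metric-statistics call in a shell function. The bucket name is a placeholder you supply; StandardStorage is one of several StorageType dimension values, as each storage class reports its size separately. It assumes configured AWS credentials and GNU date for building the time window.

```shell
# Sketch: fetch a recent daily BucketSizeBytes datapoint for a bucket.
# Requires configured AWS credentials; uses GNU date syntax for the
# time range, and queries the STANDARD storage class only.
bucket_size_bytes() {
  local bucket="$1"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/S3 \
    --metric-name BucketSizeBytes \
    --dimensions Name=BucketName,Value="${bucket}" \
                 Name=StorageType,Value=StandardStorage \
    --start-time "$(date -u -d '2 days ago' +%Y-%m-%dT00:00:00Z)" \
    --end-time "$(date -u +%Y-%m-%dT00:00:00Z)" \
    --period 86400 \
    --statistics Average \
    --query 'Datapoints[0].Average'
}
```

Because the metric is published once per day, a two-day window is enough to catch the latest datapoint.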
4. Using Amazon S3 inventory
Amazon S3 inventory is one of the tools Amazon S3 provides to help manage your storage. Customers can create inventory configurations on an S3 bucket to generate a flat-file list of objects and metadata. You can choose to report this data on a daily or weekly basis, and these scheduled reports can include all objects in the bucket or be limited to a specific prefix. The inventory list contains the objects in an S3 bucket and metadata for each listed object, including the object size in bytes, and can be stored in a destination bucket of your choice. You can create an inventory configuration by navigating to the bucket’s Management tab, choosing Inventory configurations, and then Create inventory configuration. From there, you can use your preferred analytics tool, such as Amazon Athena or Amazon QuickSight, for detailed analysis.
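Once an inventory report lands in the destination bucket, summing the object sizes does not strictly require Athena. The sketch below assumes a CSV-format inventory in which Size happens to be the third configured field; adjust the column index to match the fields you selected in your inventory configuration, and note that the function name is illustrative.

```shell
# Sketch: sum the Size column (assumed here to be field 3) of an
# S3 Inventory CSV report, stripping the quotes that inventory
# CSV files place around each field.
sum_inventory_sizes() {
  awk -F',' '{ gsub(/"/, "", $3); total += $3 } END { print total + 0 }' "$1"
}
```

For example, `sum_inventory_sizes inventory.csv` prints the total size in bytes of all objects listed in the report.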
5. Using AWS Command Line Interface
The AWS Command Line Interface (AWS CLI) is an open-source tool that enables you to interact with and manage AWS services using commands in your command-line shell. Amazon S3 provides AWS CLI commands to interact with and manage S3 objects. To find the size of a single S3 bucket, you can use the following command, which summarizes all prefixes and objects in the bucket and displays the total number of objects and the total size of the bucket.
aws s3 ls --summarize --human-readable --recursive s3://<bucket-name>/
The following is a sample output.
2021-10-07 21:32:57    452 Bytes foo/bar/car/petrol
2021-10-07 21:32:57    896 Bytes foo/bar/truck/diesel
2021-10-07 21:32:57    189 Bytes foo/bar/hybrid/battery
2021-10-07 21:32:57    398 Bytes vehicles.txt

Total Objects: 4
   Total Size: 2.9 MiB
You can find additional details about how to manage Amazon S3 buckets and objects using the AWS CLI S3 commands here.
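If you only care about the two summary lines rather than the full per-object listing, you can filter them out of the command’s output. A small sketch (the function name is ours) that reads the listing on standard input:

```shell
# Keep only the "Total Objects" and "Total Size" lines from
# `aws s3 ls --summarize` output read on standard input, e.g.:
#   aws s3 ls --summarize --human-readable --recursive s3://<bucket-name>/ | summary_only
summary_only() {
  grep -E 'Total (Objects|Size):'
}
```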
6. Using a custom script (for real-time S3 bucket storage size)
If you want to generate an on-demand, real-time report of the storage size of all S3 buckets spread across different regions in an AWS account, you can use the following script.
Solution overview
For this walkthrough, you need the following:
- An AWS account
- AWS IAM user/role with access to Amazon S3 resources
- AWS CLI version 2
Walkthrough
At a high level, the steps here can be summarized as follows:
- Get a list of S3 buckets from all regions in an AWS account
- Find the amount of data in bytes stored in each bucket
- Output to a CSV file for easy consumption
Find size of all S3 buckets in an AWS account
The following script finds the size of all S3 buckets spread across different regions in an AWS account. You need to configure the AWS CLI before running the script.
#!/usr/bin/bash
set +x
PROFILE=<your AWS profile>

function calcs3bucketsize() {
    # Summarize the bucket and keep only the last line ("Total Size: ...")
    sizeInBytes=$(aws --profile "${PROFILE}" s3 ls "s3://${1}" --recursive --human-readable --summarize | awk 'END{print}')
    echo "${1},${sizeInBytes}" >> allregions-buckets-s3-sizes.csv
    printf "DONE. Size of the bucket %s: %s\n" "${1}" "${sizeInBytes}"
}

# Start with a fresh output file
[ -f allregions-buckets-s3-sizes.csv ] && rm -f allregions-buckets-s3-sizes.csv

buckets=$(aws --profile "${PROFILE}" s3 ls | awk '{print $3}')
i=1
for j in ${buckets}; do
    printf "Calculating the size of bucket [%s]=%s\n" "${i}" "${j}"
    i=$((i+1))
    # To expedite the calculation, run the CLI commands in parallel in the background
    calcs3bucketsize "${j}" &
done

# Wait for all background jobs to finish before the script exits
wait
Run the script; it writes each bucket name and size to the allregions-buckets-s3-sizes.csv file.
The following is a sample CSV output.
1_bucket_name,Total Size: 349.7 MiB
carbucket-data,Total Size: 524.2 MiB
hybridbucket-trip,Total Size: 247.8 MiB
truckbucket-data,Total Size: 845.6 MiB
wholesalecar_revenue,Total Size: 700.7 MiB
wholesaletruck_revenue,Total Size: 600.5 MiB
auto_bucket_pilot,Total Size: 423.9 MiB
corp_mybucket,Total Size: 320.7 MiB
destinationbucket123,Total Size: 920.4 MiB
tagpolicy_bucket,Total Size: 800.3 MiB
Conclusion
In this blog post, we walked you through six different methods, namely the Amazon S3 console, Amazon S3 Storage Lens, Amazon CloudWatch, Amazon S3 inventory, the AWS Command Line Interface, and a custom script, to find the storage size of a single Amazon S3 bucket or of all buckets spread across different regions in your AWS account. Customers of any size can adopt one of these methods, or a combination that suits their use cases, to establish a robust data management and monitoring strategy that lays the foundation for overall cloud storage cost optimization.
Amazon S3 can be managed via the AWS Management Console, the AWS Command Line Interface (AWS CLI), and the AWS SDKs. For further reading, refer to the AWS Well-Architected Framework, Architecture Best Practices for Storage, and AWS Storage Optimization. We are here to help: if you need further assistance in developing a successful cloud storage optimization strategy, please reach out to AWS Support and your AWS account team.