AWS Storage Blog
DoorDash saves millions annually using Amazon S3 Storage Lens
DoorDash connects consumers with their favorite local businesses in more than 25 countries across the globe. A born-in-the-cloud company, DoorDash hosts the majority of its infrastructure, including its commerce platform, on AWS. This platform powers the ecosystem of customers placing orders, merchants fulfilling orders, and dashers performing deliveries. DoorDash’s platform was built to achieve the highest level of scale and performance to support a rapidly expanding business. The commerce platform accesses a large data lake that includes heterogeneous event data, images used in merchant catalogs, photos taken of delivered goods, real-time geolocation data, and other order-specific data objects.
As DoorDash’s order volumes grew, so did the cost of their data lake on Amazon S3, which they chose because they needed a highly durable, highly available, and cost-effective storage solution. The Cloud Financial Operations (FinOps) team at DoorDash is charged with ensuring efficient usage of cloud resources, and it identified data storage as a source of accelerating costs. The team needed a scalable observability solution that would monitor Amazon S3 storage usage and activity and surface optimization opportunities. They chose S3 Storage Lens, an S3 feature that provides organization-wide visibility into object storage usage and activity trends, along with actionable recommendations to optimize costs and apply data protection best practices.
In this post, we discuss the storage optimization strategies that DoorDash implemented based on insights from S3 Storage Lens Advanced metrics. We also detail the steps you can follow to surface these insights using S3 Storage Lens. With these insights and storage optimization strategies, DoorDash was able to grow their Amazon S3 usage by 2.6X while reducing unit costs by 30%, saving millions of dollars annually.
“Once we decided to dig deeper into our Amazon S3 usage to identify efficiency opportunities to support future growth, we started our investigation using S3 Storage Lens with the advanced metrics and recommendations. We found the ability to analyze storage usage and activity patterns at an S3 bucket, prefix, and storage class level invaluable in identifying opportunities to optimize costs. Without S3 Storage Lens, it would have taken much more time to examine and visualize Amazon S3 usage, and to improve our storage unit economics.”
– Levon Stepanian, Cloud FinOps Lead, DoorDash
What is Amazon S3 Storage Lens?
S3 Storage Lens is the first cloud storage analytics solution to provide a single view of object storage usage and activity across hundreds, or even thousands, of accounts in an organization. It has drill-down capabilities to generate insights at multiple aggregation levels. S3 Storage Lens delivers more than 60 metrics on Amazon S3 storage usage and activity to an interactive dashboard in the Amazon S3 console. At no additional cost, all Amazon S3 users can access an interactive S3 Storage Lens dashboard in the Amazon S3 console containing pre-configured views to visualize storage trends. By default, S3 Storage Lens provides 28 metrics across various categories at the bucket level, and 14 days of historical data in the dashboard.
By upgrading to Storage Lens Advanced, you gain 35 additional metrics and 15 months of historical data. These additional metrics can be used to inform a variety of optimization steps, such as expanding S3 Lifecycle policy usage for greater cost optimization, identifying buckets with insufficient data protection policies, and improving the performance of your application workloads. Along with additional metrics, Storage Lens Advanced provides prefix-level aggregation, Amazon CloudWatch metrics support, and custom object metadata filtering with S3 Storage Lens groups.
Cost optimization opportunities identified with S3 Storage Lens
With S3 Storage Lens, DoorDash was able to identify the optimal storage class for their largest buckets and also ensure optimal application of S3 Lifecycle rules.
Opportunity 1: Optimal storage class selection for the largest buckets
Using the Overview tab in S3 Storage Lens, the Cloud FinOps team at DoorDash was able to quickly identify the top buckets and prefixes responsible for the majority of their storage usage. With this knowledge, the team audited the Amazon S3 storage class configuration for each bucket without having to review bucket configurations stored in their source code repositories or Infrastructure-as-Code (IaC). The team could also review and visualize activity (such as counts of PUT and GET requests) using S3 Storage Lens activity metrics. Using this data, the team quickly identified buckets that could be moved to more suitable storage classes.
Figure 1: An S3 bucket that had compression enabled, decelerating storage growth
Once combined with cost data, the Cloud FinOps team identified the optimal storage class for each bucket, reached out to stakeholder teams to confirm the proposed storage class changes, and applied them. S3 Intelligent-Tiering was selected for buckets with indeterminate access patterns, and S3 Glacier Flexible Retrieval was selected for write once, read rarely data. Some teams were also able to modify their client code to compress objects before writing to Amazon S3, and directly write objects into S3 Intelligent-Tiering (thereby avoiding S3 Lifecycle transition costs).
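The client-side change described above can be sketched as follows. This example gzip-compresses a payload and builds the parameters for an S3 PutObject request that writes directly into S3 Intelligent-Tiering, so no later S3 Lifecycle transition is needed; the bucket and key names are hypothetical, and the actual upload (for example, boto3's `s3.put_object(**params)`) is left out:

```python
import gzip

def build_put_request(bucket: str, key: str, payload: bytes) -> dict:
    """Compress a payload and build PutObject parameters that write
    directly into S3 Intelligent-Tiering, avoiding both uncompressed
    storage costs and a later S3 Lifecycle transition charge."""
    body = gzip.compress(payload)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ContentEncoding": "gzip",
        # Land the object in S3 Intelligent-Tiering on day one
        "StorageClass": "INTELLIGENT_TIERING",
    }

# Hypothetical bucket/key; pass the result to an S3 client's PutObject call.
params = build_put_request("example-data-lake", "events/2024/order-events.json.gz", b"{}" * 1024)
```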
Figure 2: An S3 bucket that was transitioned from S3 Standard to S3 Glacier Flexible Retrieval improving unit costs
In general, you can understand the consistency of access patterns and spot buckets that are no longer being accessed using activity metrics. To see activity metrics in your S3 Storage Lens dashboard, you must enable Activity metrics in your dashboard configuration. The easiest way to find buckets that have gone cold is to use the Bubble analysis by buckets section in the Buckets tab of your Storage Lens dashboard. Choose the Total storage, % retrieval rate, and Average object size metrics for the X-axis, Y-axis, and Size, respectively, in your bubble chart. Look for any buckets with retrieval rates of zero (or near zero) and a larger relative storage size. From here, you can identify the owners of cold buckets in your account or organization and find out if that storage is still needed. Then, you can optimize costs by configuring S3 Lifecycle expiration configurations for these buckets or archiving the data in one of the Amazon S3 Glacier storage classes. To automate this process going forward, you can automatically transition your data by using S3 Lifecycle configurations for your buckets, or you can enable auto-archiving with S3 Intelligent-Tiering. You can also use this approach to identify hot buckets. Then, you can make sure that these buckets use an S3 storage class that serves their requests most effectively in terms of performance and cost.
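The same cold-bucket screen can be done programmatically against a Storage Lens metrics export. The sketch below assumes a CSV export with `bucket_name`, `metric_name`, and `metric_value` columns and metric names like `StorageBytes` and `BytesDownloaded`; verify these against the schema of your own export before relying on them. It computes a retrieval rate per bucket and flags large buckets that are almost never read:

```python
import csv
import io

def find_cold_buckets(export_csv: str, min_bytes: int = 1_000_000_000) -> list:
    """Flag buckets that hold a lot of data but are (almost) never read.
    Column and metric names here are assumptions about the export schema."""
    storage, downloaded = {}, {}
    for row in csv.DictReader(io.StringIO(export_csv)):
        bucket, metric = row["bucket_name"], row["metric_name"]
        value = float(row["metric_value"])
        if metric == "StorageBytes":
            storage[bucket] = storage.get(bucket, 0) + value
        elif metric == "BytesDownloaded":
            downloaded[bucket] = downloaded.get(bucket, 0) + value
    cold = []
    for bucket, total in storage.items():
        rate = downloaded.get(bucket, 0) / total if total else 0
        if total >= min_bytes and rate < 0.01:  # big, and (almost) never read
            cold.append((bucket, total, rate))
    return sorted(cold, key=lambda item: -item[1])  # largest first

# Tiny illustrative export (hypothetical bucket names and values)
sample = """bucket_name,metric_name,metric_value
logs-archive,StorageBytes,5000000000
logs-archive,BytesDownloaded,0
hot-assets,StorageBytes,2000000000
hot-assets,BytesDownloaded,900000000
"""
print(find_cold_buckets(sample))  # [('logs-archive', 5000000000.0, 0.0)]
```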
Figure 3: Using the Bubble analysis chart to identify buckets that have gone cold
Opportunity 2: Identify buckets that are missing S3 Lifecycle rules
After addressing some of the ‘low-hanging’ optimization opportunities, the Cloud FinOps team wanted to dig deeper to proactively ensure consistent application of storage best practices. To gain deeper visibility into their data lake storage, DoorDash enabled the cost optimization metrics category available with S3 Storage Lens Advanced metrics. S3 Storage Lens provides S3 Lifecycle rule count metrics that you can use to identify buckets that are missing S3 Lifecycle rules (see Figure 5). These cost optimization metrics gave the Cloud FinOps team a scalable way to identify buckets that did not follow storage optimization best practices, such as using S3 Lifecycle configurations. An S3 Lifecycle configuration is a set of rules that define actions Amazon S3 applies to a group of objects. For example, DoorDash wanted buckets that had a large share of incomplete multipart uploads to have an S3 Lifecycle policy that automatically expired those uploads after seven days. Similarly, they wanted buckets that had S3 Versioning enabled to have an S3 Lifecycle policy that transitioned noncurrent object versions to an archive storage class. Using the S3 Storage Lens cost optimization metrics, DoorDash could easily compile a list of all buckets that were missing these rules. The team reached out to the bucket owners to validate the findings and implement changes that ensured consistent application of Amazon S3 storage optimization best practices.
Figure 4: Implementation of prefix-level retention policy to eliminate unnecessary storage
To see S3 Lifecycle rule count metrics in your S3 Storage Lens dashboard, you must enable Advanced cost optimization metrics in your dashboard configuration. A bucket with no S3 Lifecycle configuration might have storage that you no longer need or can migrate to a lower-cost storage class. You can also use S3 Lifecycle rule count metrics to identify buckets that are missing specific types of S3 Lifecycle rules, such as expiration or transition rules. To find buckets that don’t have S3 Lifecycle rules, you can use the Total buckets without S3 Lifecycle rules metric. To identify the specific buckets, you can use the Top N overview section in your S3 Storage Lens dashboard. By default, the Top N overview section displays metrics for the top three buckets; in the Top N field, you can increase the number of buckets. The Top N overview section also shows the percentage change from the prior day or week and a sparkline to visualize the trend. This trend is a 14-day trend for free metrics and a 30-day trend for advanced metrics and recommendations.
Using this approach, you can see which accounts have the most buckets without S3 Lifecycle rules, view a breakdown of those buckets by AWS Region, and pinpoint exactly which buckets lack rules. S3 Storage Lens also lets you drill down into deeper levels of aggregation, such as Account, AWS Region, Storage Class, or Bucket. Furthermore, you can break out counts for specific rule types with the Transition lifecycle rule count, Expiration lifecycle rule count, Noncurrent version transition lifecycle rule count, Noncurrent version expiration lifecycle rule count, Abort incomplete multipart upload lifecycle rule count, and Total lifecycle rule count metrics. After you’ve identified buckets with no S3 Lifecycle rules, you can add the appropriate rules. For more information, see the documentation on setting an S3 Lifecycle configuration on a bucket and the examples of S3 Lifecycle configurations.
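You can run a similar audit per bucket by inspecting each bucket's own configuration. The sketch below takes a `Rules` list in the shape returned by S3's GetBucketLifecycleConfiguration API and reports which of the rule types named above no enabled rule provides; fetching the rules per bucket (for example, with boto3's `get_bucket_lifecycle_configuration`, which raises an error when a bucket has no configuration at all) is left out, and the example bucket below is hypothetical:

```python
# Map each rule type named by the Storage Lens rule count metrics to the
# key that marks it in a lifecycle rule's dictionary.
RULE_TYPE_KEYS = {
    "Transition": "Transitions",
    "Expiration": "Expiration",
    "Noncurrent version transition": "NoncurrentVersionTransitions",
    "Noncurrent version expiration": "NoncurrentVersionExpiration",
    "Abort incomplete multipart upload": "AbortIncompleteMultipartUpload",
}

def missing_rule_types(rules: list) -> list:
    """Given a Rules list (GetBucketLifecycleConfiguration shape),
    return the rule types that no enabled rule provides."""
    present = set()
    for rule in rules:
        if rule.get("Status") != "Enabled":
            continue  # disabled rules don't protect anything
        for label, key in RULE_TYPE_KEYS.items():
            if key in rule:
                present.add(label)
    return [label for label in RULE_TYPE_KEYS if label not in present]

# Hypothetical bucket: aborts incomplete uploads and expires old objects,
# but has no transition or noncurrent-version rules.
example_rules = [
    {"Status": "Enabled", "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7}},
    {"Status": "Enabled", "Expiration": {"Days": 365}},
]
print(missing_rule_types(example_rules))
```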
Figure 5: Using Lifecycle rule count metrics to identify buckets that are missing the right rules
Conclusion
By using S3 Storage Lens Advanced, DoorDash uncovered the insights needed to grow their Amazon S3 usage by 2.6X while reducing unit costs by 30%. These savings are being reinvested in forward-looking projects. In the long term, DoorDash is committed to continuing to use S3 Storage Lens Advanced to monitor their Amazon S3 storage usage and ensure they get the most value from their storage spend.
“We were able to take the learnings from this Amazon S3 audit exercise and drive governance into our new self-service tooling for S3 bucket management. New S3 buckets are created by default (with the option to modify) in cold storage, with S3 Lifecycle policies that implement the appropriate data retention strategies and cleanup failed incomplete multipart uploads, with S3 Versioning disabled. The centralized observability provided by S3 Storage Lens simplified the process of storage optimization and the targeted metrics made sure that we identified significant storage cost savings. We look forward to partnering with Amazon S3 as we continue to develop new features and products to delight our partners and customers alike.”
– Levon Stepanian, Cloud FinOps Lead, DoorDash
Get started with Amazon S3 Storage Lens by taking the hands-on tutorial and reviewing the user guide.