Why is the RefreshCache operation taking a long time on my file gateway?
Last updated: 2020-09-25
I initiated the RefreshCache operation on my file gateway in AWS Storage Gateway. However, the operation is taking a long time to complete. What's the reason for this delay?
The RefreshCache operation identifies the changes (updates, uploads, or deletes) of Amazon Simple Storage Service (Amazon S3) objects since the last time the gateway identified and cached the objects. To perform this operation, the file gateway runs a recursive LIST operation on the S3 bucket, and then runs a HEAD object operation on every object that the LIST operation returns. The HEAD operation retrieves each object's metadata, which is then stored in the file gateway cache.
The following factors can impact how long it takes for a RefreshCache operation to complete:
- If there's a large number of objects in the S3 bucket, then the run time of RefreshCache increases. This is because the file gateway runs a HEAD object against all objects in the bucket.
- RefreshCache operations are specific to individual file shares within a file gateway. One file share supports two RefreshCache API operations at any given time. If you send additional requests to refresh the cache before the operations that are in progress complete, then the additional requests can result in an InvalidGatewayRequestException error.
- S3 buckets can support 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix. These request rates also apply to the requests that the file gateway makes to your S3 buckets, which limits how quickly a RefreshCache operation can complete. The run time of RefreshCache can increase if the S3 bucket is also being used by services other than the file gateway.
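As a rough illustration of how object count and request rates interact, the following Python sketch estimates the number of requests in one full refresh and a theoretical lower bound on the time spent on HEAD requests. It is not a prediction of actual run time: it assumes that all objects share a single prefix and that the gateway could use the full 5,500 requests per second, which it might not reach in practice. The function names and the available_share parameter are hypothetical, and the 1,000-key LIST page size is an S3 limit that isn't stated in this article.

```python
import math

# Request rate quoted above: GET/HEAD requests per second per prefix.
HEAD_REQUESTS_PER_SECOND = 5_500
# S3 returns at most 1,000 keys per LIST response (not stated in this article).
KEYS_PER_LIST_RESPONSE = 1_000


def refresh_request_counts(object_count):
    """Return the (LIST, HEAD) request counts for one full refresh."""
    list_requests = math.ceil(object_count / KEYS_PER_LIST_RESPONSE)
    head_requests = object_count  # one HEAD per listed object
    return list_requests, head_requests


def min_head_seconds(object_count, available_share=1.0):
    """Lower bound on the time spent on HEAD requests under a single prefix.

    available_share is a hypothetical fraction (0 < share <= 1) of the
    per-prefix request rate that is left for the gateway when other
    services also read from the bucket.
    """
    if not 0 < available_share <= 1:
        raise ValueError("available_share must be in (0, 1]")
    return object_count / (HEAD_REQUESTS_PER_SECOND * available_share)


if __name__ == "__main__":
    lists, heads = refresh_request_counts(11_000_000)
    print(lists, heads)                        # 11000 11000000
    print(min_head_seconds(11_000_000))        # 2000.0
    print(min_head_seconds(11_000_000, 0.5))   # 4000.0
```

Under these assumptions, 11 million objects need at least 2,000 seconds (about 33 minutes) of HEAD requests, and that floor doubles if other services consume half of the available request rate.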
Consider the following ways you can decrease the run time of a RefreshCache operation:
- You can reduce the number of objects in the bucket.
- You can deploy multiple file shares that correspond to separate prefixes in the S3 bucket, instead of having one file share for the entire bucket. Because the RefreshCache operations are run per file share, this approach can help reduce the time it takes to complete individual RefreshCache operations. Note: You can create up to 10 file shares for an individual file gateway.
- If you're using one file share for an entire S3 bucket, consider focusing the RefreshCache operations on specific prefixes or folders of the bucket that are being updated with new objects. This reduces the scope of the operation, which can help reduce the run time. You can target RefreshCache operations to specific folders when you run the operation using the AWS Command Line Interface (AWS CLI) or the Storage Gateway API. This option is currently not available in the Storage Gateway console.
Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.
- You can run the RefreshCache operation during off-peak times for other requests to the S3 bucket. You can use AWS Lambda and Amazon CloudWatch to trigger the operation on a timer.
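The last two suggestions can be combined in a small AWS Lambda function that refreshes only specific folders and runs on a schedule. The following is a minimal sketch that uses the boto3 refresh_cache call with its FolderList and Recursive parameters. The helper name refresh_folders, the event keys file_share_arn and folders, and the example values are assumptions for illustration, not part of any AWS-defined format.

```python
def refresh_folders(client, file_share_arn, folders, recursive=True):
    """Start a RefreshCache operation limited to the given folders.

    client is a Storage Gateway client, for example
    boto3.client("storagegateway"). Folder paths are relative to the
    root of the file share, for example "/reports/2020".
    """
    response = client.refresh_cache(
        FileShareARN=file_share_arn,
        FolderList=list(folders),
        Recursive=recursive,
    )
    # The call returns as soon as the refresh starts; it doesn't wait
    # for the refresh to complete.
    return response["NotificationId"]


def lambda_handler(event, context):
    """Hypothetical Lambda entry point, invoked by a scheduled rule.

    Assumes an event such as:
    {"file_share_arn": "<your file share ARN>", "folders": ["/incoming"]}
    """
    import boto3  # imported here so refresh_folders can be used without boto3

    client = boto3.client("storagegateway")
    notification_id = refresh_folders(
        client, event["file_share_arn"], event["folders"]
    )
    return {"notification_id": notification_id}
```

Because each file share supports only a limited number of concurrent RefreshCache operations, it helps to choose a schedule interval that is longer than the expected run time of a single refresh.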