Why does the MSCK REPAIR TABLE command take a long time to run?
Last updated: 2021-06-08
When I run the MSCK REPAIR TABLE command, a long time elapses before the results appear.
When I run the MSCK REPAIR TABLE command, the query times out.
The command is slow because Amazon Athena recursively lists the prefixes and objects in Amazon Simple Storage Service (Amazon S3) when it runs MSCK REPAIR TABLE. If the table location contains a very large number of Amazon S3 prefixes or objects, then the command might take a long time to complete or time out.
To resolve this issue, do either of the following:
- Use an AWS Glue crawler to add partitions to your Athena tables. For more information, see How crawlers work. If you have many Amazon S3 prefixes, then a crawler can reduce the time taken to load partitions. For more information, see Incremental crawls in AWS Glue.
- Add partitions to the table using the ALTER TABLE ADD PARTITION command.
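As an illustration of the second option, the following Python sketch builds a single ALTER TABLE ADD PARTITION statement that registers several daily partitions at once. The table name access_logs, the bucket s3://example-bucket/logs, the partition column dt, and the Hive-style dt=YYYY-MM-DD folder layout are all assumptions made for the example, not values from this article. The sketch only produces the statement text; you would still run it in Athena yourself.

```python
from datetime import date, timedelta


def build_add_partitions(table, location_prefix, start, days):
    """Return one ALTER TABLE statement that registers `days` daily partitions."""
    clauses = []
    for offset in range(days):
        day = (start + timedelta(days=offset)).isoformat()
        # Assumed layout: <location_prefix>/dt=YYYY-MM-DD/
        clauses.append(
            f"  PARTITION (dt = '{day}') LOCATION '{location_prefix}/dt={day}/'"
        )
    # IF NOT EXISTS lets the statement be re-run without failing on
    # partitions that are already registered.
    return f"ALTER TABLE {table} ADD IF NOT EXISTS\n" + "\n".join(clauses)


# Hypothetical table and bucket names, for illustration only.
statement = build_add_partitions(
    "access_logs", "s3://example-bucket/logs", date(2021, 6, 1), 3
)
print(statement)
```

Because each partition location is named explicitly, Athena doesn't need to list the whole table location the way MSCK REPAIR TABLE does.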
Consider using partition projection if your partitions follow a predictable pattern. With partition projection, Athena generates the partitions in memory, so you don't need to add them to the AWS Glue Data Catalog, and Athena doesn't need to retrieve them from the Data Catalog. As a result, query processing times might decrease for heavily partitioned tables.
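As a rough sketch of what enabling partition projection can look like, the following Python snippet builds an ALTER TABLE SET TBLPROPERTIES statement for a date-partitioned table. The table name access_logs, the partition column dt, the start date, and the bucket path are hypothetical. Adjust the range, format, and location template to match your own data layout before you use anything like this.

```python
def build_projection_ddl(table, column, first_day, location_prefix):
    """Return an ALTER TABLE statement that enables date-based partition projection."""
    props = {
        "projection.enabled": "true",
        f"projection.{column}.type": "date",
        f"projection.{column}.format": "yyyy-MM-dd",
        # NOW keeps the upper bound of the range moving with the current date.
        f"projection.{column}.range": f"{first_day},NOW",
        f"projection.{column}.interval": "1",
        f"projection.{column}.interval.unit": "DAYS",
        # ${column} is a placeholder that Athena fills in for each partition value.
        "storage.location.template": (
            location_prefix + "/" + column + "=${" + column + "}/"
        ),
    }
    body = ",\n".join(f"  '{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n{body}\n)"


# Hypothetical table, column, and bucket names, for illustration only.
ddl = build_projection_ddl(
    "access_logs", "dt", "2021-06-01", "s3://example-bucket/logs"
)
print(ddl)
```

With properties like these in place, new daily prefixes become queryable without running MSCK REPAIR TABLE or adding partitions manually.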