MSCK REPAIR TABLE detects partitions in Athena but doesn't add them to the AWS Glue Data Catalog

Last updated: 2020-06-18

When I run MSCK REPAIR TABLE, Amazon Athena returns a list of partitions, but then fails to add the partitions to the table in the AWS Glue Data Catalog.

Short description

Here are some common causes of this behavior:

  • The AWS Identity and Access Management (IAM) user or role doesn't have a policy that allows the glue:BatchCreatePartition action.
  • The Amazon Simple Storage Service (Amazon S3) path is in camel case instead of lower case (for example, userId instead of userid).

Resolution

Allow glue:BatchCreatePartition in the IAM policy

Review the IAM policies attached to the user or role that you're using to execute MSCK REPAIR TABLE. When you use the AWS Glue Data Catalog with Athena, the IAM policy must allow the glue:BatchCreatePartition action. If the policy doesn't allow that action, then Athena can't add partitions to the metastore. For an example of an IAM policy that allows the glue:BatchCreatePartition action, see AmazonAthenaFullAccess managed policy.

Change the Amazon S3 path to lower case

The Amazon S3 path name must be in lower case. If the path is in camel case, MSCK REPAIR TABLE doesn't add the partitions to the AWS Glue Data Catalog. For example, if the Amazon S3 path is userId, the following partitions aren't added to the AWS Glue Data Catalog:

  • s3://awsdoc-example-bucket/path/userId=1/
  • s3://awsdoc-example-bucket/path/userId=2/
  • s3://awsdoc-example-bucket/path/userId=3/

To resolve this issue, use lower case instead of camel case:

  • s3://awsdoc-example-bucket/path/userid=1/
  • s3://awsdoc-example-bucket/path/userid=2/
  • s3://awsdoc-example-bucket/path/userid=3/