How can I resolve the "HIVE_METASTORE_ERROR" error when I query a table in Amazon Athena?

I get the error "HIVE_METASTORE_ERROR" when I query my Amazon Athena table.

Resolution

See the following types of "HIVE_METASTORE_ERROR" errors for their causes and resolutions.

Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.

"HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Error: : expected at the position 1234 of struct<test_column> but '/' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null)"

-or-

"HIVE_METASTORE_ERROR: com.amazonaws.services.datacatalog.model.InvalidInputException: Error: : expected at the position 11884 of ... but '#' is found"

A column name in the queried table includes a special character, or a column name in the partition schema includes a special character. Athena doesn't support special characters other than underscore. For more information, see Names for tables, databases, and columns.

To resolve this error, you must replace the special character in the column name with an underscore. Then, edit the existing schema of the table.

The following example AWS Glue table is backed by a data file named "a.txt" that contains the special character "/" in its column names:

col/1,col_2,col/3
data1,data2,data3
data4,data5,data6
  1. To replace the special character "/" with an underscore "_" in the column names, run the following script:

    # Read all lines from the original data file
    lines = open('a.txt').readlines()
    # Replace '/' with '_' in the header row that contains the column names
    lines[0] = lines[0].replace('/', '_')
    # Write the corrected lines to a new data file
    open('b.txt', 'w').writelines(lines)

    Example output with a new data file "b.txt":

    col_1,col_2,col_3
    data1,data2,data3
    data4,data5,data6
  2. Edit the existing schema of the table from the AWS Glue console, and then replace '/' with any other character that's supported by Athena. You can also update the schema programmatically, as shown in the following sketch.
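
If you prefer to script the schema change, the following is a minimal boto3 sketch that rewrites the column names in the AWS Glue Data Catalog. The database and table names (doc_example_db and doc_example_table) are placeholders for your own values:

    import boto3

    glue = boto3.client('glue')

    DATABASE = 'doc_example_db'  # Replace with your database name
    TABLE = 'doc_example_table'  # Replace with your table name

    # Fetch the current table definition from the Data Catalog
    table = glue.get_table(DatabaseName=DATABASE, Name=TABLE)['Table']

    # Replace '/' with '_' in every column name of the storage descriptor
    for column in table['StorageDescriptor']['Columns']:
        column['Name'] = column['Name'].replace('/', '_')

    # UpdateTable accepts only TableInput keys, so drop the read-only fields
    allowed_keys = {
        'Name', 'Description', 'Owner', 'Retention', 'StorageDescriptor',
        'PartitionKeys', 'ViewOriginalText', 'ViewExpandedText',
        'TableType', 'Parameters',
    }
    table_input = {key: value for key, value in table.items() if key in allowed_keys}

    # Write the corrected schema back to the Data Catalog
    glue.update_table(DatabaseName=DATABASE, TableInput=table_input)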

"HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Required Table Storage Descriptor is not populated"

-or-

"HIVE_METASTORE_ERROR: Table is missing storage descriptor"

The StorageDescriptor parameter contains information about the physical storage of the table. You get this error when one or more partitions of the table are corrupt and don't have a partition location set. To identify the affected partitions, see the sketch after the following list.

To resolve this error, choose one or more of the following solutions:

  • If your table is already partitioned, and the data is loaded in Amazon Simple Storage Service (Amazon S3) Hive partition format, then load the partitions. Run a command that's similar to the following example:
    Note: Be sure to replace doc_example_table with the name of your table.

    MSCK REPAIR TABLE doc_example_table
  • If the MSCK REPAIR TABLE command doesn't resolve the issue, then drop the table and create a new table with the same definition. Then, run the MSCK REPAIR TABLE command on the new table.

  • Create a separate folder in the Amazon S3 bucket, and then move the data files that you want to query into that folder. Create an AWS Glue crawler that points to that folder instead of the bucket.
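
To confirm which partitions are missing a location before you repair or recreate the table, you can inspect the partition metadata. The following is a minimal boto3 sketch that uses the same placeholder database and table names as the earlier example:

    import boto3

    glue = boto3.client('glue')

    DATABASE = 'doc_example_db'  # Replace with your database name
    TABLE = 'doc_example_table'  # Replace with your table name

    # Page through all partitions of the table and flag the ones whose
    # storage descriptor has no location set
    paginator = glue.get_paginator('get_partitions')
    for page in paginator.paginate(DatabaseName=DATABASE, TableName=TABLE):
        for partition in page['Partitions']:
            location = partition.get('StorageDescriptor', {}).get('Location')
            if not location:
                print('Partition with missing location:', partition['Values'])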

"HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: java.io.IOException: Response payload size (11112222 bytes) exceeded maximum allowed payload size (6291556 bytes)"

You use an AWS Lambda function to run Athena queries against a cross-account AWS Glue Data Catalog or an external Hive metastore. However, Lambda has an invocation payload limit of 6 MB. You get this error when the size of the object that's returned from Lambda is more than 6 MB. The Lambda payload limit is a hard limit and can't be increased. For more information, see Lambda quotas.

To resolve this error, choose one or more of the following solutions:

  • Upload the Lambda function's response payload as an object into an Amazon S3 bucket, and then return a reference to that object, such as a presigned URL, in the Lambda function response, as sketched after this list. For information on how to generate a presigned URL for your object, see Sharing objects with presigned URLs.
  • If your table is partitioned, and your use case permits, then query only the specific partition.
  • When you create the Lambda function, specify the spill location in Amazon S3. Responses that are larger than the threshold spill into the specified S3 location.
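
The following is a minimal sketch of the first approach. The bucket name and the build_response function are hypothetical placeholders for your own spill bucket and metastore logic; when the serialized response exceeds a safe threshold, the handler writes it to Amazon S3 and returns a presigned URL instead of the full payload:

    import json
    import uuid

    import boto3

    s3 = boto3.client('s3')

    BUCKET = 'doc-example-bucket'        # Replace with your spill bucket
    MAX_PAYLOAD_BYTES = 5 * 1024 * 1024  # Stay safely below the 6 MB limit

    def lambda_handler(event, context):
        response = build_response(event)  # Hypothetical: your metastore logic
        body = json.dumps(response)

        # If the payload fits within the limit, return it directly
        if len(body.encode('utf-8')) <= MAX_PAYLOAD_BYTES:
            return response

        # Otherwise, spill the payload to Amazon S3 and return a presigned URL
        key = 'spill/' + str(uuid.uuid4()) + '.json'
        s3.put_object(Bucket=BUCKET, Key=key, Body=body)
        url = s3.generate_presigned_url(
            'get_object',
            Params={'Bucket': BUCKET, 'Key': key},
            ExpiresIn=3600,
        )
        return {'spilled': True, 'payloadUrl': url}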

"HIVE_METASTORE_ERROR: Failed to fetch partitions from Glue Data Catalog"

This error occurs for one of the following reasons:

  • The column name present in the partition metadata contains special characters other than an underscore. For more information, see Names for tables, databases, and columns.
  • The SerDe Information is missing from the StorageDescriptor defined in the partition metadata.

To identify the partition that has the issue, review the partition metadata in the AWS Glue Data Catalog for column names that contain special characters and for missing SerDe information, as shown in the sketch that follows.

To resolve the partition issue, run the DDL statement ALTER TABLE DROP PARTITION to drop the affected partition. Then, recreate the partition with corrected metadata, for example with the MSCK REPAIR TABLE command or the ALTER TABLE ADD PARTITION statement.
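
The following is a minimal boto3 sketch, using the same placeholder database and table names as the earlier examples, that flags partitions whose column names contain unsupported characters or whose storage descriptor is missing SerDe information:

    import re

    import boto3

    glue = boto3.client('glue')

    DATABASE = 'doc_example_db'  # Replace with your database name
    TABLE = 'doc_example_table'  # Replace with your table name

    # Athena column names support lowercase letters, digits, and underscores
    VALID_NAME = re.compile(r'^[a-z0-9_]+$')

    paginator = glue.get_paginator('get_partitions')
    for page in paginator.paginate(DatabaseName=DATABASE, TableName=TABLE):
        for partition in page['Partitions']:
            descriptor = partition.get('StorageDescriptor', {})
            bad_columns = [
                column['Name']
                for column in descriptor.get('Columns', [])
                if not VALID_NAME.match(column['Name'])
            ]
            if bad_columns:
                print('Partition', partition['Values'],
                      'has unsupported column names:', bad_columns)
            if not descriptor.get('SerdeInfo'):
                print('Partition', partition['Values'],
                      'is missing SerDe information')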

Related information

Troubleshooting in Athena
