How can I resolve the "HIVE_METASTORE_ERROR" error when I query a table in Amazon Athena?

Last updated: 2021-05-11

I get the error "HIVE_METASTORE_ERROR" when I query my Amazon Athena table.

Short description

The different types of "HIVE_METASTORE_ERROR" errors and their causes are the following:

  • HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Error: : expected at the position 1234 of struct but '/' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null): You get this error if a column name in the queried table includes a special character, or when a column name in the partition schema includes a special character. Athena doesn't support special characters other than underscore. For more information, see Names for tables, databases, and columns.
  • HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Required Table Storage Descriptor is not populated: The StorageDescriptor parameter contains information about the physical storage of the table. You get this error when, because of corrupted partition metadata, one or more partitions on the table don't have a partition location set.
  • HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: java.io.IOException: Response payload size (11112222 bytes) exceeded maximum allowed payload size (6291556 bytes): You get this error when you use an AWS Lambda function to run Athena queries against a cross-account AWS Glue Data Catalog or an external Hive metastore, and the object returned from Lambda is larger than 6 MB. Lambda has an invocation payload limit of 6 MB. This is a hard limit and can't be increased. For more information, see Lambda quotas.

Resolution

HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Error: : expected at the position 1234 of struct<test_column> but '/' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null)

To resolve this error, do the following:

  • Run the following custom script on your data to replace the special characters in the column names with underscores:
    Note: Be sure to replace a.txt and b.txt with the names of your input and output files.
import re

# Read the data file, replace every '/' with '_', and write the result to a new file
string = open('a.txt').read()
new_str = re.sub('/', '_', string)
open('b.txt', 'w').write(new_str)
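Because Athena doesn't support special characters other than underscore in column names, a more general sketch (using a hypothetical helper name, not part of the script above) replaces every unsupported character, not just '/':

```python
import re

def sanitize_column_name(name):
    """Replace any character other than a letter, digit, or
    underscore with '_' so the name is valid for Athena."""
    return re.sub(r'[^A-Za-z0-9_]', '_', name)

# Example: a column name containing '/' and '-'
print(sanitize_column_name("order/date-utc"))  # order_date_utc
```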

HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: Required Table Storage Descriptor is not populated

To resolve this error, choose one or more of the following solutions:

  • If your table is already partitioned, and the data is loaded in Amazon Simple Storage Service (Amazon S3) Hive partition format, then load the partitions by running a command similar to the following:
    Note: Be sure to replace doc_example_table with the name of your table.
MSCK REPAIR TABLE doc_example_table
  • If running the MSCK REPAIR TABLE command doesn't resolve the issue, then drop the table and create a new table with the same definition. Then, run the MSCK REPAIR TABLE command on the new table.
  • Create a separate folder in the Amazon S3 bucket, and then move the data files that you want to query into that folder. Then, create an AWS Glue crawler pointing to that folder instead of the bucket.
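For reference, Hive partition format means that the S3 object keys embed the partition columns as key=value folders. The following sketch (with a hypothetical object key) shows how the partition keys are encoded in the path:

```python
def parse_hive_partitions(s3_key):
    """Extract partition column/value pairs from a Hive-style S3 key,
    for example the 'year=2021/month=05' segments in the path."""
    partitions = {}
    for segment in s3_key.split('/'):
        if '=' in segment:
            key, _, value = segment.partition('=')
            partitions[key] = value
    return partitions

# Hypothetical partitioned object key
print(parse_hive_partitions("doc_example_table/year=2021/month=05/data.parquet"))
# {'year': '2021', 'month': '05'}
```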

HIVE_METASTORE_ERROR: com.facebook.presto.spi.PrestoException: java.io.IOException: Response payload size (11112222 bytes) exceeded maximum allowed payload size (6291556 bytes)

To resolve this error, choose one or more of the following solutions:

  • Upload the Lambda function's response payload as an object into an Amazon S3 bucket. Then, include this object as payload in the Lambda function response. For information on generating a presigned URL for your object, see Sharing an object with a presigned URL.
  • If your table is partitioned, and your use case permits, query only the specific partition.
  • Specify the spill location in Amazon S3 when you create the Lambda function. Responses that are larger than the threshold spill into the specified S3 location.
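The 6 MB decision can be sketched as a simple size check before the Lambda function returns its response (the limit value is taken from the error message above; the spill logic itself is up to your function):

```python
# Lambda's maximum response payload, as reported in the error message
MAX_PAYLOAD_BYTES = 6291556

def should_spill(response_bytes):
    """Return True if the serialized response exceeds the Lambda
    payload limit and must be written to the S3 spill location."""
    return len(response_bytes) > MAX_PAYLOAD_BYTES

print(should_spill(b"x" * 100))          # False
print(should_spill(b"x" * 7_000_000))    # True
```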
