Why does my AWS Glue crawler fail with an internal service exception?


My AWS Glue crawler fails with the error "ERROR : Internal Service Exception".

Resolution

Crawler internal service exceptions can be caused by transient issues. Before you start to troubleshoot, run the crawler again. If you still get an internal service exception, then check for the following common issues.

Data issues

If your AWS Glue crawler is configured to process a large amount of data, then the crawler might fail with an internal service exception. Review the following causes of data issues and their remedies:

If you have a large number of small files, then the crawler might fail with an internal service exception. To avoid this issue, use the S3DistCp tool to combine smaller files. You incur additional Amazon EMR charges when you use S3DistCp. Or, you can set exclude patterns and crawl the files iteratively. Finally, consider turning on sampling to avoid scanning all of the files within a prefix.
If your crawler is nearing the 24 hour timeout value, then split the workflow to prevent memory issues. For more information, see Why is the AWS Glue crawler running for a long time?

Note: The best way to resolve data scale issues is to reduce the amount of data processed.
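As a minimal sketch of the exclude-pattern and sampling options described above, the following helper builds one S3 target entry in the shape that the AWS Glue crawler API accepts (Path, Exclusions, SampleSize). The function name build_s3_target and the example bucket path and pattern are hypothetical. The 1 to 249 range for SampleSize is taken from the AWS Glue API reference and should be confirmed against the current documentation.

```python
def build_s3_target(path, exclude_patterns=(), sample_size=None):
    """Return one S3Targets entry for a crawler definition (hypothetical helper)."""
    target = {"Path": path}
    if exclude_patterns:
        # Glob patterns for objects that the crawler should skip.
        target["Exclusions"] = list(exclude_patterns)
    if sample_size is not None:
        # Number of files to sample in each leaf folder instead of scanning all of them.
        if not 1 <= sample_size <= 249:
            raise ValueError("sample_size must be between 1 and 249")
        target["SampleSize"] = sample_size
    return target


# Example values only; replace them with your own bucket and patterns.
example_target = build_s3_target(
    "s3://awsdoc-example-bucket/logs/",
    exclude_patterns=["**/_temporary/**"],
    sample_size=10,
)
```

With boto3 installed, the resulting dictionary could be passed as Targets={"S3Targets": [example_target]} to the Glue client's update_crawler or create_crawler call.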

Inconsistent Amazon Simple Storage Service (Amazon S3) folder structure

Over time, your AWS Glue crawler encounters your data in a specific format. However, inconsistencies in upstream applications can trigger an internal service exception error.

There might be an inconsistency between a table partition definition in the Data Catalog and a Hive partition structure in Amazon S3. Differences like this can cause issues for your crawler. For example, the crawler might expect objects to be partitioned as "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/dd=xx/[files]". But suppose that some of the objects fall under "s3://awsdoc-example-bucket/yyyy=xxxx/mm=xxx/[files]" instead. When this happens, the crawler fails and the internal service exception error is thrown.
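One way to look for this kind of mismatch before you run the crawler is to compare the Hive-style partition layout of each object key. The sketch below is an illustration only: it assumes that you already have a list of object keys (for example, from an S3 listing), treats the most common layout as the expected one, and uses hypothetical function names.

```python
from collections import Counter


def partition_layout(key):
    """Return the partition column names found in an object key, in order."""
    folders = key.split("/")[:-1]  # drop the file name
    return tuple(f.split("=", 1)[0] for f in folders if "=" in f)


def find_inconsistent_keys(keys):
    """Return keys whose partition layout differs from the most common layout."""
    layouts = Counter(partition_layout(k) for k in keys)
    if not layouts:
        return []
    expected, _count = layouts.most_common(1)[0]
    return [k for k in keys if partition_layout(k) != expected]


# Example keys only; the third key is missing the dd= level.
example_keys = [
    "yyyy=2024/mm=01/dd=01/a.parquet",
    "yyyy=2024/mm=01/dd=02/b.parquet",
    "yyyy=2024/mm=02/c.parquet",
]
outliers = find_inconsistent_keys(example_keys)
```

Keys that this check reports are candidates to move under the expected prefix structure or to exclude from the crawl.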

If you modify a previously crawled data location, then an incremental crawl can fail with an internal service exception error. This happens when one of these conditions is met:

An Amazon S3 location that's known to be empty is updated with data files
Files are removed from an Amazon S3 location that's known to be populated with data files

If you make changes in the Amazon S3 prefix structure, then this exception is triggered.

If you think that there have been changes in your S3 data store, then it's a best practice to delete the current crawler. After deleting the current crawler, create a new crawler on the same S3 target using the Crawl all folders option.
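If you recreate the crawler through the API instead of the console, then the Crawl all folders option corresponds to the CRAWL_EVERYTHING recrawl behavior. The following sketch only builds the request parameters. The helper name and all example values (crawler name, role ARN, database, and path) are hypothetical placeholders.

```python
def full_recrawl_definition(name, role_arn, database, s3_path):
    """Build create_crawler parameters that crawl every folder on each run."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        # CRAWL_EVERYTHING is the API value behind "Crawl all folders".
        "RecrawlPolicy": {"RecrawlBehavior": "CRAWL_EVERYTHING"},
    }


# Placeholder values; replace them with your own resources.
definition = full_recrawl_definition(
    "example-crawler",
    "arn:aws:iam::111122223333:role/ExampleGlueCrawlerRole",
    "example_database",
    "s3://awsdoc-example-bucket/",
)
```

With boto3 installed, you could delete the old crawler with the Glue client's delete_crawler(Name=...) call and then run create_crawler(**definition).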

AWS Key Management Service (AWS KMS) issues

If your data store is configured with AWS KMS encryption, then check the following:

Confirm that your crawler's AWS Identity and Access Management (IAM) role has the necessary permissions to access the AWS KMS key.
Confirm that your AWS KMS key policy is properly delegating permissions.
Confirm that the AWS KMS key still exists, and is in the Available status. If the AWS KMS key is pending deletion, then the internal service exception is triggered.

For more information, see Working with security configurations on the AWS Glue console and Setting up encryption in AWS Glue.
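The key status check can also be scripted. The sketch below inspects a response in the shape that the AWS KMS DescribeKey API returns, where the KeyState field reports values such as Enabled, Disabled, or PendingDeletion. The helper name and the sample responses are hypothetical, and the code treats only Enabled as usable, which is an assumption for this illustration.

```python
def kms_key_problem(describe_key_response):
    """Return a description of the problem, or None if the key is usable."""
    metadata = describe_key_response["KeyMetadata"]
    state = metadata.get("KeyState")
    if state == "Enabled":
        return None
    return f"KMS key {metadata.get('KeyId')} is in state {state}"


# Hand-written sample responses, not real API output.
healthy = {"KeyMetadata": {"KeyId": "example-key-id", "KeyState": "Enabled"}}
doomed = {"KeyMetadata": {"KeyId": "example-key-id", "KeyState": "PendingDeletion"}}
```

With boto3 installed, the response would come from the KMS client's describe_key(KeyId=...) call.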

AWS Glue Data Catalog issues

If your Data Catalog has a large number of columns or nested structures, then the schema size might exceed the 400 KB limit. To address exceptions related to the Data Catalog, check the following:

Be sure that the column name lengths don't exceed 255 characters and don't contain special characters. For more information about column requirements, see Column.
Check for columns that have a length of 0. This can occur if the columns in the source data don't match the data format of the Data Catalog table.
In the schema definition of your table, be sure that the Type value of each of your columns doesn't exceed 131,072 bytes. If this limit is surpassed, your crawler might face an internal service exception. For more information, see Column structure.
Check for malformed data. For example, if the column name doesn't conform to the regular expression pattern "[\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\t]", then the crawler doesn't work.
If your data contains DECIMAL columns in (precision, scale) format, then confirm that the scale value is less than or equal to the precision value.
Your crawler might fail with an "Unable to create table in Catalog" or "Payload size of request exceeded limit" error message. When this happens, monitor the size of the table schema definition. There's no limit on the number of columns that a table in the Data Catalog can have. However, there is a 400 KB limit on the total size of the schema, and a large number of columns can push the total schema size over that limit. Potential workarounds include breaking the schema into multiple tables and removing unnecessary columns. You can also decrease the size of the metadata by shortening column names.
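Several of the checks above can be run against a table's column list before the crawler writes to the Data Catalog. The following sketch assumes that columns are dictionaries with Name and Type keys, as in the AWS Glue table definition. The function name is hypothetical, the empty-name check is one possible reading of "a length of 0", and the schema size is only a rough estimate based on the JSON encoding of the columns, not the exact value that AWS Glue computes.

```python
import json
import re

MAX_NAME_CHARS = 255
MAX_TYPE_BYTES = 131_072
MAX_SCHEMA_BYTES = 400 * 1024
DECIMAL_RE = re.compile(r"decimal\(\s*(\d+)\s*,\s*(\d+)\s*\)", re.IGNORECASE)


def check_columns(columns):
    """Return a list of human-readable problems found in the column list."""
    problems = []
    for col in columns:
        name = col.get("Name", "")
        col_type = col.get("Type", "")
        if len(name) == 0:
            problems.append("column with an empty name")
        elif len(name) > MAX_NAME_CHARS:
            problems.append(f"{name[:20]}...: name exceeds {MAX_NAME_CHARS} characters")
        if len(col_type.encode("utf-8")) > MAX_TYPE_BYTES:
            problems.append(f"{name}: Type exceeds {MAX_TYPE_BYTES} bytes")
        # Also catches decimals nested inside struct or array types.
        for precision, scale in DECIMAL_RE.findall(col_type):
            if int(scale) > int(precision):
                problems.append(f"{name}: decimal scale {scale} exceeds precision {precision}")
    approx_size = len(json.dumps(columns).encode("utf-8"))
    if approx_size > MAX_SCHEMA_BYTES:
        problems.append(f"schema is roughly {approx_size} bytes, over the 400 KB limit")
    return problems
```

For example, check_columns([{"Name": "price", "Type": "decimal(5,7)"}]) reports that the scale is larger than the precision.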

Amazon S3 issues

Be sure that the Amazon S3 path doesn't contain special characters.
Confirm that the IAM role for the crawler has permissions to access the Amazon S3 path. For more information, see Create an IAM role for AWS Glue.
Remove special ASCII characters such as ^, %, and ~ from your data where possible. Or use custom classifiers to classify your data.
Confirm that the S3 objects use the STANDARD storage class. To restore objects to the STANDARD storage class, see Restoring an archived object.
Confirm that the include and exclude patterns in the crawler configuration match the S3 bucket paths.
If you're crawling an encrypted S3 bucket, then confirm that the IAM role for the crawler has the appropriate permissions for the AWS KMS key. For more information, see Working with security configurations on the AWS Glue console and Setting up encryption in AWS Glue.
If you're crawling an encrypted S3 bucket, be sure that the bucket, AWS KMS key, and AWS Glue job are in the same AWS Region.
Check the request rate on the S3 bucket that you're crawling. If it's high, consider creating more prefixes to parallelize reads. For more information, see Best practices design patterns: optimizing Amazon S3 performance.
Be sure that the S3 resource path length is less than 700 characters.
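A few of these Amazon S3 checks are easy to automate. The sketch below is an illustration under stated assumptions: the set of special characters is limited to the three examples named above (^, %, and ~), the object list is assumed to have the shape of the Contents entries in an S3 ListObjectsV2 response, and the function names are hypothetical.

```python
MAX_S3_PATH_CHARS = 700
SUSPECT_CHARS = set("^%~")  # assumption: only the characters named in this article


def check_s3_path(path):
    """Return a list of problems found in an S3 path string."""
    problems = []
    if len(path) >= MAX_S3_PATH_CHARS:
        problems.append(f"path is {len(path)} characters; keep it under {MAX_S3_PATH_CHARS}")
    found = sorted(SUSPECT_CHARS & set(path))
    if found:
        problems.append("path contains special characters: " + " ".join(found))
    return problems


def non_standard_objects(contents):
    """Return the keys of objects that are not in the STANDARD storage class."""
    return [
        obj["Key"]
        for obj in contents
        if obj.get("StorageClass", "STANDARD") != "STANDARD"
    ]
```

With boto3 installed, the contents list could come from the S3 client's list_objects_v2 call for the prefix that the crawler targets.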

Amazon DynamoDB issues

Be sure that the table has enough read capacity units.
Be sure that the IAM role that you use to run the crawler has the dynamodb:Scan permission. For more information, see DynamoDB API permissions: actions, resources, and conditions reference.
Be sure that the table name doesn't include white space characters.

JDBC issues

If you're crawling a JDBC data source that's encrypted with AWS KMS, then check the subnet that you're using for the connection. The subnet's route table must have a route to the AWS KMS endpoint. This route can go through an AWS KMS supported virtual private cloud (VPC) endpoint or a NAT gateway.
Be sure that you're using the correct Include path syntax. For more information, see Defining crawlers.
If you're crawling a JDBC data store, then confirm that the SSL connection is configured correctly. If you're not using an SSL connection, then be sure that Require SSL connection isn't selected when you configure the crawler.
Confirm that the database name in the AWS Glue connection matches the database name in the crawler's Include path. Also, be sure that you enter the Include path correctly. For more information, see Include and exclude patterns.
Be sure that the subnet that you're using is in an Availability Zone that's supported by AWS Glue.
Be sure that the subnet that you're using has enough available private IP addresses.
Confirm that the JDBC data source is supported with the built-in AWS Glue JDBC driver.
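The database name comparison between the connection and the Include path can be sketched as follows. This example assumes a JDBC URL of the form jdbc:protocol://host:port/database, which fits engines such as MySQL and PostgreSQL but not every driver (for example, some Oracle and SQL Server URL formats differ). The function names and example URLs are hypothetical.

```python
from urllib.parse import urlparse


def jdbc_database(jdbc_url):
    """Extract the database name from a jdbc:protocol://host:port/database URL."""
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError("not a JDBC URL")
    parsed = urlparse(jdbc_url[len("jdbc:"):])
    return parsed.path.lstrip("/").split("/")[0]


def include_path_matches(jdbc_url, include_path):
    """Check that the first Include path component equals the connection's database."""
    return include_path.split("/")[0] == jdbc_database(jdbc_url)


# Example values only.
ok = include_path_matches("jdbc:mysql://db.example.com:3306/sales", "sales/%")
mismatch = include_path_matches("jdbc:postgresql://db.example.com:5432/sales", "Sales/public/%")
```

The comparison is case sensitive on purpose, so that a difference such as sales and Sales is reported as a mismatch.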

AWS KMS issues when using a VPC endpoint

If you're using AWS KMS, then the AWS Glue crawler must have access to AWS KMS. To grant access, select the Enable Private DNS Name option when you create the AWS KMS endpoint. Then, add the AWS KMS endpoint to the VPC subnet configuration for the AWS Glue connection. For more information, see Connecting to AWS KMS through a VPC endpoint.
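If you create the AWS KMS endpoint through the API rather than the console, then the Enable Private DNS Name option corresponds to the PrivateDnsEnabled parameter of the Amazon EC2 CreateVpcEndpoint API. The sketch below only builds the request parameters. The helper name and the example Region, VPC, subnet, and security group IDs are hypothetical placeholders.

```python
def kms_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build create_vpc_endpoint parameters for an AWS KMS interface endpoint."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.kms",
        # Use the same subnet that the AWS Glue connection uses.
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        "PrivateDnsEnabled": True,
    }


# Placeholder IDs; replace them with your own resources.
request = kms_endpoint_request(
    "us-east-1",
    "vpc-0123456789abcdef0",
    ["subnet-0123456789abcdef0"],
    ["sg-0123456789abcdef0"],
)
```

With boto3 installed, the dictionary could be passed to the EC2 client as create_vpc_endpoint(**request).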

Related information

Working with crawlers on the AWS Glue console

Encrypting data written by crawlers, jobs, and development endpoints
