How do I resolve the "The provided key element does not match the schema" error when importing DynamoDB tables using Hive on Amazon EMR?

Last updated: 2020-10-28

When I try to import Amazon DynamoDB tables into Amazon EMR using Hive, I get an error message similar to the following: "The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException)."

Resolution

This error usually happens when the table schema is incorrect or the data is corrupt or mismatched. If you still get the error message after ruling out these common causes, check the Hive application logs. The logs are located in the /mnt/var/log/hive directory on the master node of the EMR cluster. Check for entries like the following:
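For example, assuming that you have SSH access to the master node, you can search the Hive logs for the validation exception. The directory below is the EMR default; adjust the path if your cluster is configured differently:

```shell
# On the EMR master node: search all Hive logs for the
# DynamoDB validation exception (path is the EMR default).
sudo grep -r "does not match the schema" /mnt/var/log/hive/
```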

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"countryasin":"LOCATION '${INPUT}';","hts_type":null,"hts_code":null}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:565)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:86)
... 17 more
Caused by: java.lang.RuntimeException: com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: 0FF3KB36M2SJD8E79BUPOUP943VV4KQNSO5AEMVJF66Q9ASUAAJG)

The row that's mentioned in the error message ({"countryasin":"LOCATION '${INPUT}';","hts_type":null,"hts_code":null}) is part of the Hive script. This Hive script is in the same Amazon Simple Storage Service (Amazon S3) location as the input files. Because the script sits in the input path, the import job treats the script file itself as input data and tries to write its contents to the DynamoDB table, which fails validation against the table's key schema. To resolve this problem, move the Hive script to an Amazon S3 location that is separate from the input files.
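For example, if the script and the input files share an S3 prefix, moving the script into its own prefix prevents the job from reading it as input. The bucket, prefixes, and file names below are hypothetical; substitute your own:

```shell
# Hypothetical layout that triggers the error: the Hive script
# sits alongside the input data, so the import job reads it as input.
#   s3://my-bucket/input/data.csv
#   s3://my-bucket/input/import.q    <-- problem
#
# Move the script to a separate prefix, then reference it from
# there when you submit the job.
aws s3 mv s3://my-bucket/input/import.q s3://my-bucket/scripts/import.q
```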