Why do I get a “The provided token is malformed or otherwise invalid” error when I launch an Amazon EMR cluster using Hive and Presto in the AWS China (Beijing) Region?
Last updated: 2022-09-19
I launched an Amazon EMR cluster in the AWS China (Beijing) Region (cn-north-1). I used Presto and Apache Hive to create an external table from an Amazon Simple Storage Service (Amazon S3) bucket. When I query the table using Hive and Presto, I get an error similar to the following:
presto:default> select * from mydata;
Query 20200912_072348_00009_qqx96, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0:03 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20200912_072348_00009_qqx96 failed: The provided token is malformed or otherwise invalid. (Service: Amazon S3; Status Code: 400; Error Code: InvalidToken; Request ID: 811359ED1D9F8250)
In Amazon EMR release versions 5.11.x and earlier, Presto doesn't automatically use the Region that the S3 bucket is in. Use one of the following options to resolve this error:
- Upgrade to Amazon EMR release version 5.12.0 or later.
- To use Amazon EMR release version 5.11.x or earlier, set the hive.s3.pin-client-to-current-region property to true.
Upgrade to Amazon EMR release version 5.12.0 or later
Launch a new cluster and choose Amazon EMR release version 5.12.0 or later. For more information, see About Amazon EMR releases.
Set the hive.s3.pin-client-to-current-region property to true (release version 5.11.x or earlier)
1. On each node, open the hive.properties file and then set the hive.s3.pin-client-to-current-region property to true. Example:
sudo vim /etc/presto/conf/catalog/hive.properties

hive.s3.connect-timeout=2m
hive.s3.max-backoff-time=10m
...
hive.s3.pin-client-to-current-region=true
2. Restart Presto on each node:
sudo restart presto-server
3. To confirm that the new configuration works as expected, query a table using Hive and Presto in the China (Beijing) Region.
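If the cluster has many nodes, editing hive.properties by hand on each one is error-prone. The following is a minimal sketch of how step 1 could be scripted, not an official AWS tool. The helper name set_property is hypothetical. The sketch assumes a plain key=value properties file and rewrites an existing line for the key, or appends one if the key is absent, so running it twice doesn't create duplicate entries.

```shell
#!/bin/sh
# Hypothetical helper (not part of Amazon EMR or Presto): set key=value in a
# properties file, replacing an existing assignment or appending a new one.
set_property() {
  file="$1"; key="$2"; value="$3"
  [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
  tmp="$(mktemp)" || return 1
  # index(...) == 1 matches only lines that start with "key=" literally,
  # so the dots in the key are not treated as regex wildcards.
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back through the original file so its owner and mode are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Example call, run as a user that can write the file (for example, root):
#   set_property /etc/presto/conf/catalog/hive.properties \
#     hive.s3.pin-client-to-current-region true
```

After the property is set, restart Presto on the node as described in step 2 so that the change takes effect.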