I get a “The provided token is malformed or otherwise invalid” error when I launch an Amazon EMR cluster using Hive and Presto in the AWS China (Beijing) Region

Last updated: 2020-10-14

I launched an Amazon EMR cluster in the AWS China (Beijing) Region (cn-north-1). I used Presto and Apache Hive to create an external table from an Amazon Simple Storage Service (Amazon S3) bucket. When I query the table using Hive and Presto, I get an error like this:

presto:default> select * from mydata;
Query 20200912_072348_00009_qqx96, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0:03 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20200912_072348_00009_qqx96 failed: The provided token is malformed or otherwise invalid. (Service: Amazon S3; Status Code: 400; Error Code: InvalidToken; Request ID: 811359ED1D9F8250)

Short description

In older Amazon EMR release versions, Presto doesn't automatically use the Region that the S3 bucket is in. Use one of the following options to resolve this error:

  • Upgrade to Amazon EMR release version 5.12.0 or later.
  • If you want to use Amazon EMR release version 5.11.x or earlier, set the hive.s3.pin-client-to-current-region property to true.
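The choice between the two options depends only on the release version of the cluster. As a rough sketch, that check could be scripted as shown below. The helper name needs_pin_workaround is hypothetical, and the function assumes a release label of the form emr-X.Y.Z and GNU sort (for the -V version sort).

```shell
# Sketch: succeed (exit 0) when a release label is older than emr-5.12.0,
# meaning that the hive.s3.pin-client-to-current-region workaround applies.
# needs_pin_workaround is a hypothetical helper, not part of Amazon EMR.
needs_pin_workaround() {
  local version="${1#emr-}"   # strip the "emr-" prefix, for example 5.11.1
  local lowest
  # sort -V orders version strings numerically; the first line is the older one.
  lowest="$(printf '%s\n%s\n' "$version" "5.12.0" | sort -V | head -n 1)"
  [ "$version" != "5.12.0" ] && [ "$lowest" = "$version" ]
}

# Example:
# if needs_pin_workaround emr-5.11.1; then echo "set the property"; fi
```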

Resolution

Upgrade to Amazon EMR release version 5.12.0 or later

Launch a new cluster and choose Amazon EMR release version 5.12.0 or later. For more information, see About Amazon EMR releases.
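If you launch clusters from the AWS CLI rather than the console, the launch might look like the following sketch. The cluster name, instance type, instance count, and key pair name are placeholders rather than values from this article, and --use-default-roles assumes that the default Amazon EMR roles already exist in your account.

```shell
# Sketch: launch an EMR cluster with Hive and Presto on release 5.12.0
# in the China (Beijing) Region. launch_cluster is a hypothetical wrapper;
# the name, instance type, and instance count below are placeholders.
launch_cluster() {
  local key_name="$1"   # name of an existing EC2 key pair in cn-north-1
  aws emr create-cluster \
    --region cn-north-1 \
    --name "presto-hive-cluster" \
    --release-label emr-5.12.0 \
    --applications Name=Hive Name=Presto \
    --instance-type m4.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --ec2-attributes KeyName="$key_name"
}

# Example:
# launch_cluster my-key-pair
```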

Set the hive.s3.pin-client-to-current-region property to true (release version 5.11.x or earlier)

1.    On each node, open the hive.properties file:

sudo vim /etc/presto/conf/catalog/hive.properties

Then set the hive.s3.pin-client-to-current-region property to true. Example file contents:

hive.s3.connect-timeout=2m
hive.s3.max-backoff-time=10m
...
hive.s3.pin-client-to-current-region=true
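Editing the file by hand on every node is error-prone, so the same change could be scripted. The following is a minimal sketch, not an official procedure: set_property is a hypothetical helper, it assumes GNU sed (for sed -i), and it must run with permission to write the file (for example, from a script started with sudo).

```shell
# Sketch: idempotently set key=value in a Java-style properties file.
# set_property is a hypothetical helper, not part of Presto or Amazon EMR.
set_property() {
  local file="$1" key="$2" value="$3"
  if grep -q "^${key}=" "$file"; then
    # The key is already present: rewrite its value in place.
    sed -i "s|^${key}=.*|${key}=${value}|" "$file"
  else
    # The key is absent: make sure the file ends with a newline, then append.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
      printf '\n' >> "$file"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example (run as root on each node):
# set_property /etc/presto/conf/catalog/hive.properties \
#   hive.s3.pin-client-to-current-region true
```

Because the helper rewrites an existing entry instead of appending a duplicate, it is safe to run more than once on the same node.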

2.    Restart Presto on each node:

sudo restart presto-server

3.    To confirm that the new configuration works as expected, query a table using Hive and Presto in the China (Beijing) Region.
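The confirmation query can also be run non-interactively from the master node. The sketch below assumes that the table lives in the default schema of the hive catalog (as in the error output at the top of this article) and adds a limit clause to keep the result small. verify_table is a hypothetical helper name.

```shell
# Sketch: run the same query through Presto and Hive and stop at the first failure.
# verify_table is a hypothetical helper; the catalog and schema are assumptions.
verify_table() {
  local table="$1"
  # Query through Presto by using the Hive connector.
  presto-cli --catalog hive --schema default \
    --execute "select * from ${table} limit 10" || return 1
  # Query the same table through Hive.
  hive -e "select * from ${table} limit 10" || return 1
}

# Example:
# verify_table mydata
```

If the configuration is correct, both commands return rows instead of the InvalidToken error.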
