I get a “The provided token is malformed or otherwise invalid” error when launching an Amazon EMR cluster using Hive and Presto in the AWS China (Beijing) Region

Last updated: 2019-04-22

I launched an Amazon EMR cluster in the AWS China (Beijing) Region (cn-north-1). I used Presto and Apache Hive to create an external table from an Amazon Simple Storage Service (Amazon S3) bucket. When I query the table using Hive and Presto, I get an error message like this:

presto:default> select * from mydata;
Query 20160712_072348_00009_qqx96, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0:03 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20160712_072348_00009_qqx96 failed: The provided token is malformed or otherwise invalid. (Service: Amazon S3; Status Code: 400; Error Code: InvalidToken; Request ID: 841753ED1D9E8250)

Short Description

This error occurs because Presto doesn't automatically send Amazon S3 requests to the Region that the S3 bucket is in. To resolve this error, set the hive.s3.pin-client-to-current-region property to true, either on a running cluster or when you launch a new cluster. This property pins Presto's S3 client to the Region that the cluster is running in.

Resolution

To resolve the error on a running cluster:

1.    On each node, open the hive.properties file in a text editor:

sudo vim /etc/presto/conf/catalog/hive.properties

Then set the hive.s3.pin-client-to-current-region property to true. Example file contents:

hive.s3.connect-timeout=2m
hive.s3.max-backoff-time=10m
...
hive.s3.pin-client-to-current-region=true

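2.    Restart Presto on each node. The following command applies to Amazon EMR release versions that use upstart. On release versions based on Amazon Linux 2 (Amazon EMR 5.30.0 and later, and 6.x), use sudo systemctl restart presto-server instead: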

sudo restart presto-server
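
On clusters with many nodes, the edit in step 1 can be scripted. The following is a minimal sketch, not an official AWS tool. The helper name set_property is hypothetical, and the sketch assumes that bash, awk, and mktemp are available on the node and that the caller has permission to write to the properties file.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: set key=value in a Java-style properties file.
# An existing entry for the key is replaced; otherwise the entry is appended.
set_property() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp)"
  awk -F= -v k="$key" -v v="$value" '
    # Replace the first entry for the key and drop any repeats.
    $1 == k { if (!found) print k "=" v; found = 1; next }
    { print }
    # Append the entry when the key was not present.
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp"
  # Overwrite in place so that the file keeps its owner and mode.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

From a root shell on a node, the helper might be called as set_property /etc/presto/conf/catalog/hive.properties hive.s3.pin-client-to-current-region true, followed by the restart command from step 2. Running it twice leaves a single entry for the property.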

To resolve the error when launching a new cluster, use the AWS Management Console or the AWS Command Line Interface (AWS CLI).

AWS Management Console:

1.    Open the Amazon EMR console.

2.    Choose Create cluster, and then choose Go to advanced options.

3.    On the Software Configuration page, in the Edit software settings section, choose Enter configuration.

4.    Enter the following configuration in the configuration box:

classification=presto-connector-hive,properties=[hive.s3.pin-client-to-current-region=true]

5.    Finish creating the cluster.

AWS CLI:

Use the create-cluster command. Include the following JSON text in the JSON file that you specify for the --configurations parameter. For more information, see Supplying a Configuration Using the AWS CLI when Creating a Cluster.

[
    {
        "Classification":"presto-connector-hive",
        "Properties":{"hive.s3.pin-client-to-current-region":"true"}
    }
]
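
As a rough sketch of how the JSON above fits into a create-cluster call, the following script writes the configuration to a local file and wraps the AWS CLI command in a function. The file name, function name, cluster name, release label, instance type, and instance count are all placeholder assumptions, not values from this article. Adjust them for your account before calling the function.

```shell
#!/usr/bin/env bash
set -eu

# Assumed local file name for the configuration JSON.
CONFIG_FILE="${CONFIG_FILE:-configurations.json}"

# Write the presto-connector-hive classification shown above.
cat > "$CONFIG_FILE" <<'EOF'
[
    {
        "Classification":"presto-connector-hive",
        "Properties":{"hive.s3.pin-client-to-current-region":"true"}
    }
]
EOF

# Hypothetical wrapper: launches a cluster with Hive and Presto in cn-north-1.
# All values except --configurations and --region are placeholders.
launch_cluster() {
  aws emr create-cluster \
    --name "presto-hive-example" \
    --release-label emr-5.23.0 \
    --applications Name=Hive Name=Presto \
    --instance-type m4.large \
    --instance-count 3 \
    --use-default-roles \
    --region cn-north-1 \
    --configurations "file://${CONFIG_FILE}"
}
```

The script only defines launch_cluster. Nothing is launched until you call the function from a shell that has the AWS CLI installed and credentials for the China (Beijing) Region.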

You should now be able to query a table using Hive and Presto in the China (Beijing) Region.

