How do I troubleshoot the MAX_FAILED_UNIQUE_FETCHES error in Amazon EMR?

Last updated: 2019-11-07

When I run an s3-dist-cp or MapReduce job on an Amazon EMR cluster that has in-transit encryption enabled, my application fails with one of the following errors:

  • java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES
  • java.io.IOException: HTTPS hostname wrong

Short Description

These errors indicate that the reducers can't fetch the map outputs. Here are some common causes of this problem:

  • The Common Name (CN) in the SSL certificate doesn't match the domain name in the DHCP options set for the cluster’s virtual private cloud (VPC).
  • The SSL certificate is not valid or is expired.

Resolution

Check the CN listed in the encryption artifacts

1.    Open the /etc/hadoop/conf/ssl-client.xml file to find the keystore.jks and truststore.jks passwords. You'll need the passwords in steps 2 and 3.

2.    Run the following command to see the CN that's specified in truststore.jks:

[hadoop@ip-172-xx-xx-xxx conf]$  keytool -list -v -keystore /usr/share/aws/emr/security/conf/truststore.jks

Enter keystore password:
Keystore type: jks
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: emr-encryption-trust-store
Creation date: Oct 1, 2019
Entry type: trustedCertEntry

Owner: CN=*.ec2.internal, OU=MyDept, O=MyOrg, L=Seattle, ST=Washington, C=US
Issuer: CN=*.ec2.internal, OU=MyDept, O=MyOrg, L=Seattle, ST=Washington, C=US
Serial number: afeae69ad8e12345
Valid from: Tue Oct 01 22:59:49 UTC 2019 until: Wed Sep 30 22:59:49 UTC 2020
Certificate fingerprints:
XXXXXXXXXXXXXXXXXXXXXXXXXXX

3.    Run the following command to see the CN that's specified in keystore.jks:

[hadoop@ip-172-xx-xx-xxx ~]$ keytool -list -v -keystore /usr/share/aws/emr/security/conf/keystore.jks

Enter keystore password:
Keystore type: jks
Keystore provider: SUN

Your keystore contains 1 entry

Alias name: emr-encryption-key-store
Creation date: Oct 1, 2019
Entry type: PrivateKeyEntry
Certificate chain length: 1
Certificate[1]:

Owner: CN=*.ec2.internal, OU=MyDept, O=MyOrg, L=Seattle, ST=Washington, C=US
Issuer: CN=*.ec2.internal, OU=MyDept, O=MyOrg, L=Seattle, ST=Washington, C=US
Serial number: afeae69ad8e12345
Valid from: Tue Oct 01 22:59:49 UTC 2019 until: Wed Sep 30 22:59:49 UTC 2020
Certificate fingerprints:
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

In these examples, the CN for truststore.jks and keystore.jks is *.ec2.internal.
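
The verbose keytool listing is long, so it can help to print only the CN. Here is a minimal sketch of a shell helper for that. The function name extract_owner_cn is hypothetical, and the helper assumes that CN is the first attribute on the Owner line, as in the examples above.

```shell
# Hypothetical helper: read `keytool -list -v` output on stdin and print
# the CN of each "Owner:" line. Assumes CN is the first attribute listed.
extract_owner_cn() {
  sed -n 's/^[[:space:]]*Owner: CN=\([^,]*\).*/\1/p'
}
```

Piping the keytool command from step 2 or step 3 into extract_owner_cn should leave only the CN, which is *.ec2.internal for the keystores shown above.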

Check the domain name listed in the DHCP options set

1.    Open the Amazon EMR console.

2.    Choose the name of the cluster.

3.    Choose the link for Subnet ID to open the Amazon Virtual Private Cloud (Amazon VPC) console.

4.    On the subnet Description tab, choose the link for the VPC.

5.    On the VPC Description tab, choose the link for the DHCP options set.

6.    In the Options column, note the domain name. If the options look like this, you're using a custom DHCP options set:

domain-name = example.com; domain-name-servers = 10.X.X.X,AmazonProvidedDNS;

If the options look like this, you're using the default DHCP options set:

domain-name = ec2.internal; domain-name-servers = AmazonProvidedDNS;

Note: If you're using a Region other than us-east-1, such as us-west-2, the output shows us-west-2.compute.internal instead of ec2.internal.
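
The wildcard CN that the certificate needs follows directly from the domain-name option. As a rough sketch, the two helpers below extract the domain name from an options string and build the matching CN. Both function names are hypothetical, and the input is assumed to be in the format shown above.

```shell
# Hypothetical helper: pull the domain-name value out of a DHCP options
# string such as "domain-name = ec2.internal; domain-name-servers = ...;".
dhcp_domain_name() {
  printf '%s\n' "$1" | sed -n 's/.*domain-name *= *\([^; ]*\).*/\1/p'
}

# Hypothetical helper: the wildcard CN expected for a given domain name.
expected_cn() {
  printf '*.%s\n' "$1"
}
```

For the custom options set shown above, dhcp_domain_name returns example.com and expected_cn example.com returns *.example.com, which is the CN that the certificate should carry.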

If you're using the default DHCP options set

Configure the SSL certificate (PEM file) as a wildcard certificate that covers the domain of the Amazon VPC in which your cluster instances reside. For example, in us-east-1 the CN is *.ec2.internal. For more information, see Using PEM Certificates.

If you're using a custom DHCP options set

Launch a new EMR cluster with a new SSL certificate. The CN in the new certificate must be a wildcard that matches the custom domain specified in the DHCP options set (for example, *.example.com). Here's an example of how to use OpenSSL to create a self-signed wildcard certificate for example.com:

Note: Self-signed certificates aren't trusted by browsers and shouldn't be used in production environments.

$ openssl req -x509 -newkey rsa:2048 -keyout privateKey.pem -out certificateChain.pem -days 365 -nodes -subj '/C=US/ST=Washington/L=Seattle/O=MyOrg/OU=MyDept/CN=*.example.com'
$ cp certificateChain.pem trustedCertificates.pem
$ zip -r -X my-certs.zip certificateChain.pem privateKey.pem trustedCertificates.pem

For more information, see Using PEM Certificates.
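
Omitting the dot after the asterisk in the CN is an easy mistake to make when you adapt the command to your own domain. One way to avoid it is to build the subject string from the domain name. This is only a sketch: the function name wildcard_subject is hypothetical, and the organization fields are the placeholder values from the example.

```shell
# Hypothetical helper: build the -subj value for a wildcard certificate.
# The organization fields are the placeholder values from the example.
wildcard_subject() {
  printf '/C=US/ST=Washington/L=Seattle/O=MyOrg/OU=MyDept/CN=*.%s\n' "$1"
}
```

You can then pass "$(wildcard_subject example.com)" as the -subj argument of the openssl req command above.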

Verify the CN in the certificate

Run the following command to confirm that the CN matches the domain name in the DHCP options set:

openssl x509 -in certificateChain.pem -text -noout

Example output:

Subject: O = XX, CN = *.ec2.internal

In this example, the CN is *.ec2.internal, which matches the ec2.internal domain name.
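
The comparison can also be scripted. The following is a minimal sketch, not a definitive check: the function names subject_cn and cn_matches_domain are hypothetical, and the parsing assumes a subject line in one of the two layouts that OpenSSL commonly prints ("CN = value" or "/CN=value").

```shell
# Hypothetical helper: extract the CN from an OpenSSL subject line on stdin.
# Handles both the "CN = *.ec2.internal" and "/CN=*.ec2.internal" layouts.
subject_cn() {
  sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}

# Hypothetical helper: succeed (exit status 0) only if the CN ($1) is
# exactly "*." followed by the DHCP domain name ($2).
cn_matches_domain() {
  [ "$1" = "*.$2" ]
}
```

For example, openssl x509 -in certificateChain.pem -noout -subject | subject_cn prints the CN on its own, and cn_matches_domain '*.ec2.internal' 'ec2.internal' succeeds, while a CN written without the dot (*ec2.internal) does not match.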
