My Amazon EMR application fails with an HTTP 404 "Not Found" AmazonS3Exception

Last updated: 2019-11-21

When I run an application on Amazon EMR, the application fails with the HTTP 404 "Not Found" AmazonS3Exception:

java.io.IOException: com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 3FEDF6CC9E4A8102; S3 Extended Request ID: 4oznJE5O44ySDrMMQ4g3F0L1+1ETqD0ANPWc0kYw0278zRX+TovNyu1ceLfcw21jasFxkkPfOuM=), S3 Extended Request ID: 4oznJE5O44ySDrMMQ4g3F0L1+1ETqD0ANPWc0kYw0278zRX+TovNyu1ceLfcw21jasFxkkPfOuM=

Resolution

This error indicates that your application tried to access an Amazon Simple Storage Service (Amazon S3) file or path that doesn't exist. Here are some common causes and solutions:

  • The Amazon S3 path was mistyped or deleted: Before launching your application, confirm that you correctly entered the Amazon S3 path and that the path exists.
  • Other applications or accounts are accessing the same Amazon S3 files: It's possible that the file or path was deleted by another application or account. Before launching your application, check if other applications or accounts are actively accessing the same Amazon S3 path.
  • Amazon S3 eventual consistency returned a stale "Not Found" result: Amazon S3 provides read-after-write consistency for PUTs of new objects in your S3 bucket. However, if you make a HEAD or GET request to the key name (for example, to check whether the object exists) before creating the object, Amazon S3 provides only eventual consistency for later reads of that key. For example, if you check for a key that doesn't exist yet, PUT the object, and then immediately make a HEAD or GET request for that object, you might get a "Not Found" error. To resolve these errors, enable EMRFS consistent view. For more information, see Consistent View.
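
The checks above can be sketched in Python. This is a minimal illustration, not an official AWS tool. The helper names split_s3_uri and s3_path_exists, and the constant CONSISTENT_VIEW_CONFIGURATION, are hypothetical and don't appear in the original article. The client argument is assumed to behave like a boto3 Amazon S3 client, and the configuration block assumes the emrfs-site classification with the fs.s3.consistent property described in the Amazon EMR documentation.

```python
def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    scheme, separator, rest = uri.partition("://")
    if not separator or scheme not in ("s3", "s3a", "s3n"):
        raise ValueError(f"not an Amazon S3 URI: {uri!r}")
    bucket, _, key = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in: {uri!r}")
    return bucket, key


def s3_path_exists(client, uri):
    """Return True if at least one object exists at or under the URI.

    `client` is assumed to expose list_objects_v2 the way
    boto3.client("s3") does. The match is by key prefix, so
    's3://b/in' also matches the key 'input/x'. Add a trailing '/'
    to the URI to check a folder-style path only.
    """
    bucket, prefix = split_s3_uri(uri)
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


# Assumed cluster configuration that turns on EMRFS consistent view.
# It follows the list-of-classifications layout that Amazon EMR accepts
# when a cluster is created.
CONSISTENT_VIEW_CONFIGURATION = [
    {
        "Classification": "emrfs-site",
        "Properties": {"fs.s3.consistent": "true"},
    }
]
```

If boto3 is installed and credentials are configured, you might call s3_path_exists(boto3.client("s3"), "s3://your-bucket/your/input/") before submitting the application, and stop early if it returns False. Because the function lists by prefix instead of requesting the key directly, it also works for folder-style paths that contain many objects.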