How can I resolve the "Connect timeout on endpoint URL" error on a persistent Amazon EMR JupyterHub notebook that's in a private subnet?

Last updated: 2020-06-12

I configured persistence for an Amazon EMR JupyterHub cluster that's in a private subnet. When the cluster tries to reach Amazon Simple Storage Service (Amazon S3), I see an error like this: "botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: https://s3.amazonaws.com/your-jupyter-backups/jupyter/jupyterhub-user-name."

Short Description

When you configure persistence for a notebook, s3.amazonaws.com is the default endpoint. Because this is a public address, EMR clusters that are in a private subnet can't reach the endpoint. To resolve this problem, configure Jupyter to use the Amazon S3 endpoint that corresponds to the Region that you're using (for example, https://s3-eu-west-1.amazonaws.com).
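As a minimal sketch of how the Region maps to the endpoint, the following Python helper builds a URL in the same form as the example above. The function name regional_s3_endpoint is hypothetical, and the hostname pattern is an assumption taken from this article's example. Endpoint formats can differ between Regions, so confirm the exact value against the AWS service endpoints list.

```python
def regional_s3_endpoint(region: str) -> str:
    """Build a Region-specific S3 endpoint URL in the form this article uses.

    Assumption: the "s3-<region>.amazonaws.com" pattern applies to your Region.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    return f"https://s3-{region}.amazonaws.com"


# Example for the Europe (Ireland) Region used throughout this article.
print(regional_s3_endpoint("eu-west-1"))
```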

Resolution

You can configure the Region and endpoint on a running cluster or when you launch a new cluster.

Configuring the Region and endpoint on a running cluster

Open /etc/jupyter/jupyter_notebook_config.py in a text editor, and then add the Region and the corresponding endpoint to the file. The following example uses the Europe (Ireland) Region. For a list of Regions and their endpoints, see AWS service endpoints.

sudo vim /etc/jupyter/jupyter_notebook_config.py

config.S3ContentsManager.endpoint_url = "https://s3-eu-west-1.amazonaws.com"
config.S3ContentsManager.region_name = "eu-west-1"
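If you prefer to script the edit instead of using an editor, the following is a rough sketch that appends the same two settings to the file only when they aren't already present. The helper name add_s3_settings is hypothetical and isn't part of Amazon EMR or Jupyter. The setting names are copied from the example above.

```python
from pathlib import Path


def add_s3_settings(config_path: str, region: str, endpoint_url: str) -> bool:
    """Append the endpoint and Region settings unless they already exist.

    Returns True if the file was changed, False if nothing was needed.
    """
    path = Path(config_path)
    existing = path.read_text() if path.exists() else ""
    wanted = [
        f'config.S3ContentsManager.endpoint_url = "{endpoint_url}"',
        f'config.S3ContentsManager.region_name = "{region}"',
    ]
    # Only add lines that are not already in the file.
    missing = [line for line in wanted if line not in existing]
    if not missing:
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if not existing or existing.endswith("\n") else "\n"
    with path.open("a") as handle:
        handle.write(prefix + "\n".join(missing) + "\n")
    return True
```

For example, calling add_s3_settings("/etc/jupyter/jupyter_notebook_config.py", "eu-west-1", "https://s3-eu-west-1.amazonaws.com") from a script that runs with sudo would add the two lines shown above. The function skips lines that are already present, so it's safe to run more than once.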

Configuring the Region and endpoint on a new cluster

Add a configuration object similar to the following when you launch the cluster. You must include the escape characters ("\") before the inner double quotes. If you don't, the double quotes aren't transferred to the file and the Python code fails.

[
    {
        "Classification": "jupyter-s3-conf",
        "Properties": {
            "s3.persistence.enabled": "true",
            "s3.persistence.bucket": "my-precious-bucket"
        }
    },
    {
        "Classification": "jupyter-notebook-conf",
        "Properties": {
            "config.S3ContentsManager.endpoint_url": "\"https://s3-eu-west-1.amazonaws.com\"",
            "config.S3ContentsManager.region_name": "\"eu-west-1\""
        }
    }
]
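To avoid escaping the quotes by hand, you can generate the configuration object with a short script. The following sketch is one way to do that, and the function name build_emr_configuration is hypothetical. It first encodes each value as a quoted string, and then serializes the whole object so that the inner double quotes come out escaped, as in the example above.

```python
import json


def build_emr_configuration(bucket: str, region: str, endpoint_url: str) -> str:
    """Return the cluster configuration as a JSON string.

    The inner json.dumps call wraps each value in double quotes so that it
    reaches the Jupyter config file as a quoted Python string. The outer
    json.dumps call then escapes those quotes with backslashes.
    """
    configuration = [
        {
            "Classification": "jupyter-s3-conf",
            "Properties": {
                "s3.persistence.enabled": "true",
                "s3.persistence.bucket": bucket,
            },
        },
        {
            "Classification": "jupyter-notebook-conf",
            "Properties": {
                "config.S3ContentsManager.endpoint_url": json.dumps(endpoint_url),
                "config.S3ContentsManager.region_name": json.dumps(region),
            },
        },
    ]
    return json.dumps(configuration, indent=4)


# Placeholder values from this article; replace them with your own.
print(
    build_emr_configuration(
        "my-precious-bucket", "eu-west-1", "https://s3-eu-west-1.amazonaws.com"
    )
)
```

You can save the printed output to a file and supply it as the configuration when you launch the cluster.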
