How do I connect my Amazon SageMaker Studio notebook with an Amazon Redshift cluster?

I want to connect my Amazon SageMaker Studio notebook with an Amazon Redshift cluster.

Resolution

Publicly accessible cluster

If the Redshift cluster is publicly accessible, then you can access the cluster from either of the following:

  • A SageMaker Studio domain launched with public internet access only and no Amazon Virtual Private Cloud (Amazon VPC) access
  • A SageMaker Studio domain launched in an Amazon VPC

If the Studio domain is launched in an Amazon VPC and the Redshift cluster is in a different VPC, then configure a VPC peering connection between the two VPCs so that Studio can access the cluster.

Private cluster

If the Redshift cluster is private, then you can access the cluster only through a SageMaker Studio domain launched in an Amazon VPC. If the cluster is in a different VPC than the Studio domain, then configure a VPC peering connection between the two VPCs so that Studio can access the cluster.
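
The rules for publicly accessible and private clusters can be summarized in a small helper. This is only an illustrative sketch: the function name `access_path` and its return strings are hypothetical, and it assumes that you already know whether the cluster is publicly accessible and which VPC each side uses (for example, from the `PubliclyAccessible` and `VpcId` fields that the Redshift `describe_clusters` API returns).

```python
from typing import Optional


def access_path(publicly_accessible: bool,
                cluster_vpc_id: str,
                studio_vpc_id: Optional[str]) -> str:
    """Return how a Studio domain can reach a Redshift cluster.

    studio_vpc_id is None for a domain launched with public internet only.
    The return values are hypothetical labels used for illustration.
    """
    if studio_vpc_id is None:
        # A public-internet-only domain can reach only a publicly accessible cluster.
        return "public_internet" if publicly_accessible else "unreachable"
    if studio_vpc_id == cluster_vpc_id:
        # Domain and cluster share a VPC, so no peering is needed.
        return "same_vpc"
    # Domain and cluster are in different VPCs.
    return "vpc_peering_required"


# Example with made-up VPC IDs: a private cluster and a Studio domain in another VPC.
print(access_path(False, "vpc-cluster", "vpc-studio"))
```

A result of `unreachable` corresponds to a private cluster combined with a public-internet-only domain, which the guidance above does not support.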

Additional requirements

Be sure that the following requirements are met for both types of clusters:

  • The security group attached to the SageMaker Studio domain allows outbound traffic to ephemeral ports. When a Studio client connects to a Redshift server, a random port from the ephemeral port range (1024-65535) becomes the client's source port.
  • The security group attached to the Redshift cluster allows inbound connections on port 5439 from the security group attached to the SageMaker Studio domain.
  • If you configured custom DNS, verify that the DNS server used by the Studio VPC can resolve the hostname of the Redshift cluster.
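
The requirements above can be checked from a Studio notebook with a short script. The following is a minimal sketch, not an official procedure: the helper names `build_redshift_ingress_rule` and `check_endpoint` are hypothetical, and the security group IDs are placeholders. The first helper only builds the arguments for the EC2 `authorize_security_group_ingress` call. The second helper uses the Python standard library to test whether the cluster hostname resolves and whether a TCP connection on port 5439 succeeds.

```python
import socket


def build_redshift_ingress_rule(redshift_sg_id, studio_sg_id, port=5439):
    """Build keyword arguments for EC2 authorize_security_group_ingress.

    The rule allows TCP traffic on the Redshift port from the Studio
    domain's security group to the cluster's security group.
    """
    return {
        "GroupId": redshift_sg_id,
        "IpPermissions": [{
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "UserIdGroupPairs": [{"GroupId": studio_sg_id}],
        }],
    }


def check_endpoint(host, port=5439, timeout=5.0):
    """Report whether host resolves and accepts TCP connections on port."""
    result = {"resolved": False, "reachable": False}
    try:
        # DNS check: fails if the VPC's DNS server cannot resolve the hostname.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = True
    except socket.gaierror:
        return result
    try:
        # Network check: fails if routing or security groups block the port.
        with socket.create_connection((host, port), timeout=timeout):
            result["reachable"] = True
    except OSError:
        pass
    return result


# Applying the rule requires boto3 and suitable IAM permissions (placeholder IDs):
# import boto3
# rule = build_redshift_ingress_rule("sg-redshift-placeholder", "sg-studio-placeholder")
# boto3.client("ec2").authorize_security_group_ingress(**rule)
```

If `check_endpoint` reports that the hostname resolves but the port is not reachable, then review the security group rules and the VPC peering routes. If the hostname doesn't resolve, then review the DNS configuration of the Studio VPC.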

Related information

Connect to an external data source

Using the Amazon Redshift data API to interact from an Amazon SageMaker Jupyter notebook

Read the Docs documentation for Ingest data with Redshift

AWS OFFICIAL | Updated a year ago