Why does my AWS Glue test connection fail?
Last updated: 2021-07-27
I want to troubleshoot a failed test connection in AWS Glue.
Check for the following common problems.
- Check connectivity to JDBC data stores: AWS Glue creates elastic network interfaces with private IP addresses in the connection's subnet. Because these interfaces don't have public IP addresses, AWS Glue can't reach a data store over the public internet directly.
- If the data store is outside the Amazon Virtual Private Cloud (Amazon VPC), then the subnet's route table must have a route to a NAT gateway in a public subnet. Otherwise, the connection times out.
Note: The data store outside the Amazon VPC might be an on-premises data store or an Amazon Relational Database Service (Amazon RDS) resource with a public hostname.
- If the data store is in the Amazon VPC, then confirm that the connection's security groups and network access control list (network ACL) allow traffic to the data store.
- Check the connection's security groups: One of the security groups associated with the connection must have a self-referencing inbound rule that's open to all TCP ports. Similarly, one of the security groups must also be open to all outbound traffic. You can use a self-referencing rule to restrict outbound traffic to the Amazon VPC. For more information, see Setting up a VPC to connect to JDBC Data Stores.
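The self-referencing inbound rule described above can be checked programmatically. The following is a minimal sketch, not a definitive implementation. It assumes that you already retrieved a security group description in the shape that the Amazon EC2 DescribeSecurityGroups API returns. The function name and the sample group ID are hypothetical.

```python
def has_self_referencing_tcp_rule(security_group: dict) -> bool:
    """Return True if the group allows inbound traffic from itself on all TCP ports.

    security_group is one entry of the "SecurityGroups" list returned by
    the EC2 DescribeSecurityGroups API.
    """
    group_id = security_group["GroupId"]
    for rule in security_group.get("IpPermissions", []):
        protocol = rule.get("IpProtocol")
        # "-1" means all protocols; "tcp" with ports 0-65535 means all TCP ports.
        all_traffic = protocol == "-1"
        all_tcp = (
            protocol == "tcp"
            and rule.get("FromPort") == 0
            and rule.get("ToPort") == 65535
        )
        if not (all_traffic or all_tcp):
            continue
        # A self-referencing rule lists the group's own ID as the traffic source.
        pairs = rule.get("UserIdGroupPairs", [])
        if any(pair.get("GroupId") == group_id for pair in pairs):
            return True
    return False


# Hypothetical sample data for illustration only.
sample_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissions": [
        {
            "IpProtocol": "tcp",
            "FromPort": 0,
            "ToPort": 65535,
            "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}],
        }
    ],
}
print(has_self_referencing_tcp_rule(sample_group))
```

The sketch checks inbound rules only. Outbound rules appear under IpPermissionsEgress in the same response and can be inspected in the same way.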
- Check the number of free IP addresses: The number of free IP addresses in the subnet must be greater than the number of data processing units (DPUs) specified for the job. This allows AWS Glue to create elastic network interfaces in the specified subnet.
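The free IP address comparison above is simple to automate. This sketch assumes that you already have a subnet description in the shape that the Amazon EC2 DescribeSubnets API returns, where AvailableIpAddressCount holds the number of free addresses. The function name and sample values are hypothetical.

```python
def subnet_has_enough_ips(subnet: dict, dpus: int) -> bool:
    """Return True if the subnet has more free IP addresses than the job has DPUs.

    subnet is one entry of the "Subnets" list returned by the
    EC2 DescribeSubnets API.
    """
    return subnet["AvailableIpAddressCount"] > dpus


# Hypothetical sample: a subnet with 8 free addresses and a 10-DPU job.
print(subnet_has_enough_ips({"AvailableIpAddressCount": 8}, dpus=10))
```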
- Confirm that the subnet can access Amazon Simple Storage Service (Amazon S3): Provide an Amazon S3 endpoint or provide a route to a NAT gateway in your subnet's route table. For more information, see Error: Could not find S3 endpoint or NAT Gateway for subnetId in VPC.
- Check if you have an AWS KMS VPC endpoint: If your AWS Glue Data Catalog encrypts connection passwords, be sure that the subnet has a route to AWS KMS. For example, this route can be an AWS KMS VPC interface endpoint. For more information, see Connecting to AWS KMS through a VPC endpoint.
- Check if the AWS Glue connection and the database use different VPCs: Your test connection fails with a timeout error when the following conditions are true:
- The database isn't publicly accessible.
- The AWS Glue job is attached to a connection that uses a different VPC, and the two VPCs aren't peered.
To resolve this issue, create a dedicated AWS Glue VPC and set up VPC peering with your other VPCs as needed. For more information, see Connect to and run ETL jobs across multiple VPCs using a dedicated AWS Glue VPC.
- Check the connectivity to the on-premises data store: If you're testing the AWS Glue connection to an on-premises database, then it's a best practice to connect to an Amazon Elastic Compute Cloud (Amazon EC2) instance in the same VPC, subnet, and security group used for the connection. Then, run the following tests from the Amazon EC2 instance. If the commands fail, check your VPN and the configurations of the VPC, subnet, security group, and network access control lists (network ACLs). Be sure that these configurations, and any firewall in front of the on-premises database, don't block traffic from the VPC to the database. For more information, see How to access and analyze on-premises data stores using AWS Glue.
$ telnet hostname port
$ nc -zv hostname port
$ dig hostname
$ traceroute -AnT -p port IP
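If telnet, nc, or dig aren't installed on the Amazon EC2 instance, the name resolution and TCP port checks can be approximated with the Python standard library. This is a minimal sketch under that assumption. The function names are hypothetical, and the sketch doesn't replace traceroute.

```python
import socket
from typing import List


def resolve_host(hostname: str) -> List[str]:
    """Return the sorted IP addresses that hostname resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(hostname: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port succeeds within timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Call resolve_host with the data store's hostname first. An empty list suggests a DNS problem. Then call can_connect with the hostname and port. A False result after a successful resolution points to routing, security group, network ACL, or firewall settings.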
- Choose the correct IAM role: The AWS Identity and Access Management (IAM) role that you select for the test connection must have a trust relationship with AWS Glue. That is, the role's trust policy must allow the AWS Glue service to assume the role. An easy way to meet this requirement is to choose a service role created for AWS Glue that has the AWSGlueServiceRole policy attached to it.
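To confirm the trust relationship, you can inspect the role's trust policy document. The following sketch assumes that you already have the policy as a parsed JSON object and looks for a statement that allows the glue.amazonaws.com service principal to call sts:AssumeRole. The function name is hypothetical. The check is deliberately simplified: it ignores Condition blocks and wildcard actions.

```python
def glue_can_assume(trust_policy: dict) -> bool:
    """Return True if the trust policy lets the AWS Glue service assume the role."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]  # A single statement may appear without a list.
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # For example, a bare "*" principal.
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        if "glue.amazonaws.com" in services:
            return True
    return False


# A typical trust policy for an AWS Glue service role.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}
print(glue_can_assume(sample_policy))
```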
- Check the connection's IAM role: If the connection password is encrypted with AWS Key Management Service (AWS KMS), then confirm that the connection's IAM role allows the kms:Decrypt action for the key. For more information, see Setting up encryption in AWS Glue.
- Check the connection logs: Logs from test connections are located in Amazon CloudWatch Logs under /aws-glue/testconnection/output. Check the logs for error messages.
- Check the SSL settings: If the data store requires SSL connectivity for the specified user, be sure to select Require SSL connection when you create the connection on the console. Don't select this option if the data store doesn't support SSL.
- Check the JDBC username and password: The user who is accessing the JDBC data store must have sufficient access permissions. For example, AWS Glue crawlers require SELECT permissions. A job that writes to a data store requires INSERT, UPDATE, and DELETE permissions.
- Check the JDBC URL syntax: Syntax requirements vary by database engine. For more information, see Adding an AWS Glue connection and review the examples under JDBC URL.
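A quick way to catch typos is to parse the JDBC URL before you save the connection. The sketch below handles only the common jdbc:engine://host:port/database form, which MySQL, PostgreSQL, and Amazon Redshift URLs use. Other engines, such as Oracle and SQL Server, use different layouts that this sketch rejects. The function name and the sample hostname are hypothetical.

```python
from urllib.parse import urlsplit


def parse_simple_jdbc_url(jdbc_url: str):
    """Split a jdbc:engine://host:port/database URL into its parts.

    Returns (engine, host, port, database). Raises ValueError for URLs
    that don't follow this simple form.
    """
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("A JDBC URL must start with 'jdbc:'")
    parts = urlsplit(jdbc_url[len(prefix):])
    if not parts.scheme or not parts.hostname:
        raise ValueError("Expected the form jdbc:engine://host:port/database")
    # parts.port is None when the URL has no port, and raises ValueError
    # when the port isn't a valid integer.
    return parts.scheme, parts.hostname, parts.port, parts.path.lstrip("/")


# Hypothetical URL for illustration only.
print(parse_simple_jdbc_url("jdbc:mysql://db.example.com:3306/sales"))
```

The extracted host and port are the same values to use when you test name resolution and port reachability from an instance in the connection's subnet.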
- Check the connection type:
- Be sure to choose the correct connection type. When you choose Amazon RDS or Amazon Redshift for Connection type, AWS Glue automatically populates the VPC, subnet, and security group.
- If you need to connect to MySQL, then be aware that the test connection feature works only for MySQL 5.x versions. The built-in AWS Glue JDBC driver doesn't support MySQL version 8. If you test the connection against a MySQL version later than 5.x, then you might get a connection timeout error. However, as a workaround, you can still use your AWS Glue connection to connect to MySQL version 8. Use the connection in an extract, transform, and load (ETL) job and manually provide a compatible driver JAR for MySQL version 8 and later. Then, load this JAR file into your job the same way that you load any JDBC driver in a Spark job. For more information, see Connection types and options for ETL in AWS Glue.
- Rule out DNS problems: To rule out DNS issues, use the data store's public or private IP address in place of the hostname in the JDBC URL for the AWS Glue connection. When you do this, you must clear Require SSL connection because you're no longer using a domain name.
- Check if the driver is incompatible: If the connection fails because of an incompatible driver, provide the correct driver as an extra JAR file in the job properties, along with the failed connection name. (When you specify the connection name as a job property, AWS Glue uses the connection's networking settings, such as the VPC and subnets.) Then, override the default AWS Glue data store drivers by manually creating the Apache Spark dataframe using the JAR file that you provided in the job properties. After creating the dataframe, you can optionally convert it into an AWS Glue DynamicFrame. For more information, see fromDF.
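The driver override described above can be sketched in PySpark as follows. This is an illustration under several assumptions, not a definitive implementation: the job supplies the driver JAR as an extra JAR (for example, through the --extra-jars job parameter), the job has the connection attached for networking, and glue_context is an existing GlueContext. The function names are hypothetical. The driver class com.mysql.cj.jdbc.Driver applies to MySQL Connector/J 8, so substitute the class for your own driver.

```python
def build_jdbc_options(url, table, user, password, driver_class):
    """Build the option dictionary for Spark's JDBC data source."""
    return {
        "url": url,
        "dbtable": table,
        "user": user,
        "password": password,
        # Naming the driver class makes Spark use the JAR that you provided
        # instead of the default AWS Glue driver.
        "driver": driver_class,
    }


def read_as_dynamic_frame(glue_context, options, name="jdbc_source"):
    """Read a table through Spark JDBC and convert it to an AWS Glue DynamicFrame."""
    # Imported here so that the module loads outside an AWS Glue job environment.
    from awsglue.dynamicframe import DynamicFrame

    data_frame = (
        glue_context.spark_session.read.format("jdbc").options(**options).load()
    )
    return DynamicFrame.fromDF(data_frame, glue_context, name)
```

In a job script, you would call build_jdbc_options with your JDBC URL, table name, credentials, and driver class, and then pass the result to read_as_dynamic_frame. Avoid hardcoding the password in the script. Retrieve it at run time from a secure store instead.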
- Check if the JDBC data store is publicly accessible: Connect to the data store using MySQL Workbench and the JDBC URL. Or, launch an Amazon EC2 instance that has SSH access to the same subnet and security groups used for the connection. Then, connect to the instance using SSH and run the following commands to test connectivity.
$ dig hostname
$ nc -zv hostname port