How does enhanced VPC routing work in Amazon Redshift?
Last updated: 2020-09-30
I'm trying to enable enhanced VPC routing in Amazon Redshift. How does enhanced VPC routing work and what are some important considerations for using it?
In Amazon Redshift, network traffic created by COPY, UNLOAD, and Amazon Redshift Spectrum flows through a network interface. This network interface is internal to the Amazon Redshift cluster, and is located outside of your Amazon Virtual Private Cloud (Amazon VPC). By default, the network traffic is then routed through the public internet to reach its destination.
However, when you enable Amazon Redshift enhanced VPC routing, Amazon Redshift routes the network traffic through your VPC instead. Enhanced VPC routing uses an available routing option and prioritizes the most specific route for the network traffic. A VPC endpoint takes first priority. If a VPC endpoint is unavailable, Amazon Redshift routes the network traffic through an internet gateway, NAT instance, or NAT gateway.
To determine whether you should enable Amazon Redshift enhanced VPC routing, consider the following use cases:
- Amazon S3 traffic for COPY or UNLOAD through a VPC gateway endpoint instead of passing through the public internet.
- SSH traffic (from running the COPY command through SSH ingestion) from a remote host in a VPC or on-premises server.
- AWS Glue, Amazon Athena, or Apache Hive metastore traffic for Redshift Spectrum through VPC interface endpoints.
- Federated queries to private Amazon Relational Database Service (Amazon RDS) instances located in a peered VPC.
To determine whether Amazon Redshift enhanced VPC routing supports your cluster needs, note the following considerations:
- Allows you to control network traffic.
- Enhances security because it uses a private IP address for network traffic.
- Affects the way Amazon Redshift accesses other resources. Therefore, enhanced VPC routing can sometimes create additional overhead when you configure a security group, network access control list (network ACL), or route table.
Note: If configured incorrectly, enhanced VPC routing can cause your COPY, UNLOAD, or Redshift Spectrum jobs to fail.
- Does not improve cluster performance.
Amazon Redshift's prioritization of routing methods
Important: When enhanced VPC routing is enabled, it does not automatically enable traffic flow through a VPC. A VPC endpoint must be created and specified in the route table of the subnet.
If multiple network pathways exist, Amazon Redshift routes the traffic through the most specific route available.
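The requirement above (a VPC endpoint that is created and associated with the subnet's route table) can be sketched with boto3. This is a minimal sketch, not a definitive procedure: the helper names `gateway_endpoint_request` and `create_s3_gateway_endpoint` are hypothetical, and the VPC ID, Region, and route table IDs are placeholders you would supply yourself.

```python
def gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build keyword arguments for EC2 create_vpc_endpoint (S3 gateway endpoint).

    Hypothetical helper: only assembles the request, so it can be inspected
    before anything is created.
    """
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        # Service name format for the regional Amazon S3 gateway endpoint.
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Associating route tables adds the prefix-list route to each of them.
        "RouteTableIds": list(route_table_ids),
    }


def create_s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Create the endpoint and return its ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.create_vpc_endpoint(
        **gateway_endpoint_request(vpc_id, region, route_table_ids)
    )
    return response["VpcEndpoint"]["VpcEndpointId"]
```

Pass the route table of the subnet that hosts the cluster, because that is the table that needs the endpoint route.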
Example 1: Amazon Simple Storage Service (Amazon S3) gateway endpoint
In the following example, Amazon Redshift routes the network traffic through an Amazon S3 gateway endpoint ("vpce-xxxxx"):
Destination | Target
-------------------------
10.0.0.0/16 | local
0.0.0.0/0 | igw-xxxxx
pl-6fa54006 | vpce-xxxxx
Note: Each subnet in your VPC must be associated with a route table.
Example 2: Internet, NAT gateway, or NAT instance
Here's an example of a subnet route table, where Amazon S3 traffic is routed through the internet gateway ("igw-xxxxx"):
Destination | Target
-------------------------
10.0.0.0/16 | local
0.0.0.0/0 | igw-xxxxx
Example 3: No available route to destination
If no routing method is available and the route table has no route to Amazon S3, the network traffic for COPY and UNLOAD times out. Here's an example of a subnet route table that cannot reach S3:
Destination | Target
------------------------------
10.0.0.0/16 | local
After several retries, a routing method that cannot reach S3 results in the following error message:
"ERROR: S3CurlException: Connection timed out after 50001 milliseconds, CurlError 28, multiCurlError 0, CanRetry 1, UserError 0"
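The three example route tables can be compared with a short longest-prefix-match sketch. This is an illustration of "most specific route wins", not Amazon Redshift's or Amazon VPC's actual implementation. The function name `pick_target` is hypothetical, and the prefix list pl-6fa54006 is replaced by a placeholder CIDR (198.51.100.0/24, a documentation range), because the real list contains the Amazon S3 ranges for the Region.

```python
import ipaddress

# Example 1: the prefix list entry is expanded to a placeholder CIDR (assumption).
EXAMPLE_1 = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-xxxxx"),
    ("198.51.100.0/24", "vpce-xxxxx"),
]
# Example 2: no endpoint route, only the default route to the internet gateway.
EXAMPLE_2 = [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-xxxxx")]
# Example 3: only the local route, so nothing outside the VPC is reachable.
EXAMPLE_3 = [("10.0.0.0/16", "local")]


def pick_target(routes, destination_ip):
    """Return the target of the most specific matching route, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: traffic to this address eventually times out
    # The longest prefix is the most specific route.
    return max(matches, key=lambda match: match[0])[1]


S3_ADDRESS = "198.51.100.10"  # placeholder for an Amazon S3 address
for name, table in (("Example 1", EXAMPLE_1), ("Example 2", EXAMPLE_2), ("Example 3", EXAMPLE_3)):
    print(name, "->", pick_target(table, S3_ADDRESS))
```

Under these assumptions, the endpoint route is chosen in Example 1 because /24 is more specific than the /0 default route, and Example 3 has no matching route at all.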
Checking whether enhanced VPC routing is enabled
You can check whether enhanced VPC routing is enabled in Amazon Redshift using one of the following approaches:
- The Amazon Redshift console: Check the enhanced VPC routing setting in the cluster's properties. For more information, see the "To create a cluster with enhanced VPC routing" section in Enabling enhanced VPC routing.
- AWS Command Line Interface (AWS CLI): Use the describe-clusters command together with grep to verify that EnhancedVpcRouting is set to "True".
- VPC Flow Logs: Use flow logs to capture information about the IP traffic going to and from network interfaces in your VPC.
Here's an example of the AWS CLI command syntax used to verify the enhanced VPC routing setting:
$ aws redshift describe-clusters --cluster-identifier <cluster-id> | grep EnhancedVpcRouting
|| EnhancedVpcRouting | True
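The same check can be scripted. The following is a minimal sketch that assumes boto3 is installed and credentials are configured. The function names are hypothetical; the `Clusters` list and the `EnhancedVpcRouting` key are the fields the describe-clusters API returns for each cluster.

```python
def enhanced_vpc_routing_enabled(response):
    """Read the EnhancedVpcRouting flag from a describe_clusters response dict."""
    clusters = response.get("Clusters", [])
    if not clusters:
        raise ValueError("response contains no clusters")
    # Only the first cluster is inspected; pass a response filtered to one cluster.
    return bool(clusters[0].get("EnhancedVpcRouting", False))


def check_cluster(cluster_id):
    """Call the Redshift API for one cluster (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the parsing helper works without boto3

    client = boto3.client("redshift")
    response = client.describe_clusters(ClusterIdentifier=cluster_id)
    return enhanced_vpc_routing_enabled(response)
```

Keeping the parsing separate from the API call makes it possible to test the logic against a saved response.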
Here's an example of a VPC flow log, which shows the COPY network traffic between a private Amazon Redshift IP address and an S3 bucket:
Account_ID ENI Source_IP Destination_IP Source_Port Destination_Port Protocol Packets Bytes Start_Time End_Time
…….
2 540754XXXXXX eni-01783841dad81XXXX 126.96.36.199 172.31.13.236 443 37516 6 279740 390798072 1589668161 1589668221 ACCEPT OK
2 540754XXXXXX eni-01783841dad81XXXX 172.31.13.236 188.8.131.52 37516 443 6 9206 368276 1589668161 1589668221 ACCEPT OK
…….
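Flow log records like the ones above can be inspected programmatically. The sketch below assumes the default (version 2) flow log format, in which each record starts with a version number and ends with an action and a log status. The names `parse_flow_record` and `is_accepted_https` are hypothetical, and the sample line is the second record from the example.

```python
# Field order of the default (version 2) VPC flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)
INT_FIELDS = ("version", "srcport", "dstport", "protocol", "packets", "bytes", "start", "end")


def parse_flow_record(line):
    """Split one space-separated flow log record into a dict.

    Assumes a record that carries data (log status OK); records with
    NODATA or SKIPDATA use "-" placeholders and are not handled here.
    """
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    return record


def is_accepted_https(record):
    """True for accepted TCP (protocol 6) traffic sent to port 443."""
    return (
        record["protocol"] == 6
        and record["dstport"] == 443
        and record["action"] == "ACCEPT"
    )


SAMPLE = (
    "2 540754XXXXXX eni-01783841dad81XXXX 172.31.13.236 188.8.131.52 "
    "37516 443 6 9206 368276 1589668161 1589668221 ACCEPT OK"
)
record = parse_flow_record(SAMPLE)
print(record["srcaddr"], "->", record["dstaddr"], is_accepted_https(record))
```

Here the source address is the private IP address of the cluster's network interface, which is what you would expect to see when the COPY traffic leaves through the VPC.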
Also note the following when you use enhanced VPC routing:
- If you're using an Amazon S3 VPC endpoint, the S3 bucket should exist in the same Region as the Amazon Redshift cluster.
- Your VPC must have DNS support enabled. If you're using a custom DNS, then be sure that your Amazon S3 and AWS Glue service endpoints can resolve.
- Be sure to configure your AWS Glue interface endpoint so that traffic flows privately from Redshift Spectrum to AWS Glue through a VPC. Otherwise, a NAT gateway or internet gateway is required.
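The first two items in the list above can be turned into a simple preflight check. This is a hedged sketch with hypothetical function names. It assumes you have already fetched the responses of the S3 `get_bucket_location` call and the EC2 `describe_vpc_attribute` call (with the `enableDnsSupport` attribute) and pass them in as dicts.

```python
def bucket_region(location_response):
    """Return the bucket's Region from a get_bucket_location response.

    Amazon S3 reports a LocationConstraint of None for buckets in us-east-1.
    """
    return location_response.get("LocationConstraint") or "us-east-1"


def dns_support_enabled(attribute_response):
    """Read the flag from a describe_vpc_attribute(enableDnsSupport) response."""
    return bool(attribute_response.get("EnableDnsSupport", {}).get("Value", False))


def preflight_issues(cluster_region, location_response, attribute_response):
    """Collect human-readable problems; an empty list means both checks passed."""
    issues = []
    if bucket_region(location_response) != cluster_region:
        issues.append("S3 bucket is in a different Region than the cluster")
    if not dns_support_enabled(attribute_response):
        issues.append("DNS support is not enabled on the VPC")
    return issues
```

The sketch does not cover custom DNS resolution or the AWS Glue interface endpoint, which still need to be verified separately.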
For more information about the requirements and constraints of using enhanced VPC routing, see Enabling enhanced VPC routing.