Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 2

In the first post of this series, Federating access to your Amazon Redshift cluster with Active Directory: Part 1, you set up Microsoft Active Directory Federation Services (AD FS) and Security Assertion Markup Language (SAML) based authentication and tested the SAML federation using a web browser.

In Part 2, you learn to set up an Amazon Redshift cluster and use federated authentication with AD FS to connect from a JDBC SQL client tool. You follow a detailed step-by-step process of using Amazon Redshift database authentication to simplify the credential management of database users and reuse what you may already have on premises. You can also integrate Amazon Redshift group-level privileges with federation.

You can use your corporate Active Directory (AD) and the SAML 2.0 AD FS identity provider to enable your users to easily access Amazon Redshift clusters using corporate user names without managing database users and passwords.

Overview of solution

This post walks you through setting up an Amazon Redshift cluster and federating database user authentication with AWS Identity and Access Management (IAM) and Amazon Redshift. You use the user you set up in your AD in Part 1 (Bob) to authenticate using AD FS and control access to database objects based on the group the user is assigned to.

You also learn how to set up SQL Client (SQL Workbench/J) and an Amazon Redshift JDBC driver to connect to a database cluster using the database user authentication method and the AD FS federated authentication method.

The following diagram illustrates the architecture of this solution.

The workflow contains the following steps:

The client (SQL Workbench/J) makes a request to the Identity Provider (IdP).
The IdP authenticates the user and sends back a SAML assertion.
The client calls an AWS SAML endpoint to AssumeRoleWithSAML.
AWS authenticates and returns temporary security credentials.
The client uses temporary credentials to connect to the Amazon Redshift cluster endpoint.

Walkthrough

The walkthrough includes the following steps:

Set up your Amazon Redshift cluster.
Download, set up, configure, and test different connection methods to Amazon Redshift using SQL Workbench/J and an Amazon Redshift JDBC driver.
Configure your Amazon Redshift connection profile with SQL Workbench/J.

Prerequisites

For this walkthrough, you should meet the following prerequisites:

Complete all the steps from Federating access to your Amazon Redshift cluster with Active Directory: Part 1:
- Set up domain controller and AD FS
- Configure federation on AWS with the AD FS
- Configure AWS as the relying party with AD FS
- Test AWS SAML federation using your web browser
- Copy the username and passwords for the AD users to use in this post
- Copy the ARN for the two SAML roles you created to use in this post
- Copy the AD domain name to use in this post
Have a basic understanding of SQL with any database and experience using a SQL client, such as SQL Workbench/J
Check with your AWS administrator to ensure you have access to the AWS Management Console with permissions to use Amazon Redshift and IAM

Setting up your Amazon Redshift cluster

To create your Amazon Redshift cluster, complete the following steps:

On the console, open Amazon Redshift.
Choose Create cluster.

For Node type¸ choose dc2.large.
For Nodes, enter 2.

For Cluster identifier, enter adfs-redshift-cluster.
For Database port, enter 5439.
Enter a master user name and password of your choice.

Keep everything else at their defaults and choose Create cluster.

Your cluster status initially shows the various Modifying stages.

When the cluster creation is complete, its status shows as Available.

Choose the cluster.
On the Properties tab, review the endpoint information and database configuration.

Make sure that for this walkthrough, Publicly accessible is set to Yes.

For more information about VPC accessibility, see Managing clusters in a VPC.

For production setup of the Amazon Redshift cluster, you shouldn’t set the cluster to be publicly accessible; instead you make it private. For more information, see How can I access a private Amazon Redshift cluster from my local machine?

Configuring your cluster security group

You can create a new security group for your cluster or use the default security group (located in the Network and security section on the Properties tab).

For the security group, choose Edit inbound settings.
Add a rule for Amazon Redshift for the IP range on your computer and the public IP for the Windows 2016 domain controller, which you set up in Part 1.

Choose Save.

Configuring and testing connection methods using SQL Workbench/J and Amazon Redshift JDBC driver

Before you get started, download the latest Amazon Redshift JDBC driver with AWS SDK. You use this for JDBC driver-based authentication in SQL Workbench/J.

Download and install SQL Workbench/J. For instructions, see Connect to your cluster by using SQL Workbench/J.
Open SQL Workbench/J and choose Manage Drivers.
Choose Amazon Redshift.
For Library, browse to the JDBC driver you downloaded.

Press Ok.

Configuring your Amazon Redshift connection profile with SQL Workbench/J

To set up federated access, you take a two-step approach: connecting with the superuser and connecting using federated authentication.

Connecting using the superuser

For this walkthrough, you create database objects, groups, and users and assign proper privileges to the groups on the database objects they are allowed to access.

Choose New Connection.
Construct a JDBC connection URL for the database standard user:

jdbc:redshift://<<redshiftclusterendpoint>>.redshift.amazonaws.com:5439/dev?ssl=true&UID=username&PWD=password

You can find the value for redshiftclusterendpoint on the Amazon Redshift console, under Properties, Endpoint. The username and password are the values you provided when you created the cluster.

Choose Test.

You should receive a successful connection message, as in the following screenshot.

Connect as awsuser (a superuser).

From SQL Workbench/J, enter the following commands to set up the following environment:

Create two database groups:

CREATE GROUP sales;
CREATE GROUP marketing;

Create two schemas:

CREATE SCHEMA sales;
CREATE SCHEMA marketing;

Create two tables in each schema:

CREATE TABLE IF NOT EXISTS marketing.employee
(
	n_empkey INTEGER   
	,n_name CHAR(25)   
	,n_regionkey INTEGER   
	,n_comment VARCHAR(152)   
)
DISTSTYLE AUTO
 SORTKEY (n_empkey);

CREATE TABLE IF NOT EXISTS sales.employee_sales
(
	n_empkey INTEGER   
	,n_name CHAR(25)   
	,n_regionkey INTEGER   
	,n_comment VARCHAR(152)   
)
DISTSTYLE AUTO
 SORTKEY (n_empkey);

Insert sample data into the two tables:

INSERT INTO marketing.employee
VALUES(1, 'Bob', 0, 'Marketing');

INSERT INTO sales.employee_sales
VALUES(1, 'John', 0, 'Sales');

Validate data is available in the tables:

Select * from marketing.employee;

Select * from sales.employee_sales;

You can now set up appropriate privileges for the sales and marketing groups. Groups are collections of users who are all granted privileges associated with the group. You can use groups to assign privileges by job function. For example, you can create different groups for sales, administration, and support and give the users in each group the appropriate access to the data they require for their work. You can grant or revoke privileges at the group level, and those changes apply to all members of the group, except for superusers.

Enter the following SQL queries to grant access to all tables in the sales schema to the sales group and access to all tables in the marketing schema to the marketing group:

ALTER DEFAULT PRIVILEGES IN SCHEMA sales
GRANT SELECT on TABLES to GROUP sales;
GRANT USAGE on SCHEMA sales to GROUP sales;
GRANT SELECT on ALL TABLES in SCHEMA sales to GROUP sales;

ALTER DEFAULT PRIVILEGES IN SCHEMA marketing
GRANT SELECT on TABLES to GROUP marketing;
GRANT USAGE on SCHEMA marketing to GROUP marketing;
GRANT SELECT on ALL TABLES in SCHEMA marketing to GROUP marketing;

Connect using federated authentication

You can pass the URL information in the connection profile two different ways:

Specify the Amazon Redshift cluster endpoint with the port (for this use case, the driver doesn’t need to run the DescribeClusters API because everything is already available):

jdbc:redshift:iam//<<your redshift cluster endpoint>>/dev

Specify the cluster ID and Region (for this use case, the driver calls the DescribeClusters API and Get-Cluster-Credentials as additional steps):

jdbc:redshift:iam://<<cluster id:region>>/dev

The rest of the parameters are specified in the Extended Properties configuration.

For this walkthrough, you use the second method.

Create a new connection profile in SQL Workbench/J.
Choose Extended Properties.

Provide the following values:
idp_port – 443
plugin_name – com.amazon.redshift.plugin.AdfsCredentialsProvider
ssl_insecure – true
idp_host – The host name for the AD domain, for example, windows2016.adfsredshift.com
preferred_role – The ARN of the SAML role you created in Part 1, for example, arn:aws:iam::<AWSAccountID>:role/ADFZ-Production (replace <AWSAccountID> with your AWS Account)
AutoCreate – true

Setting AutoCreate= True creates the AD user Bob in the Amazon Redshift database the first time without password settings. The authentication is temporary using the SAML configuration you completed in Part 1. You can also set the AutoCreate property with IdP configuration. For more information, see JDBC and ODBC Options for Creating Database User Credentials.

Choose Test.

You should see a successful connection, and can start using SQL Workbench/J to query Amazon Redshift cluster as user Bob. The message indicates Connection to endpoint successful. For instructions on addressing connection errors, see Troubleshooting connection issues in Amazon Redshift.

With this connection profile, run the following query to test SAML authentication and query privilege control:

select * from stv_sessions;

The following screenshot shows the output.

The output shows that the user bob@adfsredshift.com was authenticated using AD FS. The user also joined the marketing group as enforced by the AD FS DbGroups claim rule and the policy associated with the ADFZ-Production role, which the user assumes during this session.

To run the query against the marketing schema, enter the following code:

select * from marketing.employee;

The following screenshot shows the output.

The output shows that AD user Bob is part of the AD group RSDB-marketing, which was mapped to the DB group marketing. This database group had select access to the schema marketing and all tables in that schema. Therefore, the user could successfully query the table through SQL Workbench/J.

To run a query against the sales schema, enter the following code:

select * from sales.employee_sales;

The following screenshot shows the output.

The output shows that Bob is only part of the AD group RSDB-marketing. Due to the way the claim rule is set up, Bob doesn’t have access to the database group sales, and therefore the query returns with a permission denied error.

Cleaning up

To avoid incurring future charges, you can delete the resources by deleting the AWS CloudFormation stack. This cleans up all the resources from your AWS account that you set up in Part 1.

To delete the Amazon Redshift cluster you created, select the cluster and choose Delete from the Actions drop-down menu.

You’re prompted to take a final snapshot if you require the cluster for future testing or setup. Storing the snapshot incurs additional charges.

Conclusion

In the first of this two-part series, you simulated an on-premises AD and AD FS setup using a Windows 2016 Amazon Elastic Compute Cloud (Amazon EC2) instance, and tested authentication using a browser.

In this post, you set up an Amazon Redshift cluster and SQL Workbench/J client with an Amazon Redshift JDBC driver. You connected to the database cluster using a database user with administrator privileges to set up a test database environment, and used a federated user authentication with AD FS. In addition, you validated the configuration by running several queries to see how to control access to Amazon Redshift database objects by using groups and assigning users to specific groups using AD FS seamlessly.

About the Authors

Rajesh Francis is a Data Warehouse Specialist Solutions Architect at AWS. He specializes in Amazon Redshift and works with customers to build scalable Analytic solutions.

Vaibhav Agrawal is an Analytics Specialist Solutions Architect at AWS. Throughout his career, he has focused on helping customers design and build well-architected analytics and decision support platforms.

AWS Big Data Blog

Federate access to your Amazon Redshift cluster with Active Directory Federation Services (AD FS): Part 2

Overview of solution

Walkthrough

Prerequisites

Setting up your Amazon Redshift cluster

Configuring your cluster security group

Configuring and testing connection methods using SQL Workbench/J and Amazon Redshift JDBC driver

Configuring your Amazon Redshift connection profile with SQL Workbench/J

Connecting using the superuser

Connect using federated authentication

Cleaning up

Conclusion

About the Authors

Resources

Follow