AWS Big Data Blog

Integrate Amazon Redshift native IdP federation with Microsoft Azure AD and Power BI

Amazon Redshift accelerates your time to insights with fast, easy, and secure cloud data warehousing at scale. Tens of thousands of customers rely on Amazon Redshift to analyze exabytes of data and run complex analytical queries.

As enterprise customers look to build their data warehouse on Amazon Redshift, they have many integration needs with the business intelligence (BI) tools they’re using. For customers who want to integrate Amazon Redshift with their existing identity provider (IdP) such as Microsoft Azure Active Directory (Azure AD) using BI tools and services such as Power BI Desktop and Power BI service, we have introduced a native IdP for Amazon Redshift to help you implement authentication and authorization for these tools in a seamless way.

Amazon Redshift native IdP simplifies the administration process of managing identities and permission. This feature provides native integration with Microsoft Azure AD, which you can use for authentication and authorization with tools like Power BI. It uses your existing IdP to simplify authentication and managing permissions. It does this by making it possible to share identity metadata to Amazon Redshift from your IdP. In this approach, an external IdP (such as Azure AD) issues an access token, which is passed to Amazon Redshift via a client, and then Amazon Redshift performs the token validation and claim extraction natively.

This post shows a step-by-step implementation of the Amazon Redshift native IdP setup with Azure AD, which demonstrates how to manage users and groups with an organizational directory, and how to federate into Amazon Redshift. You don’t need to create AWS Identity and Access Management (IAM) roles, policies, separate database users, or groups in Amazon Redshift with this setup.

Solution overview

Using an Amazon Redshift native IdP has the following benefits:

  • You can manage users and groups from a centralized IdP
  • Enables your users to be automatically signed in to Amazon Redshift with their Azure AD accounts
  • You can automatically create Amazon Redshift roles with a namespace that represents external groups (such as Azure AD groups)
  • External user group membership is natively mirrored with Amazon Redshift roles and users

The general configuration steps of the Amazon Redshift native IdP approach are as follows:

  1. Register an application in Azure AD and set up groups.
  2. Collect Azure AD information for the Amazon Redshift IdP.
  3. Set up the IdP on Amazon Redshift.
  4. Set up Amazon Redshift permissions to external identities.
  5. Configure the client connection.

The following diagram illustrates the resulting solution.

To get authorized, the Power BI client sends an authentication request to the Azure enterprise application using Azure AD credentials. After verification, Azure sends a JSON web token (OAuth token) to the Power BI application. The Power BI application forwards the connection string with the OAuth token to Amazon Redshift. Amazon Redshift parses and validates the token, and requests group information from Azure AD. Upon reception, Amazon Redshift automatically creates the user and roles, and does the respective mapping.

Prerequisites

You need the following prerequisites to set up this solution:

  • A Microsoft Azure account that has an active subscription. You need to an admin role to set up the application on Azure AD.
  • Power BI Desktop version 2.102.683.0 64-bit and above downloaded and installed. In this example, we have use a Windows environment.
  • The latest version of the Microsoft Enterprise/Standard Gateway installed.
  • An AWS account with an Amazon Redshift cluster. In this post, we connect Power BI Desktop and service with a publicly accessible Amazon Redshift cluster.

Register an application in Azure AD and set up groups

To set up the Azure application and group permission, complete the following steps:

  1. Sign in to the Azure portal with your Microsoft account.
  2. Navigate to the Azure Active Directory application.
  3. Under Manage, choose App registrations and New registration.
  4. For Name, enter an application name (for example, nativeoauthsetup).
  5. Keep the default settings for the rest of the fields.
  6. Choose Register to complete the initial application registration.
  7. On the newly created application Overview page, locate the client ID and tenant ID and note down these IDs in order to register the IdP in Amazon Redshift later.
  8. Under Manage in the navigation pane, choose API permissions.
  9. Choose Add a permission.
  10. Choose Microsoft Graph and then choose Application permissions.
  11. Search for directory and select the Directory.Read.All permission.
  12. Choose Add permissions.
  13. Choose Grant admin consent.
  14. In the popup box, choose Yes to grant the admin consent.

The status of the permission shows Granted for with a green check mark.

  1. Under Manage in the navigation pane, choose Certificates & secrets.
  2. Choose Client secrets and choose New client secret.
  3. Enter a Description, select an expiration for the secret or specify a custom lifetime. We are keeping Microsoft recommended default expiration value of 6 months. Choose Add.
  4. Copy the secret value.

It would only be present one time and after that you cannot read it.

  1. On the Azure AD home page, under Manage in the navigation pane, choose Groups.
  2. Choose New group.
  3. In the New Group section, provide the required information.
  4. Choose No members selected and then search for the members.
  5. Select your members and choose Select. For this example, you can search your username and click select.

You can see the number of members in the Members section.

  1. Choose Create.

Collect Azure AD Information for Amazon Redshift IdP

Before we collect the Azure AD information, we need to identify the access token version from the application which you have created earlier. In the navigation pane, under Manage, choose Manifest section, then view the accessTokenAcceptedVersion parameter: null and 1 indicate v1.0 tokens, and 2 indicates v2.0 tokens.

To configure your IdP in Amazon Redshift, collect the following parameters from Azure AD. If don’t have these parameters, contact your Azure admin.

  • issuer – This is known as <Microsoft_Azure_issuer_value>. If you’re using the v1.0 token, use https://sts.windows.net/<Microsoft_Azure_tenantid_value>/. Currently, Power BI only uses v1.0 token. If you’re using the v2.0 token, use https://login.microsoftonline.com/<Microsoft_Azure_tenantid_value>/v2.0. To find your Microsoft Azure tenant ID, complete the following steps:
    • Sign in to the Azure portal with your Microsoft account.
    • Under Manage, choose App registrations.
    • Choose the Amazon Redshift application you created earlier.
    • Click on the Overview (left panel) page and under Essentials, note down the values.
  • client_id – This is known as <Microsoft_Azure_clientid_value> in the following sections. An example of a client ID is 5ab12345-1234-1a12-123a-11abc1a12ab1). To get your client ID value, locate the Amazon Redshift application you created earlier on the Azure portal; it’s listed in the Essentials section.
  • client_secret – This is known as <Microsoft_Azure_client_secret_value> in the following sections. An example of a client secret value is KiG7Q~FEDnE.VsWS1IIl7LV1R2BtA4qVv2ixB). To create your client secret value, refer to the steps in the previous section.
  • audience – This is known as <Microsoft_Azure_token_audience_value> in the following sections. With Power BI Desktop, you need to set the audience value as https://analysis.windows.net/powerbi/connector/AmazonRedshift.

Set up the IdP on Amazon Redshift

To set up the IdP on Amazon Redshift, complete the following steps:

  1. Log in to Amazon Redshift with a superuser user name and password using query editor v2 or any SQL client.
  2. Run the following SQL:
    CREATE IDENTITY PROVIDER <idp_name> TYPE azure 
    NAMESPACE '<namespace_name>' 
    PARAMETERS '{ 
    "issuer":"<Microsoft_Azure_issuer_value>", 
    "audience":["<Microsoft_Azure_token_audience_value>"],
    "client_id":"<Microsoft_Azure_clientid_value>", 
    "client_secret":"<Microsoft_Azure_client_secret_value>"
    }';

In our example, we use the v1.0 token issuer because as of this writing, Power BI only uses the v1.0 token:

CREATE IDENTITY PROVIDER oauth_standard TYPE azure
NAMESPACE 'aad'
PARAMETERS '{
"issuer":"https://sts.windows.net/e12b1bb1-1234-12ab-abc1-1ab012345a12/",
"audience":["https://analysis.windows.net/powerbi/connector/AmazonRedshift"],
"client_id":"5ab12345-1234-1a12-123a-11abc1a12ab1",
"client_secret":"KiG7Q~FEDnE.VsWS1IIl7LV1R2BtA4qVv2ixB"
}'
  1. To alter the IdP, use the following command (this new set of parameter values completely replaces the current values):
    ALTER IDENTITY PROVIDER <idp_name> PARAMETERS 
    '{
    "issuer":"<Microsoft_Azure_issuer_value>",
    "audience":["<Microsoft_Azure_token_audience_value>"], 
    "client_id":"<Microsoft_Azure_clientid_value>", 
    "client_secret":"<Microsoft_Azure_client_secret_value>"
    }';
  2. To view a single registered IdP in the cluster, use the following code:
    DESC IDENTITY PROVIDER <idp_name>;
  3. To view all registered IdPs in the cluster, use the following code:
    select * from svv_identity_providers;
  4. To drop the IdP, use the following command:
    DROP IDENTITY PROVIDER <idp_name> [CASCADE];

Set up Amazon Redshift permissions to external identities

The users, roles, and role assignments are automatically created in your Amazon Redshift cluster during the first login using your native IdP unless they were manually created earlier.

Create and assign permission to Amazon Redshift roles

In this step, we create a role in the Amazon Redshift cluster based on the groups that you created on the Azure AD portal.

The role name in the Amazon Redshift cluster looks like <namespace>:<azure_ad_group_name>, where the namespace is the one we provided in the IdP creation command and the group name is the one we specified when we were setting up the Azure application. In our example, it’s aad:rsgroup.

Run the following command in the Amazon Redshift cluster:

create role "<namespace_name>:<Azure AD groupname>";

For example:

create role "aad:rsgroup";

To grant permission to the Amazon Redshift role, enter the following command:

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES } [,...]
 | ALL [ PRIVILEGES ] }
ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] }
TO role "<namespace_name>:<Azure AD groupname>";

Then grant relevant permission to the role as per your requirement. For example:

grant select on all tables in schema public to role "aad:rsgroup";

Create and assign permission to an Amazon Redshift user

This step is only required if you want to grant permission to an Amazon Redshift user instead of roles. We create an Amazon Redshift user that maps to a Azure AD user and then grant permission to it. If you don’t want to explicitly assign permission to an Amazon Redshift user, you can skip this step.

To create the user, use the following syntax:

CREATE USER "<namespace_name>:<Azure AD username>" PASSWORD DISABLE;

For example:

CREATE USER "aad:john@azent.org" PASSWORD DISABLE;

We use the following syntax to grant permission to the Amazon Redshift user:

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES } [,...]
 | ALL [ PRIVILEGES ] }
ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] }
TO "<namespace_name>:<Azure AD username>";

For example:

grant select on all tables in schema public to "aad:john@azent.org"

Configure your client connection using an Amazon Redshift native IdP

In this section, we provide instructions to set up your client connection for either Power BI Desktop or the Power BI service.

Connect Power BI Desktop

In this example, we use Power BI Desktop to connect with Amazon Redshift using a native IdP. Use Power BI Desktop version: 2.102.683.0 64-bit and above.

  1. In your Power BI Desktop, choose Get data.
  2. Search for the Amazon Redshift connector, then choose it and choose Connect.
  3. For Server, enter your Amazon Redshift cluster’s endpoint. For example, test-cluster.ct4abcufthff.us-east-1.redshift.amazonaws.com.
  4. For Database, enter your database name. In this example, we use dev.
  5. Choose OK.
  6. Choose Microsoft Account.
  7. Choose Sign in.
  8. Enter your Microsoft Account credentials.

When you’re connected, you can see the message You are currently signed in.

  1. Choose Connect.

Congratulations! You are signed in using the Amazon Redshift native IdP with Power BI Desktop. Now you can browse your data.

After that, you can create your own Power BI report on the desktop version and publish it to your Microsoft account. For this example, we created and published a report named RedshiftOAuthReport, which I refer to later in this post.

Connect Power BI service

Now, let’s connect a Power BI gateway with Amazon Redshift using a native IdP. Before proceeding with below setup, please make sure you have downloaded and installed the latest version of the Microsoft Enterprise/Standard Gateway.

  1. Open the Power BI web application and sign in if necessary.

You can see the RedshiftOAuthReport report that we created earlier.

  1. In the navigation pane, under Datasets, choose the menu icon (three dots) next to the report name and then choose Settings.
  2. Enable Gateway connection on the settings page.
  3. Click on the arrow on right side and select Manually add to gateway.

  4. In the Data Source Settings section, enter the appropriate values:
    1. For Data Source Name, enter a name.
    2. For Data Source Type, choose Amazon Redshift.
    3. For Server, enter your Amazon Redshift cluster’s endpoint.
    4. For Database, enter your database name (for this post, we use dev).
    5. For Authentication Method, choose OAuth2.
  5. Choose Edit credentials.
  6. In the pop-up box, choose Sign in.
  7. Enter your Microsoft account credentials and follow the authentication process.
  8. After the authentication, choose Add on the Data Source Settings page.
  9. Make sure that Gateway connection is enabled. If not, enable it.
  10. Select your gateway from the gateway list.
  11. On the Maps to menu, choose your data source.
  12. Choose Apply.

Congratulations! You have completed the Amazon Redshift native IdP setup with Power BI web service.

Best Practices with Redshift native IdP:

  • Pre-create the Amazon Redshift roles based upon the groups which you have created on the Azure AD portal.
  • Assign permissions to Redshift roles instead of assigning to each individual external user. This will provide smoother end user experience as user will have all the required permission when they login using native IdP.

Troubleshooting

If your connection didn’t work, consider the following:

  • Enable logging in the driver. For instructions, see Configure logging.
  • Make sure to use the latest Amazon Redshift JDBC driver version 2.1.0.4 onwards, which supports Amazon Redshift native IdP authentication.
  • If you’re getting errors while setting up the application on Azure AD, make sure you have admin access.
  • If you can authenticate via the SQL client but get a permission issue or can’t see objects, grant the relevant permission to the role, as detailed earlier in this post.
  • If you get the error “claim value does not match expected value,” make sure you provided the correct parameters during Amazon Redshift IdP registration.
  • Check stl_error or stl_connection_log views on the Amazon Redshift cluster for authentication failures.

Summary

In this post, we covered the step-by-step process of integrating Amazon Redshift with Azure AD and Power BI Desktop and web service using Amazon Redshift native IdP federation. The process consisted of registering a Azure application, creating Azure AD groups, setting up the Amazon Redshift IdP, creating and assigning permission to Amazon Redshift roles, and finally configuring client connections.

For more information about Amazon Redshift native IdP federation, see:

If you have questions or suggestions, please leave a comment.


About the Authors

Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.

Ilesh Garish is a Software Development Engineer at AWS. His role is to develop connectors for Amazon Redshift. Prior to AWS, he built database drivers for the Oracle RDBMS, TigerLogic XDMS, and OpenAccess SDK. He also worked in the database internal technologies at San Francisco Bay Area startups.

Debu-PandaDebu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world.

Sergey Konoplev is a Senior Database Engineer on the amazon Redshift Team at AWS. Sergey has been focusing on Automation and improvement of database and data operations for more than a decade.