Fully Managed Data Access Governance in Amazon Redshift Using Privacera

By Don Bosco Durai, Co-Founder and CTO – Privacera
By Lovelesh Chawla, Director of Solutions Engineering – Privacera
By Ayan Ray, Sr. Partner Solutions Architect – AWS


Data is a strategic asset for every organization, and companies are using data to drive critical business decisions, such as when to launch new products, how to introduce new revenue streams, and how to earn customers’ trust.

Companies are looking to extract more value from their data but often struggle to capture, store, and analyze all of the data they generate. This data can be spread across various data assets within an organization, making it difficult to get insights and drive business decisions.

It’s essential for teams to implement data management around their data assets. Data governance is the combination of people, process, and technology used to manage the availability, usability, integrity, and security of enterprise system data. Effective data governance ensures data is consistent and trustworthy without being misused.

Data governance includes a broad set of capabilities, and the right solution often depends on customer requirements as well as which Amazon Web Services (AWS) and non-AWS services an organization already has in place.

Privacera is an AWS Data and Analytics Competency Partner and AWS Marketplace Seller that is a leading provider of unified data access governance solutions. It enables customers to deliver responsible data-powered performance from their ever-expanding data landscape.

PrivaceraCloud provides a unified and holistic way to manage, define, and enforce policies across storage, compute engines, and consumption methods. It’s built on the core attribute-based access control (ABAC) policy model of Apache Ranger, and applies that model to data lakes, relational databases, streaming systems, and more.

Privacera integrates with the AWS Glue metastore; Amazon EMR processing engines such as Hive, Spark, and Trino; Amazon Athena; and Amazon Redshift.

In this post, we will discuss how Privacera enables data access governance on Amazon Redshift, including fine-grained access control policies for individual users, groups, and roles.

About Amazon Redshift

Amazon Redshift is a cloud data warehouse and helps tens of thousands of customers manage analytics at scale. With Redshift, you can easily take a modern data architecture approach by analyzing all of your data across your data warehouse, Amazon Simple Storage Service (Amazon S3) data lake, and operational databases with consistent security and governance policies.

Amazon Redshift also provides fast performance at any scale and delivers up to 3x better price performance than other cloud data warehouses.

Privacera Solution Overview

For Amazon Redshift, Privacera relies on the PolicySync model and translates Privacera Ranger fine-grained access policy directives into the native permission model of Redshift, while keeping the target in sync with current policies.

A PolicySync connector is an application Privacera runs in a Kubernetes container that pulls policies from Apache Ranger’s policy store. It monitors the environment for a wide variety of events that can change how those policies translate into permissions on the system under management.

The connector continually collects the current set of policies from Privacera, the current set of resources being managed, and the current users, groups, and roles. Because the policies are translated into native Redshift permissions, the performance of user activity such as queries is substantially identical to what it would be if the system had been manually configured in line with the policies.
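To make the sync idea concrete, here is a minimal sketch of one reconciliation pass: compare the grants the policies call for with the grants that currently exist, then emit the statements that close the gap. This is an illustration only, not Privacera's implementation; the `Grant` type, the `reconcile` function, and the lowercase user names are all hypothetical.

```python
# Hypothetical sketch of a policy-to-grant reconciliation pass.
# Not Privacera's code; every name here is illustrative.
from typing import NamedTuple


class Grant(NamedTuple):
    principal: str  # database user name
    privilege: str  # e.g. "SELECT"
    obj: str        # e.g. "sales.sales_data_secure"


def reconcile(desired: set[Grant], current: set[Grant]) -> list[str]:
    """Return the SQL needed to move the `current` grants to `desired`."""
    statements = []
    # Revoke first so stale access is removed before new access is added.
    for g in sorted(current - desired):
        statements.append(f"REVOKE {g.privilege} ON {g.obj} FROM {g.principal};")
    for g in sorted(desired - current):
        statements.append(f"GRANT {g.privilege} ON {g.obj} TO {g.principal};")
    return statements


# What the policies call for (assumed for this example).
desired = {
    Grant("emily", "SELECT", "sales.sales_data_secure"),
    Grant("nick", "SELECT", "sales.sales_data_secure"),
}
# What exists in the database right now (assumed for this example).
current = {
    Grant("nick", "SELECT", "sales.sales_data_secure"),
    Grant("nick", "SELECT", "sales.sales_data"),
}

plan = reconcile(desired, current)
for statement in plan:
    print(statement)
```

Running the pass repeatedly is harmless: once the database matches the policies, both set differences are empty and no statements are produced.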

The following diagram shows a logical architecture of Amazon Redshift integration with PrivaceraCloud.

Figure 1 – Amazon Redshift integration with PrivaceraCloud architecture.

Prerequisites

Before getting started, you must complete the following prerequisites:

Set up PrivaceraCloud user account: Follow the documentation to set up a PrivaceraCloud user account.
Set up Amazon Redshift cluster: Follow the documentation to set up an Amazon Redshift cluster. Once the cluster has been configured, create a database called sales with a schema called sales and a table called sales_data using the following columns: name, email, ssn, us_phone, address, account_id, zipcode, country. Populate it with sample data.
Allow access for PrivaceraCloud: Follow the documentation to set up rules to allow PrivaceraCloud access to the Amazon Redshift cluster.
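As a rough sketch of the table setup in the second prerequisite, the helper below builds the schema and table statements. The column names come from the list above; the `VARCHAR(256)` type for every column and the `build_setup_sql` helper itself are assumptions made only for illustration. The `sales` database would be created first (for example, with `CREATE DATABASE sales;`) and these statements run while connected to it.

```python
# Column names are taken from the prerequisites; the column type is an
# assumption, since the post does not specify data types.
COLUMNS = ["name", "email", "ssn", "us_phone", "address",
           "account_id", "zipcode", "country"]


def build_setup_sql(schema: str = "sales", table: str = "sales_data") -> list[str]:
    """Build the CREATE SCHEMA and CREATE TABLE statements for the sample data."""
    column_defs = ",\n    ".join(f"{col} VARCHAR(256)" for col in COLUMNS)
    return [
        f"CREATE SCHEMA IF NOT EXISTS {schema};",
        f"CREATE TABLE IF NOT EXISTS {schema}.{table} (\n    {column_defs}\n);",
    ]


setup_statements = build_setup_sql()
for statement in setup_statements:
    print(statement)
```

The statements can be pasted into any SQL client connected to the cluster, such as the Amazon Redshift query editor.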

Create Amazon Redshift Connection

The first step is to establish connectivity between the Amazon Redshift cluster and Privacera.

Log in to Privacera’s admin web application and navigate to Applications listed under Settings. Locate the Amazon Redshift tile in the list of Available Connections and create a new Amazon Redshift connection.

Toggle the Access Management button and provide configuration details of the Redshift cluster. Enable policy enforcements and user/group/role management. This allows Privacera to manage users, groups, roles, and access control for the Redshift target application. Enable access audits to turn on audit logging.

Figure 2 – Sample Amazon Redshift application configuration.
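This post does not list the individual fields of the connection form. If your setup asks for a JDBC URL (an assumption on our part), the sketch below shows the standard Amazon Redshift JDBC URL shape, `jdbc:redshift://endpoint:port/database`, with 5439 as the default Redshift port. The endpoint shown is a placeholder, and the `redshift_jdbc_url` helper is hypothetical.

```python
def redshift_jdbc_url(endpoint: str, database: str, port: int = 5439) -> str:
    """Build a Redshift JDBC URL; 5439 is the default Redshift port."""
    return f"jdbc:redshift://{endpoint}:{port}/{database}"


# Placeholder endpoint; substitute the endpoint shown for your own cluster.
url = redshift_jdbc_url(
    "example-cluster.example.us-east-1.redshift.amazonaws.com", "sales"
)
print(url)
```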

Create Groups

After you have registered the Amazon Redshift connection, the next step is to create two new user groups: Accounting and Sales. Navigate to Users/Groups/Roles listed under Access Management, click the Add button, and name the group “Sales.”

Similarly, create a second group and call it Accounting. Once the groups are created, we’ll need to create users and place them in the proper groups.

Figure 3 – New group creation.

Create Users

To create users in PrivaceraCloud, navigate to Access Management and then to Users/Groups/Roles. Click the Users tab and the Add button.

Create a user named “Emily.”

Figure 4 – New user creation.

Similarly, create two more users named “Nick” and “Ed.”

Once the users are created, we’ll need to add a custom attribute. Edit the user Emily, choose the attribute tab, and add an attribute named country with the value UK.

Figure 5 – Custom attribute assignment to the user.

Add an attribute named country with the value UK for user Nick. Once the users and groups are created and the attributes are set, we are ready to create policies in Privacera.

Create Resource-Based Policy to Allow Access

Navigate to Resource Policies under Access Management on the left panel. Open the privacera_redshift tile and add a new policy. Provide an appropriate Policy Name and limit the scope of the policy to the schema (sales) within a particular database (sales).

For a resource-based policy, first specify the resources to be included in the policy, and then identify one or more sets of principals (roles, groups, or users) and the permissions to grant them.

Name the policy “access to sales schema” and populate the Policy Detail section as follows.

From the drop-down shown, select “database” and type the database name “sales.” You can add multiple values here if you have multiple databases you want to address with this policy.

Next, add the schema name “sales.” Table and column options become available, but both can be set to the wildcard “*”. Scroll down to the allow conditions and grant user Ed full access to the data in the schema. Then, select permissions and add the relevant permissions for Ed.

Create a second allow condition and grant the select permission to both the Sales and Accounting groups.

Figure 6 – Resource-based policy.
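The structure of this policy (a resource scope plus a list of allow conditions) can be sketched as a small in-memory model. This is a simplified illustration, not Privacera's or Apache Ranger's policy format: the dictionary layout, the `data_admin` and `select` permission labels, and the assumption that Nick belongs to the Sales group are ours.

```python
# Hypothetical in-memory model of the "access to sales schema" policy.
from fnmatch import fnmatchcase

POLICY = {
    "name": "access to sales schema",
    "resources": {"database": ["sales"], "schema": ["sales"],
                  "table": ["*"], "column": ["*"]},
    "allow": [
        # First allow condition: full access for Ed.
        {"users": ["Ed"], "groups": [], "permissions": ["data_admin"]},
        # Second allow condition: select for both groups.
        {"users": [], "groups": ["Sales", "Accounting"], "permissions": ["select"]},
    ],
}


def matches(policy: dict, resource: dict) -> bool:
    """True if every resource level matches one of the policy's patterns."""
    return all(
        any(fnmatchcase(resource[level], pattern) for pattern in patterns)
        for level, patterns in policy["resources"].items()
    )


def permissions_for(policy: dict, user: str, groups: list[str], resource: dict) -> set[str]:
    """Collect permissions from every allow condition that names the principal."""
    if not matches(policy, resource):
        return set()
    granted = set()
    for item in policy["allow"]:
        if user in item["users"] or set(groups) & set(item["groups"]):
            granted |= set(item["permissions"])
    return granted


email_column = {"database": "sales", "schema": "sales",
                "table": "sales_data", "column": "email"}
print(permissions_for(POLICY, "Ed", [], email_column))
print(permissions_for(POLICY, "Nick", ["Sales"], email_column))  # membership assumed
```

A resource outside the policy's scope, such as a table in another database, yields an empty set, which mirrors the idea that access is denied unless a policy allows it.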

You can navigate to PolicySync under Access Management in the left panel to observe the series of SQL statements that were executed on the database.

Privacera enforces the policy in Amazon Redshift by creating a view called a “secure view.” Most users are only granted access through secure views that encapsulate access control logic, which includes column masking and row filtering.

By default, secure views are named by adding the suffix “_secure” to the table name. Only principals who are granted Data Admin permission can directly access base tables.

At this point, Ed has data “administrator” access, so he can access both the base tables and the secure views.

Both Nick and Emily only have “select” permission and will only be granted access to the secure views. They will not have access to the underlying tables.
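The naming rule and the access split described above can be summarized in a few lines. This is a sketch of the behavior as described in this post, not of the SQL Privacera generates; the function names and the `data_admin` and `select` permission labels are hypothetical.

```python
def secure_view_name(table: str, suffix: str = "_secure") -> str:
    """Default naming rule: table name plus the "_secure" suffix."""
    return f"{table}{suffix}"


def accessible_objects(table: str, permissions: set[str]) -> list[str]:
    """List which objects a principal can query, given its permissions."""
    view = secure_view_name(table)
    if "data_admin" in permissions:
        # Data admins reach the base table as well as the secure view.
        return [table, view]
    if "select" in permissions:
        # Everyone else is limited to the secure view.
        return [view]
    return []


print(accessible_objects("sales_data", {"data_admin"}))  # Ed
print(accessible_objects("sales_data", {"select"}))      # Nick and Emily
```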

Create Policy to Mask Sensitive Columns

Privacera can mask certain columns to protect sensitive information. In this example, the sales_data table has personally identifiable information (PII) about users, including name, email, and phone number. You can mask this information from the Sales group.

Navigate to Masking under privacera_redshift and add a new policy. Provide a meaningful name to the policy. Select the appropriate database, schema, table, and column names. You can choose from a set of built-in masking options or define your own custom masking policy.

Figure 7 – Policy to mask sensitive columns.

The secure views will be updated to incorporate the logic from the masking expression for the affected principals.
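To show what masking does to the values a restricted user sees, here is a small sketch with two masking functions applied to one row. The functions are simplified stand-ins and do not reproduce the exact behavior of Privacera's built-in masking options, which this post does not detail; the sample row is made up.

```python
def mask_redact(value: str) -> str:
    """Replace letters with 'x' and digits with 'n', keeping punctuation."""
    return "".join(
        "x" if ch.isalpha() else "n" if ch.isdigit() else ch for ch in value
    )


def mask_show_last_4(value: str) -> str:
    """Hide everything except the last four characters."""
    return "x" * max(len(value) - 4, 0) + value[-4:]


def mask_row(row: dict, masks: dict) -> dict:
    """Apply a mask to each configured column; other columns pass through."""
    return {col: masks[col](val) if col in masks else val for col, val in row.items()}


# Which mask applies to which column is an assumption for this example.
MASKS = {"name": mask_redact, "email": mask_redact, "us_phone": mask_show_last_4}

# Made-up sample row.
row = {"name": "Jane Doe", "email": "jane@example.com",
       "us_phone": "555-0100", "country": "UK"}

masked = mask_row(row, MASKS)
print(masked)
```

Note that `mask_show_last_4` reveals the whole value when it has four or fewer characters, so a real masking rule would need to handle short values explicitly.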

Create Policy to Filter Rows Based on User Attributes

Policies can be created to filter certain rows based on the country attribute we assigned earlier. Navigate to Resource Policies under Access Management, click the row-level filter, and add a new policy.

Select the database, schema, and table as before. Under the row-level condition, choose the group public. Set the permission to select, and for the row-level filter enter the expression COUNTRY='${{USER.country}}'. This will use the attribute we created earlier.

Note that this policy does not grant permission to users who don’t already have it from an access policy.

Figure 8 – Policy to filter rows based on user attributes.
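The filter expression is a template: the `${{USER.country}}` placeholder is replaced with the querying user's attribute value before the filter is applied. The sketch below imitates that substitution so you can see the resulting predicate. It is an illustration under our own assumptions, not Privacera's expansion logic; the `expand_filter` function is hypothetical, and doubling single quotes is a precaution we added.

```python
import re

# Matches placeholders of the form ${{USER.<attribute>}}.
_MACRO = re.compile(r"\$\{\{USER\.(\w+)\}\}")


def expand_filter(expression: str, attributes: dict[str, str]) -> str:
    """Substitute user attributes into a row-filter expression."""
    def replace(match: re.Match) -> str:
        value = attributes[match.group(1)]  # raises KeyError if the attribute is unset
        # Double any single quotes so the value stays inside the SQL literal.
        return value.replace("'", "''")
    return _MACRO.sub(replace, expression)


expression = "COUNTRY='${{USER.country}}'"
emily_filter = expand_filter(expression, {"country": "UK"})
print(emily_filter)
```

With the attribute set to UK for Emily and Nick, both see only the rows whose country column is UK. A user with no country attribute raises a `KeyError` in this sketch; how the product treats a missing attribute is not covered in this post.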

Conclusion

In this post, we showed you how to use Privacera to manage access control policies in Amazon Redshift.

More specifically, you learned how to use Privacera to create an Amazon Redshift connection, create new users and user groups, assign custom attributes to users, create a resource-based policy to allow access, create a policy to mask sensitive columns, and create a policy to filter rows based on user attributes.

This integration enables organizations to make data-driven decisions using Amazon Redshift while Privacera handles data access governance.

You can learn more about Privacera in AWS Marketplace.


Privacera – AWS Partner Spotlight

Privacera is an AWS Data & Analytics Partner that provides security and privacy tools for enterprises to secure and govern user access to databases and datastores in the cloud.
