AWS Big Data Blog
Achieve finer-grained data security with column-level access control in Amazon Redshift
September 2023: This post was reviewed for accuracy.
Amazon Redshift is the most popular cloud data warehouse because it provides fast insights at a low cost. Customers can confidently run mission critical workloads, even in highly regulated industries, because Amazon Redshift comes with out of the box security and compliance. The security features, combined with the ability to easily analyze data in-place and in open formats, along with compute and storage elasticity, and ease of use are what makes tens of thousands of customers choose Amazon Redshift.
Many organizations store sensitive data, commonly classified as personally identifiable information (PII) or sensitive personal information (SPI) in Amazon Redshift and this data will have restricted access from different persona in the organization. For example, your human resources, finance, sales, data science, and marketing departments may all have the required access privileges to view customer data, whereas only the finance department should have access to sensitive data like personally identifiable information (PII) or payment card industry (PCI).
Views or AWS Lake Formation on Amazon Redshift Spectrum was used previously to manage such scenarios, however this adds extra overhead in creating and maintaining views or Amazon Redshift Spectrum. View based approach is also difficult to scale and can lead to lack of security controls. Amazon Redshift column-level access control is a new feature that supports access control at a column-level for data in Amazon Redshift. You can use column-level GRANT and REVOKE statements to help meet your security and compliance needs similar to managing any database object.
This post shows you how to setup Amazon Redshift column-level access control on table, view and materialized view.
Use Case
There are two tables that store customer demographic and account balance data. Finance department can see all customer data while Sales department can only view and update market segment and account balance data as the rest of customer demographic data like customer name, phone and nation are considered PII data and should have restricted access. This is a good use case for column-level access control to secure the PII data. Below is a simple entity relation diagram for the 2 tables.
Prerequisites
Before trying out the illustration in this blog, note the following prerequisites:
- Amazon Redshift cluster.
- Database user with permission to create table or superuser.
Setting up the environment
To setup the environment and implement the use case, complete the following steps:
- Connect to your Amazon Redshift cluster using any SQL client of your choice with user with permission to create table or superuser.
- Create two tables with the following code:
- Populate some sample data into the two tables with the following code:
- Create a view and a materialized view with the following code:
- The purpose of the view,
customer_vw
is to implement business rule of customer income group categorization based oncustomer
dataset. - Analytical dashboards frequently access this dataset by joining and aggregating tables
customer
andnation
and thus, the materialized viewcustomernation_mv
is created to speed up the performance such query significantly. - Create and grant table level permissions to user
finance
which represent finance department users. Note that below users are created only for illustration purpose. We recommend you to use AWS IAM Federation to bring your corporate users without creating them manually in Amazon Redshift. For more information, please refer to https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html#authentication. - Note that user
finance
hasSELECT
andUPDATE
permission on all columns oncustomer
table. - You need to test and validate user
finance
is able to view all data from thecustomer
table,customer_vw
view andcustomernation_mv
materialized view and update data oncustomer
table. - Enter the following code:
Note that SQL statement
SET SESSION AUTHORIZATION 'finance'
is used to impersonate userfinance
in above code.Each select statement should return five rows and the update statement should return one row updated. See the following code:
You have now successfully setup table level permissions for user finance
to view and update all customer data.
Setting up Amazon Redshift column-level access control
Column-level access control can be enabled and disabled by using GRANT
and REVOKE
statements with the following syntax:
To set up column-level privileges, complete the following steps:
- To determine which users have column-level access control, you can query PG_ATTRIBUTE_INFO system view. Enter the following code:
- The query should return zero records as we have not implemented column-level access control yet.
- Grant user
sales SELECT
permission on columnsmarketsegment
andaccountbalance
on tablecustomer
, viewcustomer_vw
and materialized viewcustomernation_mv
. We also grantUPDATE
permission on columnmarketsegment
andaccountbalance
on tablecustomer
by entering the following code: - Error message “
Grant options are not supported for column privileges
” should be returned for the first statement. This is because only a table’s owner or a superuser can grant column-level privileges and to maintain simple security model. - Validate if above permissions have been granted with the following code:
- The query should return six rows. See the following code:
The output above shows:
Users:sales
(attacl
column)
Permissions:read/write
(attacl
column value “rw”)
On Column:accountbalance
,marketsegment
(attname
column)
Of table:customer
(relname
column)
Granted by:fqdemo
(attacl
column)Users:
sales
(attacl
column)
Permissions:read
(attacl
column value “r”)
On Column:accountbalance
,marketsegment
(attname
column)
Of table:customer_vw
,customernation_mv
(relname
column)
Granted by:fqdemo
(attacl
column) - After you confirmed the column-level access control are correct, run as user
sales
to query tablecustomer
, viewcustomer_vw
and materialized viewcustomernation_mv
using the following code: - Each select statement should return permission denied error as the user does not have permissions to all columns of the objects being queried. See the following code:
- Query only the columns
marketsegment
andaccountbalance
from tablecustomer
,view customer_vw
and materialized viewcustomernation_mv
with the following code: - Each select statement should return five rows as user
sales
has permission to query columnsmarketsegment
andaccountbalance
. See the following code: - Update the
accountbalance
column with the following code: - The select statement should return one row that shows value 2000. See the following code:
- Update the
accountbalance
column with conditionnationid=1
by using the following code: - The update statement should return permission denied error as user sales does not have column-level privileges on column nationid in the where clause.
- Query the count of record group by nationid with the following code:
- The select statement should return permission denied error as user
sales
doesn’t have column-level privileges on columnnationid
in the group by clause. - Please also note that column-level privileges are checked for columns not only in the select list but also
where
clause,order by
clause,group by
clause,having
clause and other clauses of a query that requireSELECT/UPDATE
privileges on a column. - Remove column
marketsegment
from column-level access control for user sales usingREVOKE
command and see what happens. Enter the following code: - As you can see, user
sales
is no longer able to viewmarketsegment
from table customer. - Enter the following code to query column
marketsegment
from viewcustomer_vw
: - The statement should return five rows as user
sales
still has access to columnmarketsegment
on the view even though column-level privileges have been revoked from tablecustomer
. Views execute with the permissions of the view owner so it will still continue to work as long as the view’s owner still has column or table-level privileges on the base tables used by the view. To prevent unauthorized access of the sensitive data, the column-level privileges for user sales should be revoked from the view as well. - Revoke all permissions for user
sales
with the following code: - Query the table, view and materialized view again with user
sales
using the following code: - Permission denied error should be returned and this shows that
REVOKE
is able to remove all permissions.
As summary, a simple GRANT
statement will enable column-level access control on Amazon Redshift table, view and materialized view. A REVOKE
statement is what you need to remove the permission. This eliminates the complexity of legacy views-based access control to achieve fine-grained read and write access control.
Clean up
Once you are done with above testing, you can remove the objects and users with the following code:
Summary
Amazon Redshift is secure by default and security doesn’t cost extra. It provides Authentication (Active Directory, Okta, Ping Federate, and Azure AD), Federation and comes pre-integrated with AWS IAM and KMS. It also supports table-based access control for data in Amazon Redshift and column-level access control for data in Amazon S3 through Amazon Redshift Spectrum since September 2019. Amazon Redshift now supports access control at a column-level for local tables, eliminating the need to implement view-based access control or using another system.
This post showed you how easy it is to setup Amazon Redshift column-level access control. The use case in this post demonstrated how to confirm that you have fine-grained access on the table, view, and materialized view. You can adopt this feature to support your business needs.
If you have any questions or suggestions, please leave a comment below.
About the Authors
BP Yau is a Data Warehouse Specialist Solutions Architect at AWS. His role is to help customers architect big data solutions to process data at scale. Before AWS, he helped Amazon.com Supply Chain Optimization Technologies migrate the Oracle Data Warehouse to Amazon Redshift and built the next generation big data analytics platform using AWS technologies.
Srikanth Sopirala is a Sr. Specialist Solutions Architect focused on Analytics at AWS. He is passionate about helping customers build scalable data and analytics solutions in the cloud.
Audit History
Last reviewed in September 2023 by Benita Owoghiri | ACCELERATION LAB SA,