AWS Big Data Blog

Achieve finer-grained data security with column-level access control in Amazon Redshift

September 2023: This post was reviewed for accuracy.

Amazon Redshift is the most popular cloud data warehouse because it provides fast insights at a low cost. Customers can confidently run mission critical workloads, even in highly regulated industries, because Amazon Redshift comes with out of the box security and compliance. The security features, combined with the ability to easily analyze data in-place and in open formats, along with compute and storage elasticity, and ease of use are what makes tens of thousands of customers choose Amazon Redshift.

Many organizations store sensitive data, commonly classified as personally identifiable information (PII) or sensitive personal information (SPI) in Amazon Redshift and this data will have restricted access from different persona in the organization. For example, your human resources, finance, sales, data science, and marketing departments may all have the required access privileges to view customer data, whereas only the finance department should have access to sensitive data like personally identifiable information (PII) or payment card industry (PCI).

Views or AWS Lake Formation on Amazon Redshift Spectrum was used previously to manage such scenarios, however this adds extra overhead in creating and maintaining views or Amazon Redshift Spectrum. View based approach is also difficult to scale and can lead to lack of security controls. Amazon Redshift column-level access control is a new feature that supports access control at a column-level for data in Amazon Redshift. You can use column-level GRANT and REVOKE statements to help meet your security and compliance needs similar to managing any database object.

This post shows you how to setup Amazon Redshift column-level access control on table, view and materialized view.

Use Case

There are two tables that store customer demographic and account balance data. Finance department can see all customer data while Sales department can only view and update market segment and account balance data as the rest of customer demographic data like customer name, phone and nation are considered PII data and should have restricted access. This is a good use case for column-level access control to secure the PII data. Below is a simple entity relation diagram for the 2 tables.

Prerequisites

Before trying out the illustration in this blog, note the following prerequisites:

  1. Amazon Redshift cluster.
  2. Database user with permission to create table or superuser.

Setting up the environment

To setup the environment and implement the use case, complete the following steps:

  1. Connect to your Amazon Redshift cluster using any SQL client of your choice with user with permission to create table or superuser.
  2. Create two tables with the following code:

    CREATE TABLE customer 
    (
      customerid       INT8 NOT NULL,
      customername     VARCHAR(25) NOT NULL,
      phone            CHAR(15) NOT NULL,
      nationid        INT4 NOT NULL,
      marketsegment    CHAR(10) NOT NULL,
      accountbalance   NUMERIC(12,2) NOT NULL
    );
    CREATE TABLE nation 
    (
      nationid    INT4 NOT NULL,
      nationname   CHAR(25) NOT NULL
    );
  3. Populate some sample data into the two tables with the following code:

    INSERT INTO customer VALUES
    (1, 'Customer#000000001', '33-687-542-7601', 3, 'HOUSEHOLD', 2788.52),
    (2, 'Customer#000000002', '13-806-545-9701', 1, 'MACHINERY', 591.98),
    (3, 'Customer#000000003', '13-312-472-8245', 1, 'HOUSEHOLD', 3332.02),
    (4, 'Customer#000000004', '23-127-851-8031', 2, 'MACHINERY', 9255.67),
    (5, 'Customer#000000005', '13-137-193-2709', 1, 'BUILDING', 5679.84)
    ;
    INSERT INTO nation VALUES
    (1, 'UNITED STATES'),
    (2, 'AUSTRALIA'),
    (3, 'UNITED KINGDOM')
    ;
  4. Create a view and a materialized view with the following code:

    CREATE OR REPLACE VIEW customer_vw AS SELECT customername, phone, marketsegment, accountbalance, CASE WHEN accountbalance < 1000 THEN 'low' WHEN accountbalance > 1000 AND accountbalance < 5000 THEN 'middle' ELSE 'high' END AS incomegroup FROM customer;
    
    CREATE MATERIALIZED VIEW customernation_mv AS SELECT customername, phone, nationname, marketsegment, sum(accountbalance) AS accountbalance FROM customer c INNER JOIN nation n ON c.nationid = n.nationid GROUP BY customername, phone, nationname, marketsegment;
  5. The purpose of the view, customer_vw is to implement business rule of customer income group categorization based on customer dataset.
  6. Analytical dashboards frequently access this dataset by joining and aggregating tables customer and nation and thus, the materialized view customernation_mv is created to speed up the performance such query significantly.
  7. Create and grant table level permissions to user finance which represent finance department users. Note that below users are created only for illustration purpose. We recommend you to use AWS IAM Federation to bring your corporate users without creating them manually in Amazon Redshift. For more information, please refer to https://docs.aws.amazon.com/redshift/latest/mgmt/redshift-iam-authentication-access-control.html#authentication.
    CREATE USER finance WITH password 'Abcd1234!';
    CREATE USER sales WITH password 'Abcd1234!'; 
    GRANT SELECT, UPDATE ON customer TO finance;
    GRANT SELECT ON customer_vw TO finance;
    GRANT SELECT ON customernation_mv TO finance;
  8. Note that user finance has SELECT and UPDATE permission on all columns on customer table.
  9. You need to test and validate user finance is able to view all data from the customer table, customer_vw view and customernation_mv materialized view and update data on customer table.
  10. Enter the following code:
    SET SESSION AUTHORIZATION 'finance';
    SELECT CURRENT_USER;
    SELECT * FROM customer;
    SELECT * FROM customer_vw;
    SELECT * FROM customernation_mv;
    UPDATE customer SET accountbalance = 1000 WHERE marketsegment = 'BUILDING';
    RESET SESSION AUTHORIZATION;

    Note that SQL statement SET SESSION AUTHORIZATION 'finance' is used to impersonate user finance in above code.

    Each select statement should return five rows and the update statement should return one row updated. See the following code:

    dev=# SET SESSION AUTHORIZATION 'finance';
    SET
    dev=> SELECT CURRENT_USER;
     current_user 
    --------------
     finance
    (1 row)
     
    dev=> SELECT * FROM customer;
     customerid |    customername    |      phone      | nationid | marketsegment | accountbalance 
    ------------+--------------------+-----------------+----------+---------------+----------------
              1 | Customer#000000001 | 33-687-542-7601 |        3 | HOUSEHOLD     |        2788.52
              2 | Customer#000000002 | 13-806-545-9701 |        1 | MACHINERY     |         591.98
              3 | Customer#000000003 | 13-312-472-8245 |        1 | HOUSEHOLD     |        3332.02
              4 | Customer#000000004 | 23-127-851-8031 |        2 | MACHINERY     |        9255.67
              5 | Customer#000000005 | 13-137-193-2709 |        1 | BUILDING      |        5679.84
    (5 rows)
     
    dev=> SELECT * FROM customer_vw;
        customername    |      phone      | marketsegment | accountbalance | incomegroup 
    --------------------+-----------------+---------------+----------------+-------------
     Customer#000000001 | 33-687-542-7601 | HOUSEHOLD     |        2788.52 | middle
     Customer#000000002 | 13-806-545-9701 | MACHINERY     |         591.98 | low
     Customer#000000003 | 13-312-472-8245 | HOUSEHOLD     |        3332.02 | middle
     Customer#000000004 | 23-127-851-8031 | MACHINERY     |        9255.67 | high
     Customer#000000005 | 13-137-193-2709 | BUILDING      |        5679.84 | high
    (5 rows)
     
    dev=> SELECT * FROM customernation_mv;
        customername    |      phone      |        nationname         | marketsegment | accountbalance 
    --------------------+-----------------+---------------------------+---------------+----------------
     Customer#000000005 | 13-137-193-2709 | UNITED STATES             | BUILDING      |        5679.84
     Customer#000000004 | 23-127-851-8031 | AUSTRALIA                 | MACHINERY     |        9255.67
     Customer#000000003 | 13-312-472-8245 | UNITED STATES             | HOUSEHOLD     |        3332.02
     Customer#000000002 | 13-806-545-9701 | UNITED STATES             | MACHINERY     |         591.98
     Customer#000000001 | 33-687-542-7601 | UNITED KINGDOM            | HOUSEHOLD     |        2788.52
    (5 rows)
     
    dev=> UPDATE customer SET accountbalance = 1000 WHERE marketsegment = 'BUILDING';
    UPDATE 1
    dev=>  
    dev=> RESET SESSION AUTHORIZATION;
    RESET

You have now successfully setup table level permissions for user finance to view and update all customer data.

Setting up Amazon Redshift column-level access control

Column-level access control can be enabled and disabled by using GRANT and REVOKE statements with the following syntax:

GRANT { { SELECT | UPDATE } ( column_name [, ...] ) [, ...] | ALL [ PRIVILEGES ] ( column_name [,...] ) }
ON { [ TABLE ] table_name [, ...] }
TO { username | GROUP group_name | PUBLIC } [, ...]
 
REVOKE { { SELECT | UPDATE } ( column_name [, ...] ) [, ...] | ALL [ PRIVILEGES ] ( column_name [,...] ) }
ON { [ TABLE ] table_name [, ...] }
FROM { username | GROUP group_name | PUBLIC } [, ...]
[ CASCADE | RESTRICT ]

To set up column-level privileges, complete the following steps:

  1. To determine which users have column-level access control, you can query PG_ATTRIBUTE_INFO system view. Enter the following code:
    SELECT b.attacl, b.attname, c.relname FROM pg_catalog.pg_attribute_info b JOIN pg_class c ON c.oid=b.attrelid WHERE c.relname in ('customer','customer_vw','customernation_mv') AND b.attacl IS NOT NULL ORDER BY c.relname, b.attname;
  2. The query should return zero records as we have not implemented column-level access control yet.
  3. Grant user sales SELECT permission on columns marketsegment and accountbalance on table customer, view customer_vw and materialized view customernation_mv. We also grant UPDATE permission on column marketsegment and accountbalance on table customer by entering the following code:
    RESET SESSION AUTHORIZATION;
    GRANT SELECT (marketsegment, accountbalance) ON customer TO sales WITH GRANT OPTION;
    GRANT SELECT (marketsegment, accountbalance),UPDATE (marketsegment, accountbalance) ON customer TO sales;
    GRANT SELECT (marketsegment, accountbalance) ON customer_vw TO sales;
    GRANT SELECT (marketsegment, accountbalance) ON customernation_mv TO sales;
  4. Error message “Grant options are not supported for column privileges” should be returned for the first statement. This is because only a table’s owner or a superuser can grant column-level privileges and to maintain simple security model.
  5. Validate if above permissions have been granted with the following code:
    SELECT b.attacl, b.attname, c.relname FROM pg_catalog.pg_attribute_info b  JOIN pg_class c ON c.oid=b.attrelid WHERE c.relname in ('customer','customer_vw','customernation_mv') AND b.attacl IS NOT NULL ORDER BY c.relname, b.attname;
  6. The query should return six rows. See the following code:
    dev=# SELECT b.attacl, b.attname, c.relname FROM pg_catalog.pg_attribute_info b  JOIN pg_class c ON c.oid=b.attrelid WHERE c.relname in ('customer','customer_vw','customernation_mv') AND b.attacl IS NOT NULL ORDER BY c.relname, b.attname;
          attacl       |    attname     |      relname      
    -------------------+----------------+-------------------
     {sales=rw/fqdemo} | accountbalance | customer
     {sales=rw/fqdemo} | marketsegment  | customer
     {sales=r/fqdemo}  | accountbalance | customer_vw
     {sales=r/fqdemo}  | marketsegment  | customer_vw
     {sales=r/fqdemo}  | accountbalance | customernation_mv
     {sales=r/fqdemo}  | marketsegment  | customernation_mv
    (6 rows)

    The output above shows:
    Users: sales (attacl column)
    Permissions: read/write (attacl column value “rw”)
    On Column: accountbalance, marketsegment (attname column)
    Of table: customer (relname column)
    Granted by: fqdemo (attacl column)

    Users: sales (attacl column)
    Permissions: read (attacl column value “r”)
    On Column: accountbalance, marketsegment (attname column)
    Of table: customer_vw, customernation_mv (relname column)
    Granted by: fqdemo (attacl column)

  7. After you confirmed the column-level access control are correct, run as user sales to query table customer, view customer_vw and materialized view customernation_mv using the following code:
    SET SESSION AUTHORIZATION 'sales';
    SELECT CURRENT_USER;
    SELECT * FROM customer;
    SELECT * FROM customer_vw;
    SELECT * FROM customernation_mv;
  8. Each select statement should return permission denied error as the user does not have permissions to all columns of the objects being queried. See the following code:
    dev=# SET SESSION AUTHORIZATION 'sales';
    SET
    dev=> SELECT CURRENT_USER;
     current_user 
    --------------
     sales
    (1 row)
     
    dev=> SELECT * FROM customer;
    ERROR:  permission denied for relation customer
    dev=> SELECT * FROM customer_vw;
    ERROR:  permission denied for relation customer_vw
    dev=> SELECT * FROM customernation_mv;
    ERROR:  permission denied for relation customernation_mv
  9. Query only the columns marketsegment and accountbalance from table customer, view customer_vw and materialized view customernation_mv with the following code:
    SELECT marketsegment, accountbalance FROM customer;
    SELECT marketsegment, accountbalance FROM customer_vw;
    SELECT marketsegment, accountbalance FROM customernation_mv;
  10. Each select statement should return five rows as user sales has permission to query columns marketsegment and accountbalance. See the following code:
    dev=> SELECT marketsegment, accountbalance FROM customer;
     marketsegment | accountbalance 
    ---------------+----------------
     HOUSEHOLD     |        2788.52
     MACHINERY     |         591.98
     HOUSEHOLD     |        3332.02
     MACHINERY     |        9255.67
     BUILDING      |        1000.00
    (5 rows)
     
    dev=> SELECT marketsegment, accountbalance FROM customer_vw;
     marketsegment | accountbalance 
    ---------------+----------------
     HOUSEHOLD     |        2788.52
     MACHINERY     |         591.98
     HOUSEHOLD     |        3332.02
     MACHINERY     |        9255.67
     BUILDING      |        1000.00
    (5 rows)
     
    dev=> SELECT marketsegment, accountbalance FROM customernation_mv;
     marketsegment | accountbalance 
    ---------------+----------------
     MACHINERY     |        9255.67
     BUILDING      |        5679.84
     MACHINERY     |         591.98
     HOUSEHOLD     |        3332.02
     HOUSEHOLD     |        2788.52
    (5 rows)
  11. Update the accountbalance column with the following code:
    UPDATE customer SET accountbalance = 2000 WHERE marketsegment = 'BUILDING';
    SELECT accountbalance FROM customer WHERE marketsegment = 'BUILDING';
  12. The select statement should return one row that shows value 2000. See the following code:
    dev=> UPDATE customer SET accountbalance = 2000 WHERE marketsegment = 'BUILDING';
    UPDATE 1
    dev=> SELECT accountbalance FROM customer WHERE marketsegment = 'BUILDING';
     accountbalance 
    ----------------
            2000.00
    (1 row)
  13. Update the accountbalance column with condition nationid=1 by using the following code:
    UPDATE customer SET accountbalance = 3000 WHERE nationid = 1;
  14. The update statement should return permission denied error as user sales does not have column-level privileges on column nationid in the where clause.
  15. Query the count of record group by nationid with the following code:
    SELECT COUNT(*) FROM customer GROUP BY nationid;
  16. The select statement should return permission denied error as user sales doesn’t have column-level privileges on column nationid in the group by clause.
  17. Please also note that column-level privileges are checked for columns not only in the select list but also where clause, order by clause, group by clause, having clause and other clauses of a query that require SELECT/UPDATE privileges on a column.
  18. Remove column marketsegment from column-level access control for user sales using REVOKE command and see what happens. Enter the following code:
    RESET SESSION AUTHORIZATION;
    REVOKE SELECT (marketsegment) ON customer FROM sales;
    SET SESSION AUTHORIZATION 'sales';
    SELECT CURRENT_USER;
    SELECT marketsegment, accountbalance FROM customer;
    SELECT accountbalance FROM customer;
  19. As you can see, user sales is no longer able to view marketsegment from table customer.
    dev=> SELECT marketsegment, accountbalance FROM customer;
    ERROR:  permission denied for relation customer
    dev=> SELECT accountbalance FROM customer;
     accountbalance 
    ----------------
            2788.52
             591.98
            3332.02
            9255.67
            2000.00
    (5 rows)
  20. Enter the following code to query column marketsegment from view customer_vw:
    SELECT marketsegment FROM customer_vw;
  21. The statement should return five rows as user sales still has access to column marketsegment on the view even though column-level privileges have been revoked from table customer. Views execute with the permissions of the view owner so it will still continue to work as long as the view’s owner still has column or table-level privileges on the base tables used by the view. To prevent unauthorized access of the sensitive data, the column-level privileges for user sales should be revoked from the view as well.
  22. Revoke all permissions for user sales with the following code:
    RESET SESSION AUTHORIZATION;
    SELECT CURRENT_USER;
    REVOKE SELECT ON customernation_mv FROM sales;
    REVOKE SELECT ON customer_vw FROM sales;
    REVOKE SELECT ON customer FROM sales;
    REVOKE UPDATE ON customer FROM sales;
  23. Query the table, view and materialized view again with user sales using the following code:
    SET SESSION AUTHORIZATION 'sales';
    SELECT CURRENT_USER;
    SELECT marketsegment, accountbalance FROM customer;
    SELECT marketsegment, accountbalance FROM customer_vw;
    SELECT marketsegment, accountbalance FROM customernation_mv;
  24. Permission denied error should be returned and this shows that REVOKE is able to remove all permissions.

As summary, a simple GRANT statement will enable column-level access control on Amazon Redshift table, view and materialized view. A REVOKE statement is what you need to remove the permission. This eliminates the complexity of legacy views-based access control to achieve fine-grained read and write access control.

Clean up

Once you are done with above testing, you can remove the objects and users with the following code:

RESET SESSION AUTHORIZATION;
REVOKE SELECT ON customernation_mv FROM finance;
REVOKE SELECT ON customer_vw FROM finance;
REVOKE SELECT ON customer FROM finance;
REVOKE UPDATE ON customer FROM finance;
DROP VIEW customer_vw;
DROP MATERIALIZED VIEW customernation_mv;
DROP TABLE nation;
DROP TABLE customer;
DROP USER IF EXISTS sales;
DROP USER IF EXISTS finance;

Summary

Amazon Redshift is secure by default and security doesn’t cost extra. It provides Authentication (Active Directory, Okta, Ping Federate, and Azure AD), Federation and comes pre-integrated with AWS IAM and KMS. It also supports table-based access control for data in Amazon Redshift and column-level access control for data in Amazon S3 through Amazon Redshift Spectrum since September 2019. Amazon Redshift now supports access control at a column-level for local tables, eliminating the need to implement view-based access control or using another system.

This post showed you how easy it is to setup Amazon Redshift column-level access control. The use case in this post demonstrated how to confirm that you have fine-grained access on the table, view, and materialized view. You can adopt this feature to support your business needs.

If you have any questions or suggestions, please leave a comment below.

 


About the Authors

BP Yau is a Data Warehouse Specialist Solutions Architect at AWS. His role is to help customers architect big data solutions to process data at scale. Before AWS, he helped Amazon.com Supply Chain Optimization Technologies migrate the Oracle Data Warehouse to Amazon Redshift and built the next generation big data analytics platform using AWS technologies.

Srikanth Sopirala is a Sr. Specialist Solutions Architect focused on Analytics at AWS. He is passionate about helping customers build scalable data and analytics solutions in the cloud.


Audit History

Last reviewed in September 2023 by Benita Owoghiri | ACCELERATION LAB SA,