AWS Database Blog

Create a unit testing framework for PostgreSQL using the pgTAP extension

In the world of database development, ensuring the reliability and correctness of your code is crucial. Unit testing plays a vital role by allowing developers to validate individual components of database logic, including stored procedures, functions, and triggers.

pgTAP (PostgreSQL Test Anything Protocol) is a unit testing framework that empowers developers to write and run tests directly within the database. It provides native database testing capabilities, enables validation of database functions and queries, and delivers immediate feedback without external dependencies, ensuring code behaves as expected.

In this post, we explore how to leverage the pgTAP extension for unit testing on Amazon Relational Database Service (Amazon RDS) for PostgreSQL and Amazon Aurora PostgreSQL-Compatible Edition database, helping you build robust and reliable database applications.

pgTAP overview

pgTAP is an open-source unit testing framework consisting of a suite of database functions written in PL/pgSQL. It provides a robust set of tools and utilities for writing and running tests against your database code. Inspired by the popular xUnit testing framework, pgTAP follows a similar convention and allows you to write comprehensive tests for database objects such as tables, functions, triggers, and more.

Other testing frameworks and tools are available for PostgreSQL, such as pgUnit and pytest-postgresql. These tools also offer capabilities for testing database functions, queries, and data integrity within a PostgreSQL environment. When comparing pgTAP with these tools, one of its standout features is its comprehensive set of built-in assertion functions.

pgTAP provides powerful features for PostgreSQL environments, streamlining the writing and execution of tests. With seamless integration into PostgreSQL, built-in assertions, and strong extensibility, the benefits of using pgTAP include comprehensive test coverage, native PostgreSQL integration, and automated validation of database components, making it an effective tool for ensuring database code reliability. Supported by an active community, it helps developers maintain robust and reliable databases. In this guide, we’ll explore how pgTAP can enhance your testing workflow.

Benefits of using pgTAP for Unit Testing in Amazon RDS for PostgreSQL or Amazon Aurora PostgreSQL-compatible are:

  • Ensuring data integrity by validating the structure and content of tables
  • Detecting regressions and identifying potential issues early in the development
  • Provides a repeatable and automated testing process, saving time and effort
  • Enabling integration with continuous integration and continuous deployment (CI/CD) pipelines
  • Facilitating collaboration among developers with a standardized testing framework

Solution overview

In the following sections, we demonstrate how to install pgTAP, create a sample database and other objects, and test the pgTAP framework.

Prerequisites

  1. Create an Active database connection using Amazon RDS for PostgreSQL 11 or higher and Amazon Aurora for PostgreSQL 13 or higher.
  2. Install pgTAP by connecting to the above created database as a user with the rds_superuser privilege and run the following code:
    -- Create the "pgtap" extension
    CREATE EXTENSION IF NOT EXISTS pgtap;
    SQL

Create a sample database and test the pgTAP framework.

Connect to your database instance where the pgTAP extension is installed, using a command-line tool or another method, such as pgAdmin, or the psql utility and run the below mentioned command to create the sample objects that are required for testing the pgTAP framework:

  1. Create the Test tables:

    CREATE TABLE users (
        id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
        fname VARCHAR(100),
        age INTEGER,
        email TEXT
    );
    
    CREATE TABLE products (
        id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
        fname VARCHAR(100),
        price DECIMAL(10, 2),
        category VARCHAR(50)
    );
    
    CREATE TABLE price_changes (
        id int GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
        product_id INTEGER,
        old_price DECIMAL(10, 2),
        new_price DECIMAL(10, 2),
        change_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
    SQL
  2. Populate the tables using INSERT statements:

    -- Bulk Insert for 'users' table 
    INSERT INTO users (fname, age, email) VALUES ('John Doe', 30, 'john.doe@example.com'), ('Jane Smith', 28, 'jane.smith@example.com'), ('Bob Johnson', 35, 'bob.johnson@example.com'); 
    
    -- Bulk Insert for 'products' table 
    INSERT INTO products (fname, price, category) VALUES ('Widget A', 9.99, 'Electronics'), ('Widget B', 14.99, 'Electronics'), ('Gadget C', 24.99, 'Gadgets'); 
    COMMIT;
    SQL
  3. Create sample functions:

    CREATE FUNCTION calculate_total_price (quantity INTEGER, unit_price DECIMAL(10, 2))
    RETURNS DECIMAL(10, 2) AS $$
    BEGIN
        RETURN quantity * unit_price;
    END;
    $$ LANGUAGE plpgsql;
    SQL

    This function can be useful for calculating the total cost of an order or purchase, based on the quantity and unit price of a product. It calculates the total price by multiplying the quantity by the unit_price.

    CREATE FUNCTION validate_email(email TEXT)
    RETURNS BOOLEAN AS $$
    BEGIN
        -- Simple email validation logic
        RETURN (email ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
    END;
    $$ LANGUAGE plpgsql;
    SQL

    This function checks if an email address is valid, and can be called before storing an email address in the database. The function returns a Boolean value: TRUE if the email address is valid, FALSE otherwise. We can verify the functionality of the above function using the pgTAP testing framework.

  4. Create sample triggers:

    CREATE or replace FUNCTION log_product_changes()
    RETURNS TRIGGER AS $$
    BEGIN
        IF NEW.price <> OLD.price THEN
            INSERT INTO price_changes (product_id, old_price, new_price)
            VALUES (NEW.id, OLD.price, NEW.price);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    SQL

    This trigger can be useful for tracking and logging price changes for products, which can be useful for auditing or historical purposes.

    CREATE or replace TRIGGER product_price_change
    AFTER UPDATE OF price ON products
    FOR EACH ROW
    WHEN (NEW.price IS DISTINCT FROM OLD.price)
    EXECUTE FUNCTION log_product_changes();
    SQL

    This trigger will inserts a new row into the price_changes table, recording the product_id, the old_price, and the new_price whenever new price changes.

    We can verify the functionality of the above Trigger using the pgTAP testing framework.

Test cases with pgTAP

The pgTAP extension provides a wide range of test methods for tablespaces, schemas, tables, columns, views, sequences, indexes, triggers, functions, policies, users, languages, rules, classes, operators, and extensions. We can use PgTAP with 2 different ways:

1. pg_prove

We can test pgTAP with pg_prove also and it is a command-line tool that allows you to run multiple pgTAP tests in bulk with a single command. With pg_prove, you can automate the execution of your pg_tap tests. You can write your test scripts as SQL scripts and store them in a directory. pg_prove can then run all the tests located in that directory.

The output of the tests is harvested and processed by TAP::Harness in order to summarize the results of the test.

To utilize pg_prove, follow these steps:

Navigate to the folder containing your test scripts, Run the following command, replacing [database name] and [hostname] with the name of your database and host name:

pg_prove -h host.rds.amazonaws.com -p 5432 -d databaseName -U postgres -v path/to/test/directory/*.sql  > results.txt
Bash

This command executes all the test scripts present in the specified directory against the provided database.

By using pg_prove, you can streamline the process of running multiple pgTAP tests, ensuring that your database functionality is thoroughly tested and validated.

2. Command line tool or client utility

We can use a few of the tools like PgAdmin or psql for running the test cases, ensure that you’re connecting from the same AWS Region where your database is hosted.

Running through few pgTAP test cases

1. Table existence test

This test case verifies that a specific table exists in the database and if the user has SELECT privileges on it, and also that the function and trigger exist.

In the current test scenario, we will utilize the pg_prove utility. We will create a test SQL file either on the local machine or on a storage value (such as Amazon EBS), from which the database can be accessed and connected.

The following is our test.sql file, which can be created using any text editor or document editing tool:

\unset ECHO
\set QUIET 1
-- Turn off echo and keep things quiet.

-- Format the output for nice TAP.
\pset format unaligned
\pset tuples_only true
\pset pager off

-- Revert all changes on failure.
\set ON_ERROR_ROLLBACK 1
\set ON_ERROR_STOP true

BEGIN;

-- PLAN  the tests.
SELECT PLAN (4);

-- Run the tests.

SELECT HAS_TABLE_PRIVILEGE('my_schema.users', 'SELECT') AS privilege;

SELECT has_table('my_schema'::name,'users'::name) AS my_table_exists;

select HAS_FUNCTION('my_schema','log_product_changes','Check if function exists!');
select has_trigger('products','product_price_change','Check if trigger "product_price_change" exists');

-- Finish the tests and clean up.
SELECT * FROM FINISH();
ROLLBACK;
SQL

The preceding code snippet includes additional directives and settings specific to running test cases with pgTAP. Let’s go through each line and explain their purpose in detail.

Explanation of the code:

The provided script sets up and executes a pgTAP test case in PostgreSQL, focusing on configuring the test environment and controlling output. Directives like \unset ECHO and \set QUIET 1 are used to suppress the echoing of SQL commands and minimize unnecessary output, ensuring the test results remain clean and focused.

The \pset directives (\pset format unaligned, \pset tuples_only true, and \pset pager off) format the output to align with the TAP (Test Anything Protocol) format, removing tabular formatting and pagination. Additionally, error-handling settings such as \set ON_ERROR_ROLLBACK 1 and \set ON_ERROR_STOP true are configured to ensure that any errors during the test cause an immediate rollback of changes, leaving the database state unchanged.

The script begins a new transaction with BEGIN;, specifies the test plan with SELECT PLAN (4);, and runs test queries to check for table privileges, table existence, function existence, and trigger existence. These tests verify if the user has SELECT privileges on the my_schema.users table, if the table exists, if the log_product_changes function is defined, and if the product_price_change trigger is present. Once the tests are complete, the results are displayed using SELECT * FROM FINISH();, and the transaction is rolled back with ROLLBACK; to ensure no data is affected by the test. This setup ensures that the test environment is properly controlled, errors are handled, and the database remains unaffected by the test operations.

Now we will run the test.sql file with pg_prove as mention using above command
Run the following command, replacing [database name] and [hostname] with the name of your database and host name:

pg_prove -h host.rds.amazonaws.com -p 5432 -d databaseName -U postgres -v path/to/test/directory/*.sql  > results.txt
Bash

Explanation of the output:

output:

1..4
ok 1 - privilege
ok 2 - my_table_exists
ok 3 - Check if function exists!
ok 4 - Check if trigger "product_price_change" exists
All tests successful.
Bash

The output indicates that a total of 4 tests were executed, all of which passed successfully. The tests checked various aspects like, the first test confirmed that the user has the required SELECT privileges on the relevant table, the second test verified that the my_schema.users table exists, the third test ensured that the log_product_changes function is present, and the fourth test checked the existence of the product_price_change trigger. The message “All tests successful” at the end confirms that all the checks were completed without any issues, indicating that the database schema is correctly configured with the necessary privileges, table, function, and trigger.

2. Data integrity test

This test case verifies the integrity of the data in a table by checking specific conditions, ensuring that the table exists, and contains exactly three rows.

In this scenario, we use a command-line tool or client utility, such as PgAdmin or psql, to execute the test in the user interface.

BEGIN;

SELECT PLAN (2);

SELECT results_eq('SELECT COUNT(*)::int  AS no_duplicate_emails  FROM (SELECT email, COUNT(*) as count FROM users  GROUP BY email  HAVING COUNT(*) > 1) AS duplicates',ARRAY[0], 'no_duplicate_emails');
      
SELECT results_eq('SELECT COUNT(*)::int  AS users_table_has_3_rows FROM users; ',ARRAY[3],'row count matched');
            

-- Finish the tests and clean up.
SELECT * FROM FINISH();
ROLLBACK;
SQL

output:

1..2
ok 1 - no_duplicate_emails
ok 2 - row count matched
Bash

Here we are in first query checking duplication of email in our test table users and in the second query we matched expected count of table.

3. Function test

This test case verifies the correctness of a function’s output. The test calls the calculate_total_price function with parameters 5 and 10.99 and checks if the returned value is equal to 54.95:

\unset ECHO
\set QUIET 1
-- Turn off echo and keep things quiet.

-- Format the output for nice TAP.
\pset format unaligned
\pset tuples_only true
\pset pager off

-- Revert all changes on failure.
\set ON_ERROR_ROLLBACK 1
\set ON_ERROR_STOP true

BEGIN;

SELECT PLAN (1);

SELECT IS(CALCULATE_TOTAL_PRICE(5, 10.99), 54.95, 'correct_total_price');

-- Finish the tests and clean up.
SELECT * FROM FINISH();
ROLLBACK;
SQL

output:

1..1
ok 1 - correct_total_price
Bash

4. Trigger test

This test case checks if a trigger performs the expected actions when certain conditions are met. The test inserts a new row into the products table, invoking the product_price_change trigger. It then checks if the trigger was run by verifying if a corresponding row exists in the price_changes table for the inserted product.

Always use set session_replication_role = ‘origin’, by setting this, PostgreSQL will fire the triggers, else it won’t while running pg_tap test.

\unset ECHO
\set QUIET 1
-- Turn off echo and keep things quiet.

-- Format the output for nice TAP.
\pset format unaligned
\pset tuples_only true
\pset pager off

-- Revert all changes on failure.
\set ON_ERROR_ROLLBACK 1
\set ON_ERROR_STOP true

BEGIN;

SELECT PLAN (2);

set session_replication_role = 'origin';

-- update an old  product with a different price to trigger the price change trigger

update products  set price = 24.99 where fname = 'Widget A';

SELECT has_trigger('my_schema','products', 'product_price_change', 'trigger exists on product table and insertions on price_changes' );

   select results_eq('SELECT COUNT(*)::int   AS price_change_logged FROM price_changes WHERE product_id = (SELECT id FROM products WHERE fname= ''Widget A'') AND old_price = 9.99    AND new_price = 24.99;',ARRAY[1],'trigger executed for new insertion row');

 
-- Finish the tests and clean up.
SELECT * FROM FINISH();
ROLLBACK;
SQL

output:

1..2
ok 1 - trigger exists on product table and insertions on price_changes
ok 2 - trigger executed for new insertion row
Bash

List of Assertion functions

The following table lists a few more widely used assertion functions.

PGTAP Function Description
has_table and hasnt_table Checks whether the specified table exists.
has_view Checks whether the specified view exists.
has_materialized_view Checks whether the specified materialized view exists.
has_index Checks whether a table has the specified index.
has_relation Checks whether the specified relation exists, which can be a table, index, or sequence.
function_returns Checks whether a particular function returns a particular data type.
is_definer Checks whether a function is a security definer.
col_is_pk Checks whether a column is a primary key column of a table.
col_isnt_pk Checks whether a column is not a primary key of a table.
col_isnt_fk Checks whether a column is not a foreign key of a table.
has_column Checks whether a table contains a column.
hasnt_column Checks whether a table does not contain a column.
policy_cmd_is Checks whether the specified row-level security policy exists.
policy_roles_are Checks whether the row-level security policy applies to all users. The function returns TRUE if all users to which the row-level security policy applies are specified.
policies_are Checks whether a table contains a row-level security policy.

For a full list of all the assertion functions, refer to the pgTAP documentation.

Conclusion

In this post, we explored various unit testing options for user-defined functions in PostgreSQL and demonstrated how to create example tests using the pgTAP framework. By incorporating pgTAP, you can significantly enhance database reliability, stability, and security. pgTAP helps improve code quality while simplifying the testing process. With it, developers can write tests using psql scripts or xUnit-style functions, covering critical aspects like functions, triggers, and indexed columns.

For more information about using pgTAP, refer to the pgTAP documentation. If you have any comments or questions, please leave them in the comments section.


About Authors


Neha Sharma is a Delivery Consultant at AWS, working directly with customers to enable and accelerate their adoption of AI/ML services and onboarding to new AWS offerings. With a background in analytics and databases, she has previously supported customers in their cloud migration journeys, helping modernize and transition workloads from legacy systems to AWS.

Vishal Sanzira is a Database Engineer with Amazon Web Services. He has a passion for database and has been working with them for more than 7 years, mostly with PostgreSQL and migrating databases to the AWS Cloud.

Venkata Prasad Reddy Somala has over 10 years of experience in database management and nearly 2 years specializing in database migrations. He is fully committed in assisting and supporting customers and ensure a smooth and efficient transition for every client migrating their databases to the cloud.