AWS Database Blog

Index types supported in Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL using extensions (SP-GiST, Btree_Gin and Btree_Gist)

Are you working with spatial data, network addresses, or complex multi-column queries in your PostgreSQL databases and wondering which index type gives you the best performance? In Part 1 and Part 2 of this series, we explored PostgreSQL’s native indexes (B-tree, GIN, GiST, HASH, and BRIN). In this post, the third in the series, we dive into three extension-based index types: SP-GiST, btree_gin, and btree_gist. These are available in Amazon Aurora PostgreSQL-Compatible Edition and Amazon Relational Database Service (Amazon RDS) for PostgreSQL.

PostgreSQL’s index infrastructure is extensible. Operator classes define how indexes behave for specific data types and operations. The SP-GiST, btree_gin, and btree_gist extensions take advantage of this extensibility to give you additional indexing strategies beyond the native options. We walk through when to use each extension, the data types they support, and practical examples that demonstrate their performance benefits.

Prerequisites

We use SQL examples throughout this post to demonstrate concepts. If you want to reproduce the examples in your environment, you need the following:

  • An Amazon RDS for PostgreSQL or Aurora PostgreSQL database instance
  • A client machine with psql installed and a connection to the database

For more information, refer to Connecting to a DB instance running the PostgreSQL database engine.

Refer to the following script to generate the sample dataset. This script generates a sample dataset for an products_infor table with various columns containing different data types like character strings, integers, dates, arrays, and JSONB. It uses PostgreSQL functions like RANDOM(), REPEAT(), and GENERATE_SERIES() to generate random sample data that will used throughout this post.

CREATE TABLE products_infor(
    id SERIAL PRIMARY KEY,
    name TEXT,
    category TEXT,
    price NUMERIC(10,2),
    warehouse_location POINT,        -- Warehouse location coordinates
    device_ip INET,                  -- Network address for products digital interface/IoT devices
    temperature_range INT4RANGE,     -- Storage temperature range
    price_range NUMRANGE,            -- Price range history
    availability_period TSRANGE,     -- Time range for product availability
    delivery_zone POLYGON,           -- Coverage area for delivery/service
    description TEXT,                -- Detailed product description
    dimensions POINT,                -- Product dimensions (length, width, height)
    tags TEXT[],                     -- Product tags/categories (array column for btree_gin demo)
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO products_infor (
    name,
    category,
    description,
    price,
    warehouse_location,
    device_ip,
    temperature_range,
    price_range,
    availability_period,
    delivery_zone,
    dimensions,
    tags,
    created_at
)
SELECT
    'Product_' || i AS name,
    (ARRAY['electronics', 'furniture', 'clothing', 'food', 'toys'])[1 + (i % 5)] AS category,
    CASE
        WHEN random() < 0.3 THEN 'Cardamom' || substr(REPEAT('abcdefghijklmnopqrstuvwxyz',40), ((random()*(30-1)+1)::integer), 10)
        WHEN random() < 0.3 THEN 'Clove' || substr(REPEAT('abcdefghijklmnopqrstuvwxyz',40), ((random()*(30-1)+1)::integer), 10)
        ELSE substr(REPEAT('abcdefghijklmnopqrstuvwxyz',40), ((random()*(30-1)+1)::integer), 15)
    END AS product_description,
    (random() * 990 + 10)::numeric(10,2) AS price,
    POINT(0 + (random() * 20), -12 + (random() * 40)) AS warehouse_location,
    ('10.' || (random() * 255)::int || '.' || (random() * 255)::int || '.' || (random() * 255)::int)::inet AS device_ip,
    int4range((random() * 30 - 20)::int, (random() * 30 + 10)::int) AS temperature_range,
    numrange((random() * 490 + 10)::numeric(10,2), (random() * 490 + 510)::numeric(10,2)) AS price_range,
    tsrange(current_date, current_date + (random() * interval '365 days')) AS availability_period,
    polygon(box(point(i*random() - random(), i*random() - random()), point(i*random() + random(), i*random() + random()))) AS delivery_zone,
    POINT((random() * 100)::int, (random() * 100)::int) AS dimensions,
    ARRAY(
        SELECT (ARRAY['wireless', 'bluetooth', 'premium', 'budget', 'eco-friendly',
                      'waterproof', 'portable', 'smart', 'organic', 'recycled'])[1 + (random()*9)::int]
        FROM generate_series(1, (random() * 3 + 1)::int)
    ) AS tags,
    CURRENT_TIMESTAMP AS created_at
FROM generate_series(1, 2000000) i;

SP-GiST

SP-GiST (Space-Partitioned Generalized Search Tree) is a specialized indexing framework in PostgreSQL that supports partitioned search trees. It is ideal for implementing non-balanced data structures like quadtrees, k-d trees, and radix trees. Its key feature is the ability to divide search spaces into partitions of varying sizes, enabling very fast searches when queries align well with the partitioning strategy.

The main challenge that SP-GiST addresses is efficiently mapping tree nodes to disk pages to minimize I/O operations during searches. In-memory tree structures often use chains of pointers that result in many levels and each level requires a separate disk read when stored on disk. SP-GiST solves this by maintaining a high fanout (many children per node), which keeps the tree shallow and reduces the number of disk pages accessed during a search.

The index structure consists of two main components:

  • Leaf tuples — Containing actual data values or their representations
  • Inner tuples — Serving as branching points with multiple nodes, each having downlinks to lower-level tuples

Let’s examine some use cases of when an SP-GiST index should be used. Suppose you need to find the nearest warehouse to a point.

EXPLAIN ANALYZE SELECT name, warehouse_location, warehouse_location <-> POINT(35, -100) as distance
FROM products_infor ORDER BY warehouse_location <-> POINT(35, -100) LIMIT 5;

QUERY PLAN
-----------------------------------------------------------------------------------
Limit (cost=114298.70..114299.27 rows=5 width=56) (actual time=6221.163..6223.497 rows=5.00 loops=1)
Buffers: shared hit=295 read=83078 dirtied=50043 written=34900
-> Gather Merge (cost=114298.70..313746.55 rows=1749994 width=56) (actual time=6221.162..6223.495 rows=5.00 loops=1)
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=295 read=83078 dirtied=50043 written=34900
-> Sort (cost=113298.69..115872.21 rows=1029408 width=56) (actual time=6193.218..6193.219 rows=5.00 loops=2)
Sort Key: ((warehouse_location <-> '(35,-100)'::point))
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=295 read=83078 dirtied=50043 written=34900
Worker 0: Sort Method: top-N heapsort Memory: 25kB
-> Parallel Seq Scan on products_infor (cost=0.00..96200.60 rows=1029408 width=56) (actual time=0.682..6119.520 rows=1000000.00 loops=2)
Buffers: shared hit=256 read=83077 dirtied=50043 written=34900
Planning:
Buffers: shared hit=10 read=9
Planning Time: 8.583 ms
Execution Time: 6224.052 ms

As shown in the preceding output, this query performs a table scan and takes about 6 seconds to run on the test dataset. Let’s create an SP-GiST index on warehouse_location:

CREATE INDEX idx_prod_warehouse_kd ON products_infor USING spgist(warehouse_location kd_point_ops);
EXPLAIN ANALYZE SELECT name, warehouse_location, warehouse_location <-> POINT(35, -100) as distance
FROM products_infor ORDER BY warehouse_location <-> POINT(35, -100) LIMIT 5;
QUERY PLAN
----------------------------------------------------------------------------------
Limit (cost=0.41..1.48 rows=5 width=56) (actual time=6.246..6.608 rows=5.00 loops=1)
Buffers: shared hit=320 read=5
-> Index Scan using idx_prod_warehouse_kd on products_infor (cost=0.41..425212.41 rows=2000000 width=56) (actual time=6.245..6.605 rows=5.00 loops=1)
Order By: (warehouse_location <-> '(35,-100)'::point)
Index Searches: 1
Buffers: shared hit=320 read=5
Planning:
Buffers: shared hit=14 read=5 dirtied=1
Planning Time: 2.862 ms
Execution Time: 6.669 ms

Let’s see another example: find products with descriptions starting with “Cardamom”:

EXPLAIN ANALYZE SELECT name, description FROM products_infor WHERE description ^@ 'Cardamom sp';
QUERY PLAN
-----------------------------------------------------------------------------------
Gather (cost=1000.00..102663.06 rows=1 width=31) (actual time=287.524..288.734 rows=0.00 loops=1)
Workers Planned: 1
Workers Launched: 1
Buffers: shared hit=8189 read=78768
-> Parallel Seq Scan on products_infor (cost=0.00..101662.96 rows=1 width=31) (actual time=265.569..265.569 rows=0.00 loops=2)
Filter: (description ^@ 'Cardamom sp'::text)
Rows Removed by Filter: 1000000
Buffers: shared hit=8189 read=78768
Planning:
Buffers: shared hit=5
Planning Time: 0.769 ms
Execution Time: 290.296 ms
CREATE INDEX idx_prod_warehouse_txt ON products_infor USING spgist(description text_ops);
EXPLAIN ANALYZE SELECT name, description FROM products_infor WHERE description ^@ 'Cardamom sp';
QUERY PLAN
-----------------------------------------------------------------------------------
Index Scan using idx_prod_warehouse_txt on products_infor (cost=0.29..8.30 rows=1 width=31) (actual time=0.032..0.033 rows=0.00 loops=1)
Index Cond: (description ^@ 'Cardamom sp'::text)
Index Searches: 1
Buffers: shared hit=2
Planning:
Buffers: shared hit=16
Planning Time: 11.211 ms
Execution Time: 0.061 ms

SP-GiST only supports limited data types and operators and can’t be used to improve ORDER BY performance. SP-GiST indexes have a higher performance penalty on write operations (inserts and updates) to keep the index updated, so they’re recommended for columns whose values are static or rarely change.

You can reduce the write performance penalty by separating SP-GiST indexed columns into a separate table from frequently updated columns. For example, consider a shipments table with columns tracking_id, description (used for prefix searches), status, and last_location. If you create an SP-GiST index on description for prefix searches, but status and last_location are updated frequently (such as every time a package moves), every update triggers an index tuple update even though description hasn’t changed. Moving description to a separate table like shipment_details eliminates this penalty.

btree_gin

The btree_gin extension adds the capability to use GIN indexes with data types that are normally indexed using B-tree indexes. This becomes particularly valuable when you need to create a single GIN index that combines array or composite columns (which require GIN indexing) with scalar columns (which normally use B-tree indexes). For example, in the products_infor table, you might need to filter on both the tags array column and scalar columns like category in a single query.

PostgreSQL’s query planner can efficiently combine multiple B-tree indexes using bitmap scans when running queries with multiple filter conditions. However, when your query needs to filter on both an array column (requiring GIN) and a scalar column (using B-tree), the planner must combine these different index types using a BitmapAnd operation, which requires scanning two separate indexes and merging the results.

This is where btree_gin provides a solution. It allows you to create a unified GIN index that spans both array and scalar columns, so PostgreSQL can evaluate all conditions in a single index scan without the overhead of combining separate indexes.

The following example shows how PostgreSQL handles separate B-tree and GIN indexes using a BitmapAnd to combine their results, and then demonstrates how a unified btree_gin index eliminates this merge step.

Create separate indexes (the traditional approach):

CREATE INDEX idx_tags_gin ON products_infor USING gin(tags);
CREATE INDEX idx_category_btree ON products_infor USING btree(category);

EXPLAIN ANALYZE
SELECT name, category, tags FROM products_infor
WHERE tags @> ARRAY['wireless', 'premium']
AND category = 'electronics';

Next, run this query that filters on both array and scalar columns.

QUERY PLAN
-----------------------------------------------------------------------------------
Bitmap Heap Scan on products_infor (cost=5197.49..12421.52 rows=2047 width=95) (actual time=56.820..56.820 rows=0.00 loops=1)
Recheck Cond: ((tags @> '{wireless}'::text[]) AND (category = 'electronics'::text))
Buffers: shared hit=190
-> BitmapAnd (cost=5197.49..5197.49 rows=2047 width=0) (actual time=56.812..56.813 rows=0.00 loops=1)
Buffers: shared hit=190
-> Bitmap Index Scan on idx_tags_gin (cost=0.00..880.74 rows=10455 width=0) (actual time=56.812..56.812 rows=0.00 loops=1)
Index Cond: (tags @> '{wireless}'::text[])
Index Searches: 1
Buffers: shared hit=190
-> Bitmap Index Scan on idx_category_btree (cost=0.00..4315.48 rows=409474 width=0) (never executed)
Index Cond: (category = 'electronics'::text)
Index Searches: 0
Planning:
Buffers: shared hit=3
Planning Time: 0.797 ms
Execution Time: 56.852 ms

Notice that the query plan uses the B-tree and GIN indexes (idx_category_btree and idx_tags_gin) with a BitmapAnd operation. The tags filter appears as a post-scan Filter rather than an Index Cond, meaning the GIN index on tags isn’t being combined with the B-tree indexes.

The btree_gin solution

With the btree_gin extension, you can create a unified GIN index that includes both array and scalar columns. PostgreSQL can then use a single index scan for all filter conditions.

DROP INDEX IF EXISTS idx_tags_gin;
DROP INDEX IF EXISTS idx_category_btree;

Create a unified GIN index using btree_gin

CREATE EXTENSION btree_gin;
CREATE INDEX idx_unified_gin ON products_infor USING gin(tags,category);

Run the same query again

EXPLAIN ANALYZE
SELECT name, category, tags FROM products_infor
WHERE tags @> ARRAY[‘wireless’, ‘premium’]
AND category = ‘electronics’;
QUERY PLAN
----------------------------------------------------------------------------------
Bitmap Heap Scan on products_infor (cost=53.71..7237.60 rows=2035 width=104) (actual time=8.344..8.345 rows=0.00 loops=1)
Recheck Cond: ((tags @> '{wireless,premium}'::text[]) AND (category = 'electronics'::text))
Buffers: shared hit=14
-> Bitmap Index Scan on idx_unified_gin (cost=0.00..53.20 rows=2035 width=0) (actual time=8.323..8.323 rows=0.00 loops=1)
Index Cond: ((tags @> '{wireless,premium}'::text[]) AND (category = 'electronics'::text))
Index Searches: 1
Buffers: shared hit=14
Planning:
Buffers: shared hit=1
Planning Time: 0.333 ms
Execution Time: 8.399 ms

The btree_gin advantage

With the unified GIN index, notice the difference in the execution plan:

  1. PostgreSQL uses a single Bitmap Index Scan on idx_unified_gin.
  2. All filter conditions are evaluated within the index: the array condition (tags @> '{wireless, premium}') and the equality condition (category = 'electronics').
  3. No separate bitmap operations are needed to combine different index types.

The value of btree_gin is in unified indexing capabilities when you need to filter on both array and scalar columns together, particularly for queries with higher selectivity or when avoiding the complexity of managing separate index types.

When to use btree_gin

The btree_gin extension is most valuable when:

  • You have queries that filter on both array or composite columns and scalar columns together.
  • The array column requires GIN-specific operators like @> (contains), && (overlaps), or <@ (contained by).
  • You want to give the optimizer the option to use a single unified index instead of combining separate GIN and B-tree indexes.

Without btree_gin, the PostgreSQL optimizer is limited to either:

  • Using the GIN index on the array column and applying scalar filters as a post-scan operation.
  • Using the B-tree indexes on scalar columns and applying the array filter as a post-scan operation.
  • Combining both with a BitmapAnd operation, which requires scanning two separate indexes and merging the resulting bitmaps.

When you create a btree_gin index that spans both array and scalar columns, the optimizer can use a single index scan to evaluate all filter conditions together.

Important performance considerations

Although btree_gin enables powerful multi-column indexing, it comes with significant trade-offs related to GIN index maintenance:

GIN update overhead: GIN indexes are substantially slower to update compared to B-tree indexes. Every INSERT, UPDATE, or DELETE operation that modifies indexed columns requires maintaining the GIN index. This involves either appending to a pending list (fastupdate by default) or directly updating the posting tree structure. Both are more complex and resource-intensive than B-tree updates.

Pending list flush overhead: On tables with frequent writes, GIN indexes accumulate updates in a pending list (controlled by gin_pending_list_limit). When this list is flushed to the main index (triggered by the limit being reached or during VACUUM) it can cause buffer contention and temporary performance degradation. Tuning gin_pending_list_limit controls the batch size but doesn’t eliminate the underlying cost.

Alternative for high-update tables: If your table experiences high update volumes, consider using btree_gist instead (covered in the next section). GiST indexes generally handle updates more efficiently than GIN indexes, though they may have different query performance characteristics.

btree_gist

The btree_gist extension adds the capability to use GiST indexes with data types that are normally indexed using B-tree indexes. This is particularly valuable in two scenarios: when you need to create a single GiST index that combines range-type columns (which require GiST) with scalar columns (which normally use B-tree indexes), or when you need to implement exclusion constraints on scalar columns (which require GiST operator classes that only btree_gist provides).

PostgreSQL’s query planner can efficiently combine multiple B-tree indexes using bitmap scans when running queries with multiple filter conditions. However, when your query filters on both range-type columns (requiring GiST for overlap operators like &&) and scalar columns (using B-tree), the planner must scan each index separately and combine the results using a BitmapAnd operation. This adds overhead compared to a single multicolumn index that handles both conditions in one lookup.

Example

Let’s first see how PostgreSQL handles separate indexes that can’t be combined effectively when dealing with range overlap queries.

First, create separate indexes (the traditional approach). Note: We already have idx_unified_gin from the btree_gin section, so let’s drop it first.

DROP INDEX IF EXISTS idx_unified_gin;

Create separate indexes

CREATE INDEX idx_category_btree ON products_infor USING btree(category);
CREATE INDEX idx_price_range_gist ON products_infor USING gist(price_range);
CREATE INDEX idx_availability_gist ON products_infor USING gist(availability_period);

Now run this query that filters on both scalar and range columns with overlap operators

EXPLAIN ANALYZE
SELECT name, category, price_range, availability_period FROM products_infor
WHERE category = 'electronics'
AND price_range && numrange(100.00, 150.00)
AND availability_period && tsrange('2026-06-01 00:00:00', '2026-06-15 00:00:00');

QUERY PLAN
-----------------------------------------------------------------------------------
Bitmap Heap Scan on products_infor (cost=4477.45..120339.58 rows=112098 width=64) (actual time=27.690..4219.818 rows=110768.00 loops=1)
Recheck Cond: (category = 'electronics'::text)
Rows Removed by Index Recheck: 581194
Filter: ((price_range && '[100.00,150.00)'::numrange) AND (availability_period && '["2026-06-01 00:00:00","2026-06-15 00:00:00")'::tsrange))
Rows Removed by Filter: 289232
Heap Blocks: exact=57885 lossy=33025
Buffers: shared read=91255 written=3
-> Bitmap Index Scan on idx_category_btree (cost=0.00..4449.43 rows=407067 width=0) (actual time=19.384..19.385 rows=400000.00 loops=1)
Index Cond: (category = 'electronics'::text)
Index Searches: 1
Buffers: shared read=345
Planning:
Buffers: shared hit=38 read=1
Planning Time: 1.661 ms
Execution Time: 4237.050 ms

Even with separate GiST indexes on price_range and availability_period, the planner chooses to use only the B-tree index on category and applies the range overlap conditions as post-scan heap filters.

Notice that the GiST indexes on price_range and availability_period are completely ignored. The planner scans 400,000 rows using the B-tree index on category, then discards 289,232 rows by applying the range conditions as heap filters. This results in significant wasted I/O—reading rows from the heap only to discard them.

The core issue is that the planner’s cost model determines it is cheaper to filter in the heap than to perform a BitmapAnd across three separate indexes of different types (B-tree plus GiST). This leaves the range overlap conditions unaided by any index.

The btree_gist solution

The btree_gist extension solves this problem by allowing you to create a unified GiST index that includes both range-type and scalar columns. PostgreSQL can then evaluate all filter conditions within a single index scan operation.

Drop the separate indexes

DROP INDEX IF EXISTS idx_category_btree;
DROP INDEX IF EXISTS idx_price_range_gist;
DROP INDEX IF EXISTS idx_availability_gist;

First, enable the btree_gist extension

CREATE EXTENSION IF NOT EXISTS btree_gist;

Create a unified GiST index using btree_gist

CREATE INDEX idx_unified_gist ON products_infor USING gist(category, price_range, availability_period);

Run the same query again

EXPLAIN ANALYZE
SELECT name, category, price_range, availability_period FROM products_infor
WHERE category = 'electronics'
AND price_range && numrange(100.00, 150.00)
AND availability_period && tsrange('2026-06-01 00:00:00', '2026-06-15 00:00:00');

QUERY PLAN
-----------------------------------------------------------------------------------
Bitmap Heap Scan on products_infor (cost=7397.66..122472.84 rows=112098 width=64) (actual time=145.562..793.022 rows=110768.00 loops=1)
Recheck Cond: ((category = 'electronics'::text) AND (price_range && '[100.00,150.00)'::numrange) AND (availability_period && '["2026-06-01 00:00:00","2026-06-15 00:00:00")'::tsrange))
Rows Removed by Index Recheck: 674914
Heap Blocks: exact=35554 lossy=33140
Buffers: shared read=70385 written=11
-> Bitmap Index Scan on idx_unified_gist (cost=0.00..7369.64 rows=112098 width=0) (actual time=141.741..141.741 rows=110768.00 loops=1)
Index Cond: ((category = 'electronics'::text) AND (price_range && '[100.00,150.00)'::numrange) AND (availability_period && '["2026-06-01 00:00:00","2026-06-15 00:00:00")'::tsrange))
Index Searches: 1
Buffers: shared read=1691
Planning:
Buffers: shared hit=21 read=1 dirtied=1
Planning Time: 1.637 ms
Execution Time: 795.901 ms

The unified GiST index (793 ms) was significantly faster than the separate-index approach (4,237 ms).

Because of query selectivity (when queries return a significant percentage of table rows), you might notice that the unified GiST index performs slower than a sequential scan in some cases. The value of btree_gist isn’t solely in raw query speed. It’s in the capabilities that weren’t possible before, particularly exclusion constraints and unified indexing across different data types.

Exclusion constraints

btree_gist enables exclusion constraints that span both scalar and range-type columns. Exclusion constraints prevent overlapping values across multiple columns, enforcing complex business rules at the database level.

For example, consider the rule: “No two bookings for the same room can have overlapping time ranges.” Enforcing this requires both the room_id (scalar) and the time range (range type) in the same index. B-tree can’t perform range overlap checks, and GiST can’t natively handle scalar equality. btree_gist bridges that gap, making such constraints possible.

In the following example, we apply this concept to product pricing—making sure that a product can’t have overlapping price ranges during overlapping time periods.

Create a new table to demonstrate exclusion constraints

CREATE TABLE product_pricing (
    product_id INTEGER,
    product_name TEXT,
    price_range NUMRANGE,
    effective_period TSRANGE,
    -- Exclusion constraint: prevent overlapping price ranges for the same product during overlapping time periods
    EXCLUDE USING gist (
        product_id WITH =,
        price_range WITH &&,
        effective_period WITH &&
    )
);

Try to insert valid non-overlapping data (should succeed)

INSERT INTO product_pricing VALUES
(1, 'Laptop', numrange(800, 1200), tsrange('2025-01-01', '2025-06-30')),
(1, 'Laptop', numrange(750, 1100), tsrange('2025-07-01', '2025-12-31')),
(2, 'Monitor', numrange(200, 400), tsrange('2025-01-01', '2025-12-31'));
/*
INSERT 0 3
Query returned successfully in 30 msec.
*/

Verify the data was inserted

SELECT * FROM product_pricing ORDER BY product_id, effective_period;
/*
"product_id" "product_name" "price_range" "effective_period"
1 "Laptop" "[800,1200)" "[""2025-01-01 00:00:00"",""2025-06-30 00:00:00"")"
1 "Laptop" "[750,1100)" "[""2025-07-01 00:00:00"",""2025-12-31 00:00:00"")"
2 "Monitor" "[200,400)" "[""2025-01-01 00:00:00"",""2025-12-31 00:00:00"")"
*/

Now try to insert overlapping data (this should fail with constraint violation)

INSERT INTO product_pricing VALUES
(1, 'Laptop', numrange(900, 1300), tsrange('2025-05-01', '2025-08-31'));
/*
ERROR: conflicting key value violates exclusion constraint "product_pricing_product_id_price_range_effective_period_excl"
Key (product_id, price_range, effective_period)=(1, [900,1300), ["2025-05-01 00:00:00","2025-08-31 00:00:00")) conflicts with existing key (product_id, price_range, effective_period)=(1, [800,1200), ["2025-01-01 00:00:00","2025-06-30 00:00:00")).
SQL state: 23P01
Detail: Key (product_id, price_range, effective_period)=(1, [900,1300), ["2025-05-01 00:00:00","2025-08-31 00:00:00")) conflicts with existing key (product_id, price_range, effective_period)=(1, [800,1200), ["2025-01-01 00:00:00","2025-06-30 00:00:00")).
*/

The exclusion constraint successfully prevented the insertion of overlapping data. Here’s what happened:

Successful inserts:

  • Product 1 (Laptop): Price range [800,1200) during Jan-Jun 2025.
  • Product 1 (Laptop): Price range [750,1100) during Jul-Dec 2025.
  • Product 2 (Monitor): Price range [200,400) during Jan-Dec 2025.

These inserts succeeded because:

  • The two Laptop entries have non-overlapping time periods (June 30 ends, July 1 begins).
  • The Monitor entry is for a different product_id.

Failed insert:

ERROR: conflicting key value violates exclusion constraint
Key (product_id, price_range, effective_period)=(1, [900,1300), ["2025-05-01","2025-08-31"))
conflicts with existing key (product_id, price_range, effective_period)=(1, [800,1200), ["2025-01-01","2025-06-30"))

This insert failed because:

  • Same product_id (1).
  • Overlapping price ranges: [900,1300) overlaps with [800,1200).
  • Overlapping time periods: May-Aug 2025 overlaps with Jan-Jun 2025.

The exclusion constraint enforces the business rule: “A product cannot have overlapping price ranges during overlapping time periods.” This helps prevent data integrity violations at the database level, reducing the need for application-level validation.

Performance considerations: GiST compared to GIN

As mentioned in the btree_gin section, GIN indexes have significant update overhead and can cause buffer contention issues during VACUUM operations on high-update tables. The btree_gist extension provides an important alternative.

When to choose btree_gist over btree_gin:

  • High-update workloads: GiST indexes handle updates more efficiently than GIN indexes.
  • Exclusion constraints: This is a unique capability only available with GiST indexes.
  • Range queries: GiST is optimized for range overlap operations.

When to choose btree_gin over btree_gist:

  • Array containment queries: GIN is more efficient for array operators like @> (contains).
  • Full-text search: GIN is the preferred index type for text search operations.
  • Read-heavy workloads: GIN can provide better query performance when updates are infrequent.

Considerations when using btree_gist

Although btree_gist enables powerful capabilities, there are important trade-offs to consider. The extension is primarily valuable for exclusion constraints and unified indexing across scalar and range-type columns, rather than for raw query performance improvements.

Query performance with btree_gist indexes depends heavily on query selectivity. When queries return a significant percentage of table rows, PostgreSQL may prefer sequential scans over index scans, regardless of the index type. This is expected behavior. The query planner chooses the most efficient execution strategy based on cost estimates.

From a maintenance perspective, although GiST indexes handle updates more efficiently than GIN indexes, they still require more overhead than B-tree indexes for write operations. The index structure must be updated for every INSERT, UPDATE, or DELETE operation affecting indexed columns.

Finally, btree_gist has operator limitations. The extension supports equality (=) and comparison operators (<, <=, >, >=) for scalar types, but doesn’t support all operators available with native B-tree indexes. For example, pattern matching operators like LIKE or ILIKE are not supported within GiST indexes.

Before implementing these indexes in production, test with your actual workload, not just individual queries that benefit from the index. An index that improves one query may degrade overall performance because of write overhead across your entire workload.

Consider using Amazon Aurora Fast Database Cloning to create test environments where you can evaluate the impact of different indexing strategies on overall performance without affecting production workloads.

Conclusion

In this post, we discussed specialized indexing techniques supported in Aurora PostgreSQL and Amazon RDS for PostgreSQL through extensions. SP-GiST indexes enable efficient spatial and partitioned search operations, btree_gin allows unified GIN indexing across array and scalar columns for read-heavy workloads, and btree_gist enables exclusion constraints and unified GiST indexing for high-update workloads with range overlap operations.

These extension-based indexes complement PostgreSQL’s native indexing capabilities by addressing specific use cases that standard B-tree, GIN, and GiST indexes don’t handle efficiently. When you implement these indexes, analyze the performance impact on your overall workload and test thoroughly before production deployment. You can use Aurora Fast Clone to create test environments for evaluating index strategies without affecting production systems.

In Part 4 of this series, we discuss specialized text search indexes supported through database extensions like Bloom, pg_trgm, and pg_bigm. To learn more, see the Amazon Aurora PostgreSQL documentation and the Amazon RDS for PostgreSQL documentation.


About the authors

Sachin Khanna

Sachin Khanna

Sachin is a Lead Consultant specializing in Artificial Intelligence and Machine Learning (AI/ML) within the AWS Professional Services team. With a strong background in data management, generative AI, large language models, and machine learning, he brings extensive expertise to projects involving data, databases, and AI-driven solutions. His proficiency in cloud migration and cost optimization has enabled him to guide customers through successful cloud adoption journeys, delivering tailored solutions and strategic insights.

Rajkumar Raghuwanshi

Rajkumar Raghuwanshi

Rajkumar is a Delivery Consultant with AWS Professional Services based out of Pune, India. With extensive knowledge of relational databases and hands-on experience in homogenous and heterogenous database migrations, he helps customers migrate to the AWS Cloud, code conversion and optimize performance.