AWS Database Blog
Index types supported in Amazon Aurora PostgreSQL and Amazon RDS for PostgreSQL using extensions (SP-GiST, Btree_Gin and Btree_Gist)
Are you working with spatial data, network addresses, or complex multi-column queries in your PostgreSQL databases and wondering which index type gives you the best performance? In Part 1 and Part 2 of this series, we explored PostgreSQL’s native indexes (B-tree, GIN, GiST, HASH, and BRIN). In this post, the third in the series, we dive into three extension-based index types: SP-GiST, btree_gin, and btree_gist. These are available in Amazon Aurora PostgreSQL-Compatible Edition and Amazon Relational Database Service (Amazon RDS) for PostgreSQL.
PostgreSQL’s index infrastructure is extensible. Operator classes define how indexes behave for specific data types and operations. The SP-GiST, btree_gin, and btree_gist extensions take advantage of this extensibility to give you additional indexing strategies beyond the native options. We walk through when to use each extension, the data types they support, and practical examples that demonstrate their performance benefits.
Prerequisites
We use SQL examples throughout this post to demonstrate concepts. If you want to reproduce the examples in your environment, you need the following:
- An Amazon RDS for PostgreSQL or Aurora PostgreSQL database instance
- A client machine with psql installed and a connection to the database
For more information, refer to Connecting to a DB instance running the PostgreSQL database engine.
Refer to the following script to generate the sample dataset. This script generates a sample dataset for an products_infor table with various columns containing different data types like character strings, integers, dates, arrays, and JSONB. It uses PostgreSQL functions like RANDOM(), REPEAT(), and GENERATE_SERIES() to generate random sample data that will used throughout this post.
SP-GiST
SP-GiST (Space-Partitioned Generalized Search Tree) is a specialized indexing framework in PostgreSQL that supports partitioned search trees. It is ideal for implementing non-balanced data structures like quadtrees, k-d trees, and radix trees. Its key feature is the ability to divide search spaces into partitions of varying sizes, enabling very fast searches when queries align well with the partitioning strategy.
The main challenge that SP-GiST addresses is efficiently mapping tree nodes to disk pages to minimize I/O operations during searches. In-memory tree structures often use chains of pointers that result in many levels and each level requires a separate disk read when stored on disk. SP-GiST solves this by maintaining a high fanout (many children per node), which keeps the tree shallow and reduces the number of disk pages accessed during a search.
The index structure consists of two main components:
- Leaf tuples — Containing actual data values or their representations
- Inner tuples — Serving as branching points with multiple nodes, each having downlinks to lower-level tuples
Let’s examine some use cases of when an SP-GiST index should be used. Suppose you need to find the nearest warehouse to a point.
As shown in the preceding output, this query performs a table scan and takes about 6 seconds to run on the test dataset. Let’s create an SP-GiST index on warehouse_location:
Let’s see another example: find products with descriptions starting with “Cardamom”:
SP-GiST only supports limited data types and operators and can’t be used to improve ORDER BY performance. SP-GiST indexes have a higher performance penalty on write operations (inserts and updates) to keep the index updated, so they’re recommended for columns whose values are static or rarely change.
You can reduce the write performance penalty by separating SP-GiST indexed columns into a separate table from frequently updated columns. For example, consider a shipments table with columns tracking_id, description (used for prefix searches), status, and last_location. If you create an SP-GiST index on description for prefix searches, but status and last_location are updated frequently (such as every time a package moves), every update triggers an index tuple update even though description hasn’t changed. Moving description to a separate table like shipment_details eliminates this penalty.
btree_gin
The btree_gin extension adds the capability to use GIN indexes with data types that are normally indexed using B-tree indexes. This becomes particularly valuable when you need to create a single GIN index that combines array or composite columns (which require GIN indexing) with scalar columns (which normally use B-tree indexes). For example, in the products_infor table, you might need to filter on both the tags array column and scalar columns like category in a single query.
PostgreSQL’s query planner can efficiently combine multiple B-tree indexes using bitmap scans when running queries with multiple filter conditions. However, when your query needs to filter on both an array column (requiring GIN) and a scalar column (using B-tree), the planner must combine these different index types using a BitmapAnd operation, which requires scanning two separate indexes and merging the results.
This is where btree_gin provides a solution. It allows you to create a unified GIN index that spans both array and scalar columns, so PostgreSQL can evaluate all conditions in a single index scan without the overhead of combining separate indexes.
The following example shows how PostgreSQL handles separate B-tree and GIN indexes using a BitmapAnd to combine their results, and then demonstrates how a unified btree_gin index eliminates this merge step.
Create separate indexes (the traditional approach):
Next, run this query that filters on both array and scalar columns.
Notice that the query plan uses the B-tree and GIN indexes (idx_category_btree and idx_tags_gin) with a BitmapAnd operation. The tags filter appears as a post-scan Filter rather than an Index Cond, meaning the GIN index on tags isn’t being combined with the B-tree indexes.
The btree_gin solution
With the btree_gin extension, you can create a unified GIN index that includes both array and scalar columns. PostgreSQL can then use a single index scan for all filter conditions.
Create a unified GIN index using btree_gin
Run the same query again
The btree_gin advantage
With the unified GIN index, notice the difference in the execution plan:
- PostgreSQL uses a single Bitmap Index Scan on
idx_unified_gin. - All filter conditions are evaluated within the index: the array condition (
tags @> '{wireless, premium}') and the equality condition (category = 'electronics'). - No separate bitmap operations are needed to combine different index types.
The value of btree_gin is in unified indexing capabilities when you need to filter on both array and scalar columns together, particularly for queries with higher selectivity or when avoiding the complexity of managing separate index types.
When to use btree_gin
The btree_gin extension is most valuable when:
- You have queries that filter on both array or composite columns and scalar columns together.
- The array column requires GIN-specific operators like
@>(contains),&&(overlaps), or<@(contained by). - You want to give the optimizer the option to use a single unified index instead of combining separate GIN and B-tree indexes.
Without btree_gin, the PostgreSQL optimizer is limited to either:
- Using the GIN index on the array column and applying scalar filters as a post-scan operation.
- Using the B-tree indexes on scalar columns and applying the array filter as a post-scan operation.
- Combining both with a
BitmapAndoperation, which requires scanning two separate indexes and merging the resulting bitmaps.
When you create a btree_gin index that spans both array and scalar columns, the optimizer can use a single index scan to evaluate all filter conditions together.
Important performance considerations
Although btree_gin enables powerful multi-column indexing, it comes with significant trade-offs related to GIN index maintenance:
GIN update overhead: GIN indexes are substantially slower to update compared to B-tree indexes. Every INSERT, UPDATE, or DELETE operation that modifies indexed columns requires maintaining the GIN index. This involves either appending to a pending list (fastupdate by default) or directly updating the posting tree structure. Both are more complex and resource-intensive than B-tree updates.
Pending list flush overhead: On tables with frequent writes, GIN indexes accumulate updates in a pending list (controlled by gin_pending_list_limit). When this list is flushed to the main index (triggered by the limit being reached or during VACUUM) it can cause buffer contention and temporary performance degradation. Tuning gin_pending_list_limit controls the batch size but doesn’t eliminate the underlying cost.
Alternative for high-update tables: If your table experiences high update volumes, consider using btree_gist instead (covered in the next section). GiST indexes generally handle updates more efficiently than GIN indexes, though they may have different query performance characteristics.
btree_gist
The btree_gist extension adds the capability to use GiST indexes with data types that are normally indexed using B-tree indexes. This is particularly valuable in two scenarios: when you need to create a single GiST index that combines range-type columns (which require GiST) with scalar columns (which normally use B-tree indexes), or when you need to implement exclusion constraints on scalar columns (which require GiST operator classes that only btree_gist provides).
PostgreSQL’s query planner can efficiently combine multiple B-tree indexes using bitmap scans when running queries with multiple filter conditions. However, when your query filters on both range-type columns (requiring GiST for overlap operators like &&) and scalar columns (using B-tree), the planner must scan each index separately and combine the results using a BitmapAnd operation. This adds overhead compared to a single multicolumn index that handles both conditions in one lookup.
Example
Let’s first see how PostgreSQL handles separate indexes that can’t be combined effectively when dealing with range overlap queries.
First, create separate indexes (the traditional approach). Note: We already have idx_unified_gin from the btree_gin section, so let’s drop it first.
Create separate indexes
Now run this query that filters on both scalar and range columns with overlap operators
Even with separate GiST indexes on price_range and availability_period, the planner chooses to use only the B-tree index on category and applies the range overlap conditions as post-scan heap filters.
Notice that the GiST indexes on price_range and availability_period are completely ignored. The planner scans 400,000 rows using the B-tree index on category, then discards 289,232 rows by applying the range conditions as heap filters. This results in significant wasted I/O—reading rows from the heap only to discard them.
The core issue is that the planner’s cost model determines it is cheaper to filter in the heap than to perform a BitmapAnd across three separate indexes of different types (B-tree plus GiST). This leaves the range overlap conditions unaided by any index.
The btree_gist solution
The btree_gist extension solves this problem by allowing you to create a unified GiST index that includes both range-type and scalar columns. PostgreSQL can then evaluate all filter conditions within a single index scan operation.
Drop the separate indexes
First, enable the btree_gist extension
Create a unified GiST index using btree_gist
Run the same query again
The unified GiST index (793 ms) was significantly faster than the separate-index approach (4,237 ms).
Because of query selectivity (when queries return a significant percentage of table rows), you might notice that the unified GiST index performs slower than a sequential scan in some cases. The value of btree_gist isn’t solely in raw query speed. It’s in the capabilities that weren’t possible before, particularly exclusion constraints and unified indexing across different data types.
Exclusion constraints
btree_gist enables exclusion constraints that span both scalar and range-type columns. Exclusion constraints prevent overlapping values across multiple columns, enforcing complex business rules at the database level.
For example, consider the rule: “No two bookings for the same room can have overlapping time ranges.” Enforcing this requires both the room_id (scalar) and the time range (range type) in the same index. B-tree can’t perform range overlap checks, and GiST can’t natively handle scalar equality. btree_gist bridges that gap, making such constraints possible.
In the following example, we apply this concept to product pricing—making sure that a product can’t have overlapping price ranges during overlapping time periods.
Create a new table to demonstrate exclusion constraints
Try to insert valid non-overlapping data (should succeed)
Verify the data was inserted
Now try to insert overlapping data (this should fail with constraint violation)
The exclusion constraint successfully prevented the insertion of overlapping data. Here’s what happened:
Successful inserts:
- Product 1 (Laptop): Price range [800,1200) during Jan-Jun 2025.
- Product 1 (Laptop): Price range [750,1100) during Jul-Dec 2025.
- Product 2 (Monitor): Price range [200,400) during Jan-Dec 2025.
These inserts succeeded because:
- The two Laptop entries have non-overlapping time periods (June 30 ends, July 1 begins).
- The Monitor entry is for a different
product_id.
Failed insert:
This insert failed because:
- Same
product_id(1). - Overlapping price ranges: [900,1300) overlaps with [800,1200).
- Overlapping time periods: May-Aug 2025 overlaps with Jan-Jun 2025.
The exclusion constraint enforces the business rule: “A product cannot have overlapping price ranges during overlapping time periods.” This helps prevent data integrity violations at the database level, reducing the need for application-level validation.
Performance considerations: GiST compared to GIN
As mentioned in the btree_gin section, GIN indexes have significant update overhead and can cause buffer contention issues during VACUUM operations on high-update tables. The btree_gist extension provides an important alternative.
When to choose btree_gist over btree_gin:
- High-update workloads: GiST indexes handle updates more efficiently than GIN indexes.
- Exclusion constraints: This is a unique capability only available with GiST indexes.
- Range queries: GiST is optimized for range overlap operations.
When to choose btree_gin over btree_gist:
- Array containment queries: GIN is more efficient for array operators like
@>(contains). - Full-text search: GIN is the preferred index type for text search operations.
- Read-heavy workloads: GIN can provide better query performance when updates are infrequent.
Considerations when using btree_gist
Although btree_gist enables powerful capabilities, there are important trade-offs to consider. The extension is primarily valuable for exclusion constraints and unified indexing across scalar and range-type columns, rather than for raw query performance improvements.
Query performance with btree_gist indexes depends heavily on query selectivity. When queries return a significant percentage of table rows, PostgreSQL may prefer sequential scans over index scans, regardless of the index type. This is expected behavior. The query planner chooses the most efficient execution strategy based on cost estimates.
From a maintenance perspective, although GiST indexes handle updates more efficiently than GIN indexes, they still require more overhead than B-tree indexes for write operations. The index structure must be updated for every INSERT, UPDATE, or DELETE operation affecting indexed columns.
Finally, btree_gist has operator limitations. The extension supports equality (=) and comparison operators (<, <=, >, >=) for scalar types, but doesn’t support all operators available with native B-tree indexes. For example, pattern matching operators like LIKE or ILIKE are not supported within GiST indexes.
Before implementing these indexes in production, test with your actual workload, not just individual queries that benefit from the index. An index that improves one query may degrade overall performance because of write overhead across your entire workload.
Consider using Amazon Aurora Fast Database Cloning to create test environments where you can evaluate the impact of different indexing strategies on overall performance without affecting production workloads.
Conclusion
In this post, we discussed specialized indexing techniques supported in Aurora PostgreSQL and Amazon RDS for PostgreSQL through extensions. SP-GiST indexes enable efficient spatial and partitioned search operations, btree_gin allows unified GIN indexing across array and scalar columns for read-heavy workloads, and btree_gist enables exclusion constraints and unified GiST indexing for high-update workloads with range overlap operations.
These extension-based indexes complement PostgreSQL’s native indexing capabilities by addressing specific use cases that standard B-tree, GIN, and GiST indexes don’t handle efficiently. When you implement these indexes, analyze the performance impact on your overall workload and test thoroughly before production deployment. You can use Aurora Fast Clone to create test environments for evaluating index strategies without affecting production systems.
In Part 4 of this series, we discuss specialized text search indexes supported through database extensions like Bloom, pg_trgm, and pg_bigm. To learn more, see the Amazon Aurora PostgreSQL documentation and the Amazon RDS for PostgreSQL documentation.