Working with identity columns and sequences in Aurora DSQL

Amazon Aurora DSQL now supports PostgreSQL-compatible identity columns and sequence objects, so developers can generate unique integer identifiers with configurable performance characteristics optimized for distributed workloads.

In distributed database environments, generating unique, sequential identifiers is a fundamental challenge: coordinating across multiple nodes creates performance bottlenecks, especially under high concurrency workloads. In this post, we show you how to create and manage identity columns for auto-incrementing IDs, selecting between identity columns and standalone sequence objects, and improving cache settings while choosing between UUIDs and integer sequences for your workload requirements.

Solution overview

Aurora DSQL implements identity columns and sequence objects as distributed coordination primitives that generate unique integer values across the cluster. Identity columns provide streamlined syntax for the common auto-incrementing ID use case, while standalone sequences offer advanced control for complex scenarios.

Prerequisites

You must have the following prerequisites to follow along with this post.

AWS account requirements:

Active AWS account with Aurora DSQL cluster provisioned
Aurora DSQL cluster running
Aurora DSQL requires all connections to use Transport Layer Security (TLS) encryption. To establish secure connections, your client system must trust the Amazon Root Certificate Authority (Amazon Root CA 1). This certificate is pre-installed on many operating systems.

Required permissions:

CREATE permission on schema for creating sequences and tables
USAGE or UPDATE permission on sequences for generating values
ALTER permission for modifying existing sequences or identity columns

Tools and software needed:

PostgreSQL-compatible client (psql, DBeaver, or application database driver)
SQL client configured to connect to Aurora DSQL endpoint

Create a table with an identity column

Selecting an appropriate cache value is an important part of using sequences and identity columns effectively. The cache setting determines how identifier allocation behaves under load, influencing both system throughput and how closely values reflect allocation order. Higher cache values allow for higher throughput by serving sequence numbers from locally pre-allocated ranges, but this efficiency comes with a trade-off: any reserved values that go unused are lost, making gaps and out-of-order allocation more apparent.

To maximize achievable throughput under concurrency, use a larger cache size, for example:

CREATE TABLE orders (
    order_id BIGINT GENERATED ALWAYS AS IDENTITY
        (CACHE 70000),
    customer_id BIGINT NOT NULL,
    order_date DATE DEFAULT CURRENT_DATE,
    total_amount DECIMAL(10,2)
);

To minimize gaps in the sequence ordering, use a cache size of 1, for example:

CREATE TABLE invoice_numbers (
    invoice_id BIGINT GENERATED ALWAYS AS IDENTITY
        (CACHE 1),
    invoice_date DATE NOT NULL,
    customer_name TEXT
);

Key configuration decisions:

GENERATED ALWAYS: System controls all values; users can’t override
GENERATED BY DEFAULT: Users can provide explicit value when needed
CACHE 70000: Optimized for high-volume concurrent inserts
CACHE 1: Closer to sequential ordering, lower throughput

Inserting data using identity columns

In this section, we show some examples of how identity columns work during INSERT statements.

Automatic value generation example

INSERT INTO orders (customer_id, total_amount)
VALUES (12345, 299.99);

Multiple inserts example

INSERT INTO orders (customer_id, total_amount)
VALUES
(12346, 149.50),
(12347, 599.00),
(12348, 89.99);

Using the DEFAULT keyword explicitly example

Alternatively, the keyword DEFAULT can be specified in place of a value to explicitly request the sequence-generated value.

INSERT INTO orders
    (order_id, customer_id, total_amount)
    VALUES (DEFAULT, 12349, 449.99);

Overriding GENERATED ALWAYS (when necessary) example

Only works with OVERRIDING SYSTEM VALUE clause.

INSERT INTO orders
    (order_id, customer_id, total_amount)
    OVERRIDING SYSTEM VALUE
    VALUES (999999, 12350, 1299.99);

Create and use standalone SEQUENCE objects

While identity columns provide a convenient built-in solution for auto-incrementing values, there are situations where creating standalone sequences offers greater flexibility and control. Standalone sequences are particularly useful in three scenarios:

When you need to share a single sequence across multiple tables to maintain a unified numbering system
When you require ID generation outside of INSERT operations (such as pre-generating identifiers for batch processing or external systems)
When you have advanced sequence manipulation requirements that go beyond auto-increment functionality

Create a sequence with a large cache for high concurrency example

CREATE TABLE customers (
    customer_id BIGINT GENERATED ALWAYS AS IDENTITY
        (CACHE 65536) PRIMARY KEY,
    customer_name VARCHAR(100) NOT NULL,
    email VARCHAR(255) UNIQUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE SEQUENCE customer_seq
START WITH 1000
INCREMENT BY 1
CACHE 100000;

Use Sequence in INSERT example

INSERT INTO customers
    (customer_id, customer_name, email)
    OVERRIDING SYSTEM VALUE
    VALUES (
        nextval('customer_seq'),
        'Jane Smith',
        'jane@example.com'
    );

Generate value in application logic example

SELECT nextval('customer_seq');

Query sequence and identity column state

DSQL provides several system views and functions to monitor sequence behavior and inspect their current state. You can check sequence parameters, view current values, and examine identity column configurations to understand how your auto-increment mechanisms are functioning.

View current sequence values, as an example

Check sequence parameters and current state.

SELECT * FROM pg_sequences WHERE sequencename = 'customer_seq';

schemaname | sequencename | sequenceowner | data_type | start_value | min_value | max_value | increment_by | cycle | cache_size | last_value 
------------+--------------+---------------+-----------+-------------+-----------+---------------------+--------------+-------+------------+------------
public | customer_seq | admin | bigint | 1 | 1 | 9223372036854775807 | 10 | f | 200000 | 2009990

Get current value in this session

First get the nextval and then get currval.

SELECT nextval('customer_seq');

 nextval
---------
 2010000
(1 row)


SELECT currval('customer_seq');

 currval
---------
 2010000
(1 row)

View all sequences.

SELECT schemaname, sequencename,
       last_value, cache_size
FROM pg_sequences;

 schemaname |            sequencename            | last_value | cache_size
------------+------------------------------------+------------+------------
 public     | customers_customer_id_seq          |            |      65536
 public     | orders_order_id_seq                |     419999 |     150000
 public     | customer_seq                       |    4009990 |     200000
 public     | event_log_event_id_seq             |            |     100000
 public     | customer_accounts_account_number_seq |           |          1
 public     | order_display_seq                  |            |      70000
 public     | customer_new_seq                   |            |     100000

Inspect identity column properties

View identity column configuration

SELECT table_name, column_name,
       is_identity, identity_generation
FROM information_schema.columns
WHERE is_identity = 'YES';

Modify identity columns and sequences

After creating sequences or identity columns, you might need to adjust their behavior to meet changing requirements. With Aurora DSQL, you can modify various sequence properties including generation behavior, starting values, cache sizes, and increment steps.

You can use the ALTER TABLE/ALTER COLUMN commands to modify identity columns or ALTER SEQUENCE for standalone sequences.

Change identity column behavior example:

Switch from ALWAYS to BY DEFAULT to allow manual ID insertion during data migration or bulk imports.

ALTER TABLE orders
ALTER COLUMN order_id SET GENERATED BY DEFAULT;

Restart identity sequence at a specific value example

Reset the sequence after data cleanup or to reserve ID ranges for different purposes (for example, starting production IDs at 50000 after test data).

ALTER TABLE orders
ALTER COLUMN order_id RESTART WITH 50000;

Modify cache size of underlying sequence example

Increase cache size to improve INSERT performance in high-throughput applications by reducing sequence-related I/O operations.

ALTER TABLE orders ALTER COLUMN order_id SET CACHE 150000;

Modify standalone sequences examples

Change cache size to optimize performance for frequently accessed sequences in write-heavy workloads.

Change cache size

ALTER SEQUENCE customer_seq CACHE 200000;

Restart sequence

Restart sequence after data archival or to align numbering with business requirements (for example, starting the new fiscal year at 10000).

ALTER SEQUENCE customer_seq RESTART WITH 10000;

Change increment

Change increment to create gaps between IDs for organizational purposes (for example, reserving ranges for different regions or departments).

ALTER SEQUENCE customer_seq INCREMENT BY 10;

Validation and testing

After configuring identity columns and sequences, it’s important to verify that they’re generating values as expected. You can test the automatic ID generation by inserting records and using the RETURNING clause to see the generated values immediately.

Verify identity column behavior

Insert test records and observe generated IDs to confirm the sequence is working correctly and starting from the expected value.

INSERT INTO orders (customer_id, total_amount)
VALUES (99999, 100.00)
RETURNING order_id;

Verify sequential generation within session to ensure IDs increment properly with each subsequent insert.

INSERT INTO orders (customer_id, total_amount)
VALUES (99999, 200.00)
RETURNING order_id;

Test concurrent behavior (requires multiple sessions) examples

When multiple database sessions insert records simultaneously, sequence caching can affect the order in which IDs are generated. This example simulates concurrent inserts from two different sessions to demonstrate how cache settings impact ID generation patterns. Understanding this behavior is crucial for applications that require predictable ID ordering or need to optimize for high-concurrency workloads.

Session 1:

BEGIN;

INSERT INTO orders (customer_id, total_amount)
VALUES (88888, 150.00)
RETURNING order_id;

Take note of the returned ID.

order_id
----------
   420000

Session 2 (simultaneously):

BEGIN;

INSERT INTO orders (customer_id, total_amount)
VALUES (77777, 250.00)
RETURNING order_id;

Compare with Session 1 ID.

order_id
----------
570000

Expected behavior (with CACHE >= 65536):

Sequence values are unique but might not be generated in strictly increasing order across sessions, and gaps can occur when cached values go unconsumed. Consistent with PostgreSQL semantics for cached sequences under concurrent use, where both systems ensure distinct values without providing sequential ordering.

Expected behavior (with CACHE =1):

Sequence values follow allocation order more closely over time.

Consistent with PostgreSQL’s non-cached sequence behavior. Each session fetches one value at a time, so cross-session ordering is more predictable. To preserve strictly sequential ID vending you must set CACHE = 1 and concurrent requests must be avoided.

Implement workload-specific patterns

The recommended cache size for sequences depends on your specific application requirements. High-throughput systems benefit from larger caches to reduce database round-trips, while applications requiring strict sequential ordering need minimal or no caching. This section demonstrates three common patterns with different cache strategies.

High-volume event logging (CACHE >= 65536) example

Event logging systems prioritize write performance over sequential ordering because events are typically queried by timestamp rather than ID. A large cache size minimizes sequence-related overhead during burst traffic periods.

CREATE TABLE event_log (
    event_id BIGINT GENERATED ALWAYS AS IDENTITY
        (CACHE 100000),
    event_type TEXT NOT NULL,
    event_timestamp TIMESTAMPTZ
        DEFAULT CURRENT_TIMESTAMP,
    event_data TEXT
);

Account number generation (CACHE = 1) example

Financial or customer-facing account numbers often require predictable, gap-free sequential numbering for auditing, compliance, or customer service purposes. Setting CACHE = 1 allocates IDs in order, though at the cost of performance.

CREATE TABLE customer_accounts (
    account_number BIGINT GENERATED ALWAYS AS IDENTITY
        (CACHE 1),
    account_type TEXT NOT NULL,
    created_date DATE DEFAULT CURRENT_DATE
);

Hybrid approach (UUID primary key + sequence display ID) example:

This pattern combines the benefits of both approaches: UUIDs provide globally unique, distributed, system-friendly primary keys, while a cached sequence generates user-friendly display numbers for customer-facing interfaces like order tracking.

CREATE SEQUENCE order_display_seq CACHE 70000;

CREATE TABLE orders_hybrid (
    order_uuid UUID PRIMARY KEY
        DEFAULT gen_random_uuid(),
    order_number BIGINT
        DEFAULT nextval('order_display_seq'),
    customer_id BIGINT NOT NULL,
    order_date DATE DEFAULT CURRENT_DATE
);

Best practices

Choosing CACHE size:

Use CACHE >= 65536 when:
- Identifiers generated at high frequency (thousands per second)
- Many concurrent sessions performing inserts
- Workload tolerates gaps in identifier values. The database always vends IDs in sequential, ascending order—apparent out-of-order allocation is a client-side observation, not a database behavior (this is also true of standard PostgreSQL, where client submission order does not determine ID assignment)
- Examples: IoT telemetry, event logs, job run IDs
Use CACHE = 1 when:
- Allocation rates are moderate (hundreds per second or less)
- Closer sequential ordering is important
- Minimizing gaps is a priority
- Examples: invoice numbers, account IDs, reference numbers

Identity columns compared to standalone sequences:

Prefer IDENTITY columns for: Single-table auto-incrementing IDs (most common use case)
Use standalone sequences for: Sharing sequences across tables, generating IDs outside INSERT operations

UUIDs compared to integer sequences:

Default to UUIDs for: Primary keys in high-scale distributed workloads
Use integer sequences for: Human-readable display IDs, reporting, external integrations
Consider a hybrid approach: UUID primary key + sequence-based display number

Managing gaps:

Gaps are inevitable due to transaction rollbacks, cached value loss, and concurrent allocation
Applications should be designed to handle non-contiguous ID values. The database always vends IDs in sequential, ascending order so gaps arise from external factors (rollbacks, cache loss, concurrency) rather than out-of-order vending
Never rely on sequences for gapless numbering

Performance improvement:

Larger cache sizes reduce coordination overhead but increase visible gaps
Use explicit transactions when ordering matters within a single session
Monitor sequence allocation patterns under production load

Troubleshooting examples

Issue: “Permission denied for sequence”

Grant necessary permissions.

GRANT USAGE ON SEQUENCE customer_seq TO application_user;
GRANT UPDATE ON SEQUENCE customer_seq TO application_user;

Issue: “Cannot insert explicit value for identity column”

Use OVERRIDING SYSTEM VALUE for GENERATED ALWAYS columns.

INSERT INTO orders (order_id, customer_id, total_amount)
OVERRIDING SYSTEM VALUE
VALUES (12345, 99999, 500.00);

Or change to GENERATED BY DEFAULT.

ALTER TABLE orders
ALTER COLUMN order_id SET GENERATED BY DEFAULT;

Issue: Large gaps in sequence values.

Expected behavior with CACHE >= 65536
Occurs when cached values aren’t fully consumed before the session ends
Consider reducing cache size if gaps are problematic for your use case

Issue: Sequence values not strictly increasing across sessions

Expected behavior with CACHE >= 65536 in distributed systems
Use CACHE = 1 if strict ordering is required
Use explicit transactions for more predictable ordering within sessions

Clean up

If you created resources to follow along with identity column tutorials or need to clean up your database schema, this guide walks you through the process of removing identity column behavior from your database objects. Whether you need to drop identity properties while preserving your data, remove sequences, or clean up entire tables, you’ll find the SQL commands and syntax you need here. We’ll also cover important cost implications and system limits to help you manage your database resources efficiently.

Remove identity column behavior:
- Drop identity property (keeps column and data)
```
ALTER TABLE orders
ALTER COLUMN order_id DROP IDENTITY;
```
Drop sequences:
- Drop standalone sequence
```
DROP SEQUENCE customer_seq;
```
- Drop sequence and dependent objects
```
DROP SEQUENCE customer_seq CASCADE;
```
Drop tables with identity columns
- Automatically drops associated sequence
```
DROP TABLE orders;
```

Cost implications:

No additional charges for sequences or identity columns
Dropping unused sequences reduces metadata overhead (5,000 sequence limit per cluster)

Conclusion

In this post, we explored how identity columns provide PostgreSQL-compatible auto-incrementing IDs improved for distributed workloads. We demonstrated that standalone sequences offer advanced control for complex scenarios, while UUIDs remain the recommended choice for primary keys, with integer sequences better suited for display IDs and human-readable identifiers. Finally, we discussed how gaps and ordering effects are inherent characteristics of distributed sequence allocation that developers should account for in their application design.

Ready to experiment with identity columns and sequences? Visit the Aurora DSQL Playground to run these examples in a live environment. You can test different cache settings, observe concurrent behavior, and explore how sequences work in real-time without setting up your own database.

AWS Database Blog