AWS Database Blog

Pagination patterns in Amazon Aurora DSQL

Pagination patterns in Amazon Aurora DSQL control how fast your pages load, how much you spend, and how reliably your transactions complete. Without the right pattern, your application slows down as you page deeper into results, your serverless bill grows with each wasted query, and your transactions fail more often under concurrency pressure.

In this post, you learn three pagination techniques for Aurora DSQL: OFFSET/LIMIT, cursor-based (keyset), and temporal. You implement keyset pagination in SQL and Python, build it into an API layer, optimize with composite indexes, handle batch processing within the 3,000-row transaction limit, and avoid five common anti-patterns. By the end, you can choose the right pagination method for your workload and implement it with confidence.

Aurora DSQL is a serverless, distributed SQL database designed for resilience at every scale. It can span multiple Availability Zones within an AWS Region, or span multiple Regions for workloads that require the highest levels of availability. Aurora DSQL meters all request-based activity (query processing, reads, and writes) using a single normalized unit called a Distributed Processing Unit (DPU). Inefficient pagination means more DPUs consumed and a higher bill.

Solution overview

In Aurora DSQL, when a query scans a large result set only to discard most rows, you consume DPUs that produce no useful results. Pagination limits each request to the data you actually need, reducing both latency and cost. The following diagram shows how different pagination patterns work in Aurora DSQL.

Aurora DSQL characteristics that influence pagination design

The following Aurora DSQL characteristics affect how you design pagination:

  • Optimistic concurrency control (OCC) – With the lock-free architecture of Aurora DSQL, your transactions don’t block each other during execution. The system evaluates conflicts at commit time, so long-running transactions increase the window for conflicts.
  • Transaction row limit – A single transaction can modify up to 3,000 rows (INSERT, UPDATE, DELETE). If a single transaction exceeds 3,000 modified rows, it fails, which causes batch processing to fail.
  • Asynchronous DDL – You can create indexes without blocking reads or writes. Data definition language (DDL) operations like CREATE INDEX ASYNC run in the background, so your table remains fully available during index builds.
  • Serverless billing – You pay per request, and poorly optimized queries increase your costs.
  • Repeatable Read isolation – The transaction isolation level uses PostgreSQL Repeatable Read.
  • No TRUNCATE support – Use DELETE FROM table_name (in batches for large tables) or DROP TABLE followed by CREATE TABLE instead.

Prerequisites

AWS account requirements

  • An active AWS account with permissions to create and manage Aurora resources.
  • An Aurora DSQL cluster already provisioned and configured.

Required permissions and IAM roles

  • AWS Identity and Access Management (IAM) permissions for Aurora DSQL cluster access.
  • A database user with appropriate SELECT privileges.

Tools and software

  • A SQL client compatible with Aurora (such as DBeaver or pgAdmin).
  • A programming language or framework of your choice. Examples in this post use Python.

Solution walkthrough

The following steps walk you through each pagination approach, from basic OFFSET/LIMIT to cursor-based patterns, with code examples and optimization strategies.

Step 1: Understanding pagination options in Aurora DSQL

In a traditional single-node PostgreSQL database, pagination works well for small datasets. You add LIMIT and OFFSET to a query, and it performs adequately for moderate data volumes. But in a distributed database like Aurora DSQL, the performance implications are more significant. Choosing the right pagination pattern affects your performance, your costs, and your application’s reliability.

The three main approaches are:

  • Traditional OFFSET/LIMIT: straightforward, but performance decreases with deeper pages.
  • Keyset pagination, also called cursor-based pagination: recommended for large datasets and APIs.
  • Temporal-based pagination: suited for time-series data (covered later in this post using timestamp-based keyset cursors).

Pattern 1: LIMIT/OFFSET pagination

OFFSET/LIMIT is a common pagination pattern in PostgreSQL, and Aurora DSQL fully supports it.

The following queries show how to paginate through an orders table, retrieving 20 records per page:

-- Page 1
SELECT id, name, created_at
FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- Page 2
SELECT id, name, created_at
FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET 20;

-- Page N
SELECT id, name, created_at
FROM orders
ORDER BY created_at DESC
LIMIT 20 OFFSET (N-1) * 20;

When to use it

  • Small to medium datasets (tens of thousands of rows).
  • Admin dashboards where you need to jump to arbitrary page numbers.
  • Prototyping and early development where simplicity matters most.

Trade-offs in Aurora DSQL

OFFSET pagination has a known trade-off: the database must scan and discard every row up to the offset before returning results. On page 1,000 with 20 rows per page, the database processes 20,000 rows to return 20. In the distributed architecture of Aurora DSQL, this cost grows in a serverless, pay-per-use model.

Recommendation: Use LIMIT/OFFSET when your total result set is small or when users rarely paginate beyond the first few pages. Set a maximum page depth to prevent runaway costs.

Step 2: Implementing cursor-based pagination

With keyset pagination, you use the values from the last row of the current page to fetch the next page. You tell the database “give me rows after this specific point” using the last row’s values as a cursor.

Schema design considerations

For keyset pagination to work correctly, your sort order must be deterministic. If created_at has duplicate values, you need a tiebreaker, typically the primary key. Improve write distribution and reduce OCC contention by using random primary keys (such as UUIDs) to spread writes across the cluster.

-- First page
SELECT id, name, created_at
FROM orders
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Next page (using last row's values from previous result)
SELECT id, name, created_at
FROM orders
WHERE (created_at, id) < ('2026-03-01 10:30:00', 'ord_abc123')
ORDER BY created_at DESC, id DESC
LIMIT 20;

Why this is the recommended pattern for Aurora DSQL

  • Your query performs the same from page 1 to page 10,000, because the database uses an index seek to jump to the starting point.
  • Indexes provide consistent query performance through fast range scans on ordered columns.
  • Lower conflict potential. Keyset queries are fast and touch fewer rows, reducing the window for OCC conflicts.
  • Lower cost. Because the database processes and reads less data per query, your per-request DPU cost drops.

Creating supporting indexes

Aurora DSQL runs index creation asynchronously with CREATE INDEX ASYNC, so your table remains fully available for reads and writes during index builds. You can monitor index creation status using the sys.jobs system view.

-- Create an index to support the pagination query
-- Note that ASC/DESC not supported in Aurora DSQL indexes
-- Including projected columns with INCLUDE enables index-only scans
CREATE INDEX ASYNC idx_orders_created_id
ON orders (created_at, id) INCLUDE (name);

Step 3: Building pagination into your API layer

Your application encodes the cursor as an opaque token for use in RESTful API responses.

The following code shows how to encode and decode cursors, and implement a paginated API endpoint:

import base64
import json

def encode_cursor(created_at, order_id):
    payload = json.dumps({"c": str(created_at), "i": order_id})
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_cursor(cursor):
    payload = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return payload["c"], payload["i"]

def get_paginated_orders(cursor=None, page_size=100):
    last_created_at, last_id = decode_cursor(cursor) if cursor else (None, None)
    query = "SELECT * FROM orders WHERE (created_at, id) < (%s, %s) ORDER BY created_at DESC, id DESC LIMIT %s"
    results = execute_query(query, (last_created_at, last_id, page_size + 1))
    has_next_page = len(results) > page_size
    items = results[:page_size] if has_next_page else results
    next_cursor = None
    if has_next_page:
        next_cursor = encode_cursor(items[-1]['created_at'], items[-1]['id'])
    return {
        'items': items,
        'next_cursor': next_cursor
    }

This code fetches page_size + 1 rows to determine if a next page exists without running a separate COUNT(*) query. execute_query() and execute() represent your database connection method (for example, using psycopg2 or your preferred PostgreSQL driver).

Error handling and edge cases

  • Handle expired or invalid cursors gracefully by returning the first page.
  • Validate cursor tokens before decoding to prevent injection attacks.
  • Return consistent error responses when page size limits are exceeded.

Step 4: Optimizing pagination performance

Index optimization

The Aurora DSQL PostgreSQL-compatible query planner can use composite indexes to serve filtered pagination queries, scanning only the relevant subset of data.

In the following example, we create an index and demonstrate how it optimizes filtered pagination queries:

-- Index for supporting filtered pagination queries
CREATE INDEX ASYNC idx_orders_status_date
ON orders (status, created_at, id) INCLUDE (name);

-- Paginated query with filter
SELECT id, name, created_at
FROM orders
WHERE status = 'active'
AND (created_at, id) < ('2026-03-01', 'ord_xyz')
ORDER BY created_at DESC, id DESC
LIMIT 20;

Note: You can use CREATE INDEX ASYNC in Aurora DSQL with standard index options including composite columns. Sort direction (ASC/DESC) in index definitions isn’t currently supported. NULLS FIRST/LAST works.

Additional optimization strategies

  • Use application-level caching (such as Amazon ElastiCache) to reduce repeated database hits for stable datasets.
  • Configure connection pooling to manage concurrent pagination requests with lower overhead.
  • Use BEGIN READ ONLY transactions for read-only pagination to signal that no write conflicts are possible.

Step 5: Handling complex sorting and filtering

Multi-column cursor implementation

For tables with UUID primary keys, pair a sortable column (like a timestamp) with the UUID as a tiebreaker:

CREATE TABLE events (
    id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    event_type TEXT NOT NULL,
    occurred_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    payload TEXT
);

CREATE INDEX ASYNC idx_events_occurred_id
ON events (occurred_at, id) INCLUDE (event_type, payload);

-- Paginate using the composite cursor
SELECT id, event_type, occurred_at, payload
FROM events
WHERE (occurred_at, id) < ('2026-03-09T10:00:00Z', 'a1b2c3d4-...')
ORDER BY occurred_at DESC, id DESC
LIMIT 25;

Maintaining sort stability with non-unique columns

Include a unique column (such as the primary key) as a tiebreaker in your ORDER BY clause to guarantee stable pagination across pages.

Complex multi-column cursor example

SELECT * FROM products
WHERE (price > :last_price) OR (price = :last_price AND product_id > :last_id)
ORDER BY price ASC, product_id ASC
LIMIT 50;

Step 6: Batch processing with keyset iteration

For background jobs that need to process every row in a table (data migrations, exports, or analytics aggregation), keyset pagination is essential. OFFSET-based iteration over millions of rows would be prohibitively slow and expensive.

def batch_process_orders(batch_size=500):
    last_id = None
    while True:
        if last_id:
            query = """
                SELECT id, name, total FROM orders
                WHERE id > %s ORDER BY id LIMIT %s
            """
            rows = execute(query, [last_id, batch_size])
        else:
            query = """
                SELECT id, name, total FROM orders
                ORDER BY id LIMIT %s
            """
            rows = execute(query, [batch_size])
        if not rows:
            break
        process_batch(rows)
        last_id = rows[-1]["id"]

Aurora DSQL considerations for batch processing

  • Respect the 3,000-row transaction limit. Keep batch_size well under 3,000 for write operations.
  • Keep transactions short. Process each batch in its own transaction rather than wrapping the entire iteration in one.
  • Implement retry logic. If a batch encounters an OCC conflict (a serialization error), retry that specific batch. Design your batch operations to be idempotent so retries are safe.
  • Use READ ONLY transactions for read-only batch operations. This reduces locking overhead and allows Aurora DSQL to optimize query execution for better performance.
BEGIN READ ONLY;
SELECT id, name, total FROM orders
WHERE id > 'last_processed_id'
ORDER BY id
LIMIT 500;
COMMIT;

Testing and validation

  • Validate consistency across page boundaries by verifying no rows are skipped or duplicated between pages.
  • Compare pagination methods with large datasets using EXPLAIN ANALYZE and Amazon CloudWatch metrics to measure query execution times.
  • Load test both OFFSET and keyset approaches to confirm performance characteristics at scale.

Anti-patterns to avoid

1. COUNT(*) for total pages

-- Avoid this on large tables
SELECT COUNT(*) FROM orders WHERE status = 'active';

Counting every matching row requires a full scan. In a distributed database, this is expensive. Instead, use the “fetch N+1” technique to determine if more pages exist, and avoid displaying exact total counts.

2. Deep OFFSET pagination

-- This gets progressively slower and more expensive
SELECT * FROM orders ORDER BY created_at LIMIT 20 OFFSET 500000;

If users can jump to page 25,000 in your application, switch to keyset pagination or limit the maximum accessible page depth.

3. Long-lived pagination transactions

-- Do not hold a transaction open across multiple page fetches
BEGIN;
SELECT * FROM orders LIMIT 20 OFFSET 0;
-- ... user thinks for 30 seconds ...
SELECT * FROM orders LIMIT 20 OFFSET 20;
COMMIT;

Each page fetch should be an independent, short-lived transaction. Aurora DSQL’s OCC model evaluates conflicts at commit time. Long-running transactions increase the chance of conflicts and retries. Database connections also time out after 1 hour.

4. Paginating without a deterministic sort order

-- Non-deterministic: rows with the same created_at may shift between pages
SELECT * FROM orders ORDER BY created_at LIMIT 20 OFFSET 40;

Include a unique column (such as the primary key) as a tiebreaker in your ORDER BY clause.

5. Exceeding the transaction row limit in batch writes

-- This will fail if the batch exceeds 3,000 rows
BEGIN;
INSERT INTO orders_archive SELECT * FROM orders WHERE created_at < '2025-01-01';
DELETE FROM orders WHERE created_at < '2025-01-01';
COMMIT;

Aurora DSQL limits each transaction to modifying a maximum of 3,000 rows. Use keyset-based batch iteration with batch sizes well under 3,000, committing each batch in its own transaction.

Clean up

Delete the following resources to prevent incurring charges.

Resources to delete

  • Test tables and indexes created for pagination examples.
  • Test data generated for demonstrations.

Commands to run

DROP TABLE IF EXISTS events;
DROP INDEX IF EXISTS idx_orders_created_id;
DROP INDEX IF EXISTS idx_orders_status_date;
DROP INDEX IF EXISTS idx_events_occurred_id;

Conclusion

In this post, you learned how pagination in Aurora DSQL follows the same principles as distributed databases in general: minimize the work per query, keep transactions short, and use indexes effectively. The keyset pagination pattern works well for most use cases. It delivers consistent performance at every pagination depth, aligns with the Aurora DSQL OCC model, and keeps per-request DPU costs predictable as your data grows.

To get started, audit your existing OFFSET/LIMIT queries, identify your highest-traffic endpoints as candidates for keyset pagination, and monitor query latency and OCC conflict rates in Amazon CloudWatch to measure improvement. For more details, see the Amazon Aurora DSQL User Guide, Amazon Aurora DSQL pricing, and Optimistic concurrency control in Aurora DSQL.


About the authors

Sandhya Khanderia

Sandhya Khanderia

Sandhya is a Sr. Technical Account Manager and Data Analytics specialist and Amazon Quick SME. She works with AWS Enterprise customers and provides ongoing support and technical guidance to help plan and build solutions using best practices and proactively keep customers’ AWS environments operationally healthy.

Anatoliy Kochetov

Anatoliy Kochetov

Anatoliy is a Technical Account Manager and cloud infrastructure specialist with a background in DevOps and PostgreSQL. He is an AWS Certified Solutions Architect and works with AWS Enterprise customers, providing ongoing support and technical guidance to help plan and build solutions using best practices.

Dhvani Shah

Dhvani Shah

Dhvani is a Technical Account Manager supporting enterprise customers with strategic guidance on cloud resiliency, security, and generative AI adoption. She helps organizations architect scalable, secure solutions that accelerate digital transformation. Outside work, she enjoys traveling, playing badminton, and spending time with family.