AWS Database Blog

Implement UUIDv7 in Amazon RDS for PostgreSQL using Trusted Language Extensions

Universally unique identifiers (UUIDs) are 128-bit values which are designed to be unique without the need of a central authority to manage them. This makes them ideal for the purpose of primary keys in a database, particularly for distributed systems. Historically, one of the most popular methods of generating UUIDs was UUID Version 4 (UUIDv4), which are non-sequential, randomly -generated UUIDs, due to the simplicity of adding them to applications.

The choice of a primary key can have an impact on application performance based on workload patterns. Although UUIDv4 is convenient when serving as a database primary key, its randomness could cause sub-optimal performance when data is being accessed or inserted into a database. Ideally, we would like commonly -accessed data to be clustered together in a database index such as B-tree and its variants. This increases the chance of them being present in the database memory when we query them consecutively, thereby reducing the likelihood of having to fetch data from disk, which is a more time-consuming operation. This issue, known as “temporal locality,” can be observed through a database’s “cache hit ratio“. If an index has a large, random distribution of data, it can negatively impact the cache hit ratio and cause a negative impact on overall performance.

One way to solve the issues due to the randomness of UUIDv4 is to use a primary key that groups commonly accessed data together. One method is to use a timestamp to help co-locate the data, as many applications will more frequently query data that’s created around the same time.

UUID Version 7 (UUIDv7) was introduced to improve the randomness of UUIDv4. UUIDv7 encodes a Unix timestamp with millisecond precision in the first 48 bits of the UUID, meaning that UUIDv7 is time-based and sequential. This eliminates the drawback of poor temporal locality when using UUIDv4. Insertions of data into a database index are clustered based on their sequence of creation. Similarly, retrieving data associated with a specific time also allows you to easily access data that were created close to that time.

Today, PostgreSQL provides built-in support for generating UUIDs that use the UUIDv1, UUIDv3, UUIDv4, and UUIDv5 implementations. However, as of this writing, PostgreSQL doesn’t natively support UUIDv7, but you can add UUIDv7 support today using Trusted Language Extensions for PostgreSQL (pg_tle). Additionally, should PostgreSQL support UUIDv7 in the future, this implementation will work with both older and newer versions of PostgreSQL.

Trusted Language Extensions (pg_tle) for PostgreSQL is a new open source development kit to help you build high performance extensions that run safely on PostgreSQL. It allows developers to create high performance database extensions using popular trusted languages, such as PL/Rust, JavaScript, Perl, and PL/pgSQL. In this post, we demonstrate how to create and install a Trusted Language Extension (TLE) using PL/Rust as the trusted language to generate a UUIDv7. We also take a deeper look into the underlying implementation of the extension.

Solution overview

Amazon Relational Database Service (Amazon RDS) for PostgreSQL supports Trusted Language Extensions and PL/Rust extensions. PL/Rust enables developers to build safe and performant database functions in the Rust programming language. In this example, we install an UUIDv7 TLE that contains three PL/Rust functions to cover the following use cases:

  • Generate a UUIDv7
  • Generate a UUIDv7 from a user-provided timestamp
  • Extract the timestamp from a UUIDv7 as a PostgreSQL timestamp with time zone (timstamptz) type

These three database functions are packaged into a TLE. You can start using UUIDv7 after installing the TLE in PostgreSQL.

Amazon RDS for PostgreSQL supports several other trusted programming languages, including PL/pgSQL, PL/Perl, PL/v8 (JavaScript), and PL/Tcl.

Prerequisites

To run the example in this post, you need to provision an RDS for PostgreSQL instance or Multi-AZ DB cluster running PostgreSQL 16.1 or higher, 15.2 or higher, 14.9 or higher, or 13.12 or higher. Additionally, you need to add pg_tle and plrust to the shared_preload_libraries parameter into a DB parameter group and assign the DB parameter group to your PostgreSQL database instance. You can also directly create a PostgreSQL database instance with the parameter group.

You can use the AWS Command Line Interface (AWS CLI) to create a DB parameter group and add pg_tle and plrust to the shared_preload_libraries parameter:

REGION="us-east-1"

aws rds create-db-parameter-group \
  --db-parameter-group-name pg16-plrust \
  --db-parameter-group-family postgres16 \
  --description "Parameter group that contains PL/Rust settings for PostgreSQL 16" \
  --region "${REGION}"

aws rds modify-db-parameter-group \
  --db-parameter-group-name pg16-plrust \
  --parameters "ParameterName='shared_preload_libraries',ParameterValue='pg_tle,plrust',ApplyMethod=pending-reboot" \
  --region "${REGION}"

If you modify the shared_preload_libraries parameter on an existing database instance, the changes will not take effect until the instance is rebooted. You can also modify the parameter group directly from the AWS Management Console. If you create an instance directly with your custom parameter group, you can start using pg_tle and plrust when the instance is available. For more information, see Working with DB parameter groups.

Install the UUIDv7 trusted language extension

In the Trusted Language Extensions open source Github repository, we provide a way for you to install the extension. After you register the UUIDv7 extension into pg_tle, you can check if the extension is available in your desired database with the following command:

SELECT * FROM pgtle.available_extensions();

You will see the following output:

  name   | default_version |        comment
---------+-----------------+-----------------------
 uuid_v7 |       1.0       | extension for uuid v7
(1 row)

Next, install the plrust extension in your desired database with the following command:

CREATE EXTENSION plrust;

You will see the following output:

CREATE EXTENSION

Finally, install the UUIDv7 extension in your desired database with the following command:

CREATE EXTENSION uuid_v7;

You will see the following output:

CREATE EXTENSION

Now you can start using the uuid_v7 extension.

Generate a UUIDv7

You can generate a UUIDv7 with the following command:

SELECT generate_uuid_v7();

You will see output similar to the following:

           generate_uuid_v7
--------------------------------------
 018d803a-da84-77f3-839d-24347c51137e
(1 row)

You could a see a different UUIDv7 in your environment because UUIDv7s are random, unique, and time-based identifiers by nature.

Generate a UUIDv7 with a given timestamp

You might want to generate a UUIDv7 for data that is associated with a timestamp in the past or future. You can generate a UUIDv7 with a given timestamp using the following command:

SELECT timestamptz_to_uuid_v7('2024-02-07 20:29:26.776+00');

You will see the following output:

        timestamptz_to_uuid_v7
--------------------------------------
 018d8542-d778-7c51-bdb9-0ba5e494f3ef
(1 row)

You could a see a different UUIDv7 in your environment because UUIDv7s are random, unique, and time-based identifiers by nature.

Extract timestamp from a UUIDv7

You can extract the timestamp with a given UUIDv7 with the following command:

SELECT uuid_v7_to_timestamptz('018d8542-d778-72fe-9e66-8c8878dc53b5');

You will see the following output:

   uuid_v7_to_timestamptz
----------------------------
 2024-02-07 20:29:26.776+00
(1 row)

A deeper look into the extension

Let’s take a deeper look into the implementation of the extension. As we stated earlier, the extension provides three functions to generate a UUIDv7 and extract timestamp from a given UUIDv7. We use the Rust programming language in this solution because of its emphasis on memory safety and comparable performance with the C programming language. For more information about PL/Rust, see the PL/Rust Guide.

Exploring the generate_uuid_v7 function

All UUIDs are 128-bit long. A UUIDv7 is generated by encoding the Unix timestamp with millisecond precision in the first 48 bits. The remaining bits are used to store the version and variant of the UUID, and randomly -generated bits for uniqueness. The full implementation is available in our Github repository.

Let’s break down the generate_uuid_v7 function into different parts. First, you need to declare any dependencies that you want use in the PL/Rust function. This function uses the Rust rand crate as a dependency. This crate is used to generate random bits and encode them into the UUIDv7:

[dependencies]
rand = "0.8.5"

Let’s look at the code section of the function. First, you define a Rust type alias UuidBytes to represent a vector of u8 with a fixed size of 16:

type UuidBytes = [u8; 16];

Next, you need to know the current timestamp for generating the UUIDv7. To retrieve this, you can use the clock_timestamp API provided by the pgrx framework which returns the timestamp as a Datum type. This API is equivalent to the clock_timestamp provided by PostgreSQL.

let now = pgrx::clock_timestamp();

You need to turn the Datum into a timestamp in millisecond precision. To do so, use the extract_part API provided by pgrx to determine the epoch (number of seconds since 1970-01-01 00:00:00 UTC), and multiply that by 1,000 to get the timestamp in millisecond precision as a u64. The extract_part API provided by pgrx is equivalent to the extract function provided by PostgreSQL.

let epoch_in_millis_numeric: AnyNumeric = now
    .extract_part(DateTimeParts::Epoch)
    .expect("Unable to extract epoch from clock timestamp")
    * 1000;

let epoch_in_millis_normalized = epoch_in_millis_numeric.floor().normalize().to_owned();
let millis = epoch_in_millis_normalized
    .parse::<u64>()
    .expect("Unable to convert from timestamp from type AnyNumeric to u64");

To generate random bytes, define a helper method that returns a vector of 16 random u8 using the rand Rust crate. Not all 16 u8 are needed; in practice, we only need 10 of them.

fn rng_bytes() -> [u8; 16] {
    rand::random()
}

Next, you break down the timestamp into a u32 and a u16 such that you can later encode it into the first 48 bits of a UUID. You also need to set the version and variant in the respective field in the UUID. To do this, our implementation uses the uuid Rust crate  as reference.[1]

fn encode_unix_timestamp_millis(millis: u64, random_bytes: &[u8; 10]) -> (u32, u16, u16, [u8; 8]) {
    let millis_high = ((millis >> 16) & 0xFFFF_FFFF) as u32;

    let millis_low = (millis & 0xFFFF) as u16;

    let random_and_version =
        (random_bytes[1] as u16 | ((random_bytes[0] as u16) << 8) & 0x0FFF) | (0x7 << 12);

    let mut d4 = [0; 8];

    d4[0] = (random_bytes[2] & 0x3F) | 0x80;
    d4[1] = random_bytes[3];
    d4[2] = random_bytes[4];
    d4[3] = random_bytes[5];
    d4[4] = random_bytes[6];
    d4[5] = random_bytes[7];
    d4[6] = random_bytes[8];
    d4[7] = random_bytes[9];

    (millis_high, millis_low, random_and_version, d4)
}

You can now construct a UuidBytes (a vector of 16 u8) by combining all the components from earlier. Similarly, the implementation uses the Rust crate uuid as reference.[1]

fn generate_uuid_bytes_from_fields(d1: u32, d2: u16, d3: u16, d4: &[u8; 8]) -> UuidBytes {
    [
        (d1 >> 24) as u8,
        (d1 >> 16) as u8,
        (d1 >> 8) as u8,
        d1 as u8,
        (d2 >> 8) as u8,
        d2 as u8,
        (d3 >> 8) as u8,
        d3 as u8,
        d4[0],
        d4[1],
        d4[2],
        d4[3],
        d4[4],
        d4[5],
        d4[6],
        d4[7],
    ]
}

Finally, you can construct a UUID using UuidBytes with the UUID::from_bytes API provided by pgrx.

Exploring the uuid_v7_to_timestamptz function

Next, let’s look at the uuid_v7_to_timestamptz function. You can find the full implementation of the function in the Github repository. This function accepts a UUID as an argument and returns the associated timestamp as a timetamptz PostgreSQL type.

Recall that a UUID is represented as a vector of 16 u8, which sums up to 128 bits. Because the timestamp is encoded in the first 48 bits of the UUIDv7, you will retrieve the first six u8. With the six u8 represented as a byte array in big endian format, you can then construct a u64 from it. This value represents the number of milliseconds between the timestamp since 1970-01-01 00:00:00 UTC.

At this point, you can convert this into a PostgreSQL timestamptz type using the to_timestamp PostgreSQL function. The pgrx framework provides an equivalent function called to_timestamp in Rust. Before that, you should divide the value by 1,000 as a Rust f64 type. You do this because the to_timestamp function in Rust accepts an f64 as argument, and the value needs to be converted from millisecond units to seconds.

CREATE FUNCTION uuid_v7_to_timestamptz(uuid UUID)
RETURNS timestamptz
as $$
    // The timestamp of the uuid is encoded in the first 48 bits.
    // To retrieve the timestamp in milliseconds, convert the first
    // six u8 encoded in Big-endian format into a u64.
    let uuid_bytes = uuid.as_bytes();
    let mut timestamp_bytes = [0u8; 8];
    timestamp_bytes[2..].copy_from_slice(&uuid_bytes[0..6]);
    let millis = u64::from_be_bytes(timestamp_bytes);

    // The postgres to_timestamp function takes a double as argument,
    // whereas the pgrx::to_timestamp takes a f64 as arugment.
    // Since the timestamp in uuid was computed from extracting the unix epoch
    // and multiplying by 1000, here we divide it by 1000 to get the precision we
    // need and convert into a f64.
    let epoch_in_seconds_with_precision = millis as f64 / 1000 as f64;

    Ok(Some(pgrx::to_timestamp(epoch_in_seconds_with_precision)))
$$ LANGUAGE plrust
STRICT VOLATILE;

Exploring the timestamptz_to_uuid_v7 function

Finally, let’s look at the timestamptz_to_uuid_v7 function. This function takes a timestamptz PostgreSQL type as argument and returns a UUIDv7 for the given timestamp. You can find the full implementation of the function in the Github repository.

The implementation of this function is similar to the generate_uuid_v7 function. The only difference between these two functions is that the timestamptz_to_uuid_v7 function accepts a given timestamp whereas generate_uuid_v7 uses the current timestamp to generate the UUIDv7. Therefore, you can directly generate a UUIDv7 with the given timestamptz instead of using the timestamp returned from the clock_timestamp API. The remaining implementation remains the same as generate_uuid_v7.

Comparing performance of generating UUIDv7 with PL/pgSQL

Let’s compare the performance between a solution using PL/pgSQL, our solution using PL/Rust, and the existing open source pg_uuidv7 extension.

In this test, you can use pgbench to measure the transactions per second to generate UUIDv7s for each option. Use the following command to continuously generate UUIDv7s with one concurrent database client for 30 seconds on an r6i.xlarge instance.

pgbench --client=1 -T 30

For PL/pgSQL, use the implementation from the following Github repository. For PL/Rust, use the generate_uuid_v7 function. For C, use the uuid_generate_v7 function from the open source pg_uuidv7 extension.

You can observe the following result from the PL/pgSQL implementation:

scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 133893
number of failed transactions: 0 (0.000%)
latency average = 0.225 ms
initial connection time = 3.123 ms
tps = 4443.079779 (without initial connection time)

You can observe the following result from using the pg_uuidv7 extension which is written in C:

scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 191788
number of failed transactions: 0 (0.000%)
latency average = 0.157 ms
initial connection time = 3.108 ms
tps = 6353.201681 (without initial connection time)

Finally, you can observe the following result from using the generate_uuid_v7 function implemented with PL/Rust:

scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
duration: 30 s
number of transactions actually processed: 203792
number of failed transactions: 0 (0.000%)
latency average = 0.147 ms
initial connection time = 3.031 ms
tps = 6793.741512 (without initial connection time)

For this experiment, the transactions per second (TPS) of using the function in PL/pgSQL is close to 30% less than the PL/Rust function. The TPS from using the function implemented with PL/Rust is comparable to that of using an UUIDv7 extension that was implemented in C.

Clean up

Uninstall the uuid_v7 extension with following command:

DROP EXTESNION uuid_v7 CASCADE;

You will see the following output:

DROP EXTENSION

You can delete your RDS for PostgreSQL instance if you no longer need to use it.

Conclusion

In this post, we covered some challenges that might arise from the lack of support for UUIDv7. We also covered how Trusted Language Extensions for PostgreSQL (pg_tle) can help you start using UUIDv7. We explored the various functionalities that are provided in our solution, and dived deep into how the solution is implemented and the rationale behind it. Finally, we compared the performance of solutions implemented in different languages, and demonstrated that our solution implemented with pg_tle and PL/Rust is able to perform at a level that’s comparable to the pg_uuidv7 PostgreSQL C extension.

We invite you to leave feedback in the comments section.


About the author

Anthony Leung is a Software Development Engineer focusing on managed PostgreSQL at AWS. He has worked on developing multiple features that are currently offered in Amazon RDS for PostgreSQL.

[1] The Uuid Project is copyright 2013-2014, The Rust Project Developers and copyright 2018, The Uuid Developer. Uuid code is licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.