Build faster with Amazon DynamoDB and PartiQL: SQL-compatible operations

In November 2020, we launched DynamoDB support for PartiQL. With PartiQL, you can codify your DynamoDB data operations using familiar SQL syntax and get the fast, consistent performance that DynamoDB customers have long depended on. In this post, we explain how DynamoDB’s PartiQL support helps new DynamoDB developers learn faster and provides existing DynamoDB developers with powerful new tools. We also share some example PartiQL statements and introduce the associated API actions. And we show these in use with a popular AWS SDK.

Why use a SQL interface?

DynamoDB is purpose-built to provide consistent fast performance, high availability, and scaling elasticity with a flexible schema, support for nested data, and a serverless infrastructure experience. When we launched DynamoDB back in 2012, there was no standardized query language suitable for DynamoDB, and creating our own DynamoDB-specific SQL dialect did not feel like the right approach. To be clear, this was not a dogmatic decision—forcing a space into the term “NoSQL” was never the goal.

Since then, developers learned to use the DynamoDB-specific interface, but many customers continue to ask for the familiarity of SQL as they build new applications and (especially) as they migrate existing workloads from relational databases where SQL is standard. Now you can use a familiar interface for data operations, and build faster – without compromising the characteristics of DynamoDB.

With the August 2019 announcement of PartiQL, AWS introduced an open-source, SQL-compatible query language that makes it easy to work with data across differing indexed stores, regardless of the level of structure. PartiQL provides extensions to SQL, which intuitively allows for semi-structured (document form) data storage with nesting. AWS data services and features including Amazon S3 Select, Amazon Redshift, and Amazon Quantum Ledger Database (Amazon QLDB) support PartiQL.

For development teams with varying levels of DynamoDB experience but a solid foundation in SQL, DynamoDB’s PartiQL support provides a familiar path to interface proficiency and a common basis for understanding and maintaining data operations in code.

You can mix and match use of the original DynamoDB API actions and the new PartiQL API actions as you wish, to best meet your application needs. Now, let’s work through some common database operations using PartiQL statements for DynamoDB.

Example PartiQL statements for DynamoDB access

To demonstrate the possibilities with PartiQL statements and DynamoDB, we use some simple data scenarios. You can create the tables as described, then copy and paste the PartiQL statements into the PartiQL editor on the DynamoDB console to see how they work. Remember to delete these resources if they are no longer needed so you are no longer billed for them.

It’s important to note that the PartiQL syntax follows SQL, where double quotation marks ("...") denote a quoted identifier, and single quotation marks ('...') denote a string. When we INSERT a new item, the VALUE specified is a document—a lot like JSON, except that we defer to SQL syntax and use single quotes around attribute names and string values.

Working with a simple primary key table

The purchases table stores a historical record of various types of purchases, and it has a simple primary key; the partition key attribute is a string named purchaseId. We start by inserting two records—the first is for a purchase for which we know only the price, but for the second we have some further information about the transaction, including a list, a map, and a string set (visit the Amazon DynamoDB Developer Guide to learn more about the supported data types).

INSERT INTO purchases VALUE { 'purchaseId' : 'purchase#1', 'price' : 73.82}

INSERT INTO purchases VALUE { 'purchaseId' : 'purchase#2', 'price' : 27.94, 'version' : 1, 'vendorId' : 'store#7', 'mylist' : [ 'Cookies', 'Coffee', 5.99 ], 'mymap' : { 'Pens' : 43, 'Status' : 'IN-STOCK'}, 'mystringset' : << 'Red', 'Green', 'Blue' >>}

Next, we read those records—for the first case, we return all fields to the client, but in the second we dig into the document to return data selectively:

SELECT * FROM purchases WHERE purchaseId = 'purchase#1'

SELECT price,mylist[1],mymap.Pens,mystringset FROM purchases WHERE purchaseId = 'purchase#2'

We can use UPDATE and DELETE statements to make changes to these records. The first example in the following code is an unconditional update to an attribute value in an existing item—we update the price for the purchase#1 item. The second is conditional—we update the purchase#2 item to add a value to our list, add an attribute-value pair to our map, add a new string to our string set, and we increment the version—but only if the version attribute has the value we expect. This might represent a multi-version concurrency control pattern, essentially an optimistic lock.

UPDATE purchases SET price = '84.33' WHERE purchaseId = 'purchase#1'

UPDATE purchases SET version = version + 1 SET mylist = list_append(mylist,['Music']) SET mymap.Pencils = 27 SET mystringset = set_add(mystringset,<<'AWS-Orange'>>) WHERE purchaseId = 'purchase#2' AND version = 1

Now that we’re done with these items, let’s delete them:

DELETE FROM purchases WHERE purchaseId = 'purchase#1'

DELETE FROM purchases WHERE purchaseId = 'purchase#2'

Working with a composite primary key table and secondary index

The addresses table is where we store shipping addresses of customers and it has a composite primary key; the partition key attribute is a string named custId and the sort key attribute is a string named addressId. To allow querying for customer shipping addresses by ZIP code, the addresses table has a global secondary index (GSI) named zipIndex. The zipIndex indexing key is also composite—the partition key attribute is custId (string) and sort key attribute is zip (number).

In the following code, we insert a few shipping addresses: two for customer 897 and one for customer 234:

INSERT INTO addresses VALUE { 'custId' : 'cust#897', 'addressId' : 'addr#1', 'street' : '111 First St', 'city' : 'Anytown', 'zip' : 34512}

INSERT INTO addresses VALUE { 'custId' : 'cust#897', 'addressId' : 'addr#2', 'street' : '222 Second St', 'city' : 'Otherville', 'zip' : 98721, 'type' : 'TEMPORARY'}

INSERT INTO addresses VALUE { 'custId' : 'cust#234', 'addressId' : 'addr#1', 'street' : '333 Third St', 'city' : 'Otherville', 'zip' : 98721}

Now we can use the indexing to efficiently retrieve a single address record by fully specifying the primary key with WHERE clauses:

SELECT * FROM addresses WHERE custId = 'cust#897' AND addressId = 'addr#2'

Alternatively, we can retrieve all the addresses for a particular customer by placing no condition on the value of the sort key:

SELECT * FROM addresses WHERE custId = 'cust#897'

We can also use our secondary index to retrieve a customer’s shipping addresses in a particular ZIP code or ZIP prefix as a range. In the following code, we specify the index to query by giving the table name and the secondary index name separated by a period:

SELECT * FROM addresses.zipIndex WHERE custId = 'cust#897' AND zip = 34512

SELECT * FROM addresses.zipIndex WHERE custId = 'cust#897' AND zip BETWEEN 34500 AND 34599

Scanning a table

All the examples for SELECT queries so far have narrowed the request by specifying a value of partition key with a WHERE clause. Without this specificity, DynamoDB makes paginated reads of all items in the table—a scan operation. A table scan is a very powerful tool for particular requirements, but if you’d like to limit scan operations and encourage the use of indexes, you can apply a suitable AWS Identity and Access Management (IAM) policy. For more information, see Example: Allow Select statements and deny full table scan statements in PartiQL for DynamoDB.

The following PartiQL statement results in a table scan—up to 1MB of results are returned in each page, and the client can choose to read subsequent pages with additional requests if desired. The scan example applies a WHERE clause to filter the results by value of a non-index attribute and returns only a subset of the items that were scanned—this doesn’t reduce the read unit consumption associated with scanning a full page.

SELECT * FROM addresses WHERE street = '222 Second St'

A scan can be the efficient choice for occasional requirements, or where minimal filtering is applied and most of the items to be read provide value. For frequent use and more selective filtering, it may be better to build an index to optimize the reads.

New possibilities for querying DynamoDB

In addition to bringing a familiar SQL syntax, PartiQL actually enables powerful new read patterns to help solve for additional design requirements. To illustrate what we mean, let’s start by loading some additional items.

INSERT INTO purchases VALUE { 'purchaseId' : 'purchase#5', 'price' : 19.74}

INSERT INTO purchases VALUE { 'purchaseId' : 'purchase#3', 'price' : 18.05}

INSERT INTO purchases VALUE { 'purchaseId' : 'purchase#4', 'price' : 9.17}

Now that we’ve loaded our data, we can SELECT across multiple items or item collections using an IN clause, and sort the results in ascending or descending order for the partition key value and the sort key value (independently).

SELECT * FROM purchases WHERE purchaseId IN ['purchase#5', 'purchase#3', 'purchase#4'] ORDER BY purchaseId ASC

SELECT * FROM addresses.zipIndex WHERE "custId" IN ['cust#897', 'cust#234'] ORDER BY custId ASC, zip DESC

A DynamoDB item collection is like materializing your JOIN – we scale it horizontally with storage instead of compute, and you can selectively update parts of your item collection, in addition to making a selective Query over sorted segments. If you’ve modeled data this way before, you may have wished for a batched Query operation. DynamoDB’s PartiQL support delivers this via the IN clause. For example, you can now make a single request to DynamoDB to return all the items with sort key value greater than the current date, for item collections defined by partition key values in the set that you choose.

PartiQL in the DynamoDB APIs, AWS CLI, and SDKs

Three new low-level API actions bring PartiQL support to DynamoDB: ExecuteStatement, BatchExecuteStatement, and ExecuteTransaction. Each of these actions requires PartiQL statements and an optional ordered set of parameter values. Matching commands now are available in the AWS Command Line Interface (AWS CLI), and we have added low-level API support to SDKs for all the popular programming languages. Supported PartiQL statements include SELECT, UPDATE, INSERT, and DELETE directives.

ExecuteStatement

Use the ExecuteStatement API action to read items from a table or write a single item specified by its primary key. If you’re familiar with the existing API actions of DynamoDB, you can consider ExecuteStatement to be a PartiQL-enabled encapsulation of the already existing GetItem, PutItem, UpdateItem, DeleteItem, and Query actions. If the SELECT statement doesn’t include a WHERE clause to specify the value of the partition key attribute, the read is an unindexed scan—similar to the original Scan API action. This API action is paginated, and you can use a condition on an IAM policy to accept or deny unindexed scans according to your requirements.

BatchExecuteStatement

The BatchExecuteStatement API action brings PartiQL statement support for batched reads and batched writes. You might be familiar with the long-supported BatchGetItem and BatchWriteItem API actions for DynamoDB. The BatchExecuteStatement action provides similar functionality. It supports a batch of up to 25 items for reads (selects) or up to 25 items for writes (inserts, updates, or deletes). Batched statements must reference items by fully specifying their primary key with WHERE clauses. Batched update and delete operations can be conditioned on values of non-key attributes.

ExecuteTransaction

DynamoDB supports coordinated multi-item actions—consistent reads, condition checks, inserts, updates, or deletes—as a singular logical business operation. The ExecuteTransaction API action supports up to 100 PartiQL statements that specify the primary key for an item. The statements must all represent reads (SELECT) or writes and condition checks (INSERT, UPDATE, DELETE, and EXISTS). All statements must succeed, or they all fail. For more information about DynamoDB transactions, see Amazon DynamoDB Transactions.

Benefits of BatchExecuteStatement

You can benefit from the extended support of batched writes in the new BatchExecuteStatement API action. Allowing for update statements in the batch and offering conditional support for both updates and deletes adds powerful functionality. In one efficient API call, you can mutate multiple items—each succeeding or failing based on specific requirements for non-key attribute values.

The following examples show BatchExecuteStatement in action by using the AWS CLI and an input JSON file.

The following command line uses the AWS CLI to create a BatchExecuteStatement call to DynamoDB, with a set of PartiQL statements read from a JSON file:

aws dynamodb batch-execute-statement -–statements file://partiqlbatch.json

The JSON file read by our command line (partiqlbatch.json) contains the following PartiQL statements formatted within a JSON structure:

[
    {
        "Statement": "INSERT INTO addresses VALUE {'custId' : 'cust#329', 'addressId' : 'addr#1', 'street' : '444 Fourth St', 'city' : 'Some City', 'zip' : 22122}"
    },
    {
        "Statement": "DELETE FROM addresses WHERE custId = 'cust#897' AND addressId = 'addr#2' AND type = 'TEMPORARY'"
    },
    {
        "Statement": "UPDATE addresses SET zip = 34513 WHERE custId = 'cust#897' AND addressId = 'addr#1' AND zip = 34512"
    }
]

BatchExecuteStatement represents an entirely new batched operation experience for DynamoDB users. We hope you will be pleased to find that BatchExecuteStatement always returns a success response to the overall request, and also returns a list of individual operation responses that preserves order. You can use any individual error responses to determine the retry strategy.

Example Python code using PartiQL support

All the AWS SDKs support the new PartiQL API actions in DynamoDB. To show how to use these new SQL-compatible statements from an SDK, we refer to an example from the DynamoDB tutorial Create and Manage a Nonrelational Database. Step 3 of Module 2 in the tutorial demonstrates the following Python script that uses the AWS SDK (Boto3) to retrieve an item by using the GetItem API action:

import boto3
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('Books')
resp = table.get_item(Key={"Author": "John Grisham", "Title": "The Rainmaker"})
print(resp['Item'])

You can achieve an equivalent result by using a PartiQL statement and the new ExecuteStatement API, as shown in the following code:

import boto3
dynamodb = boto3.client('dynamodb', region_name='us-east-1')
resp = dynamodb.execute_statement(Statement='SELECT * FROM Books WHERE Author = \'John Grisham\' AND Title = \'The Rainmaker\'')
print(resp['Items'])

An alternative approach uses parameters specified outside of the PartiQL statement itself. This might be simpler to develop around and also might provide an additional level of protection against a SQL injection exploit. Compare the following Python Boto3 code with the preceding example:

import boto3
dynamodb = boto3.client('dynamodb', region_name='us-east-1')
resp = dynamodb.execute_statement(Statement='SELECT * FROM Books WHERE Author = ? AND Title = ?', Parameters =[{'S': 'John Grisham'}, {'S': 'The Rainmaker'}])
print(resp['Items'])

Summary

In this post, we explained how you can use PartiQL in DynamoDB to describe your data operations using SQL-compatible statements. The PartiQL support in DynamoDB provides a familiar path so you can build faster, and also provides powerful new batched capabilities.

The DynamoDB API actions supporting PartiQL statements are available in 23 AWS Regions. As you test and build on these PartiQL statements via the AWS CLI and your preferred SDK, you can send us feedback on the @DynamoDB Twitter handle.

About the Authors

Pete Naylor started at AWS in a Technical Account Manager role, supporting Amazon.com as a customer, and focusing on their database modernization goals. Then he worked as a Specialist Solutions Architect for Amazon DynamoDB and helped many other customers to achieve their own database goals. Now working in the DynamoDB team as a Technical Product Manager, Pete enjoys working closely with the engineers to define and build new features.

Akshat Vig is a Principal Engineer at Amazon Web Services. Akshat has worked on multiple services including SimpleDB, DynamoDB, and Keyspaces at Amazon since 2010. Akshat is currently focused on DynamoDB. Akshat really enjoys solving distributed system problems at scale. Outside of work, Akshat spends time cycling long distances.

AWS Database Blog