Introducing S3Link to DynamoDBContext

by Mason Schneider | in .NET

S3Link has been in the AWS SDK for Java for a while now, and we have decided to introduce it to the AWS SDK for .NET as well. This feature allows you to access your Amazon S3 resources easily through a link in your Amazon DynamoDB data. S3Link requires minimal configuration when used with the .NET DynamoDB Object Persistence Model. To use S3Link, simply add it as a property to your DynamoDB annotated class and create a bucket in Amazon S3. The following Book class has an S3Link property named CoverImage.

// Create a class for DynamoDBContext
[DynamoDBTable("Library")]
public class Book
{
	[DynamoDBHashKey]   
	public int Id { get; set; }

	public S3Link CoverImage { get; set; }

	public string Title { get; set; }
	public int ISBN { get; set; }

	[DynamoDBProperty("Authors")]    
	public List<string> BookAuthors { get; set; }
}

Now that we have an S3Link in our annotated class, we are ready to manage an S3 object. The following code does four things:

  1. Creates and saves a book to DynamoDB
  2. Uploads the cover of the book to S3
  3. Gets a pre-signed URL to the uploaded object
  4. Loads the book back in using the Context object and downloads the cover of the book to a local file

// Create a DynamoDBContext
var context = new DynamoDBContext();

// Create a book with an S3Link
Book myBook = new Book
{
	Id = 501,
	CoverImage = S3Link.Create(context, "myBucketName", "covers/AWSSDK.jpg", Amazon.RegionEndpoint.USWest2),
	Title = "AWS SDK for .NET Object Persistence Model Handling Arbitrary Data",
	ISBN = 999,
	BookAuthors = new List<string> { "Jim", "Steve", "Pavel", "Norm", "Milind" }
};

// Save book to DynamoDB
context.Save(myBook);

// Use S3Link to upload the content to S3
myBook.CoverImage.UploadFrom("path/to/covers/AWSSDK.jpg");

// Get a pre-signed URL for the image
string coverURL = myBook.CoverImage.GetPreSignedURL(DateTime.Now.AddHours(5));

// Load book from DynamoDB
myBook = context.Load<Book>(501);

// Download file linked from S3Link
myBook.CoverImage.DownloadTo("path/to/save/cover/otherbook.jpg");

And that’s the general use for S3Link. Simply provide it a bucket and a key, and then you can upload and download your data.

Using Improved Conditional Writes in DynamoDB

by David Yanacek | in Java

Last month the Amazon DynamoDB team announced a new pair of features: Improved Query Filtering and Conditional Updates.  In this post, we’ll show how to use the new and improved conditional writes feature of DynamoDB to speed up your app.

Let’s say you’re building a racing game, where two players advance in position until they reach the finish line.  To manage the state in DynamoDB, each game could be stored in its own Item in DynamoDB, in a Game table with GameId as the primary key, and each player position stored in a different attribute.  Here’s an example of what a Game item could look like:

    {
        "GameId": "abc",
        "Status": "IN_PROGRESS",
        "Player1-Position": 0,
        "Player2-Position": 0
    }

To make players move, you can use the atomic counters feature of DynamoDB in the UpdateItem API to send requests like, “increase the player position by 1, regardless of its current value”.  To prevent players from advancing before the game starts, you can use conditional writes to make the same request as before, but only “as long as the game status is IN_PROGRESS.”  Conditional writes are a way of instructing DynamoDB to perform a given write request only if certain attribute values in the item match what you expect them to be at the time of the request.
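The semantics of a conditional write can be sketched with a plain dict standing in for the item (Python purely for illustration; the names `conditional_increment` and `ConditionalCheckFailed` are hypothetical, not SDK APIs):

```python
# Illustrative sketch of DynamoDB conditional-write semantics, using a plain
# dict as the "item". Hypothetical names, not part of any SDK.

class ConditionalCheckFailed(Exception):
    pass

def conditional_increment(item, attr, expected):
    """Add 1 to item[attr], but only if every (name, value) pair
    in `expected` matches the item's current contents."""
    for name, value in expected.items():
        if item.get(name) != value:
            raise ConditionalCheckFailed(name)
    item[attr] = item.get(attr, 0) + 1

game = {"Status": "IN_PROGRESS", "Player1-Position": 0, "Player2-Position": 0}
conditional_increment(game, "Player1-Position", {"Status": "IN_PROGRESS"})
# The same call made after the game ends (Status != IN_PROGRESS) would raise
# ConditionalCheckFailed and leave the item untouched.
```

In the real service, the check and the write happen atomically on the server side, so no other writer can slip in between them.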

But this isn’t the whole story.  How do you determine the winner of the game, and prevent players from moving once the game is over?  In other words, we need a way to atomically make it so that all players stop once one reaches the end of the race (no ties allowed!).

This is where the new improved conditional writes come in handy.  Before, the conditional writes feature supported tests for equality (attribute “x” equals “20”).  With improved conditions, DynamoDB supports tests for inequality (attribute “x” is less than “20”).  This is useful for the game application, because now the request can be, “increase the player position by 1 as long as the status of the game equals IN_PROGRESS, and the positions of player 1 and player 2 are less than 20.”  During player movement, one player will eventually reach the finish line first, and any future moves after that will be blocked by the conditional writes.  Here’s the code:


    public static void main(String[] args) {

        // To run this example, first initialize the client, and create a table
        // named 'Game' with a primary key of type hash / string called 'GameId'.
        
        AmazonDynamoDB dynamodb = new AmazonDynamoDBClient(); // initialize the client
        
        try {
            // First set up the example by inserting a new item
            
            // To see different results, change either player's
            // starting positions to 20, or set player 1's location to 19.
            Integer player1Position = 15;
            Integer player2Position = 12;
            dynamodb.putItem(new PutItemRequest()
                    .withTableName("Game")
                    .addItemEntry("GameId", new AttributeValue("abc"))
                    .addItemEntry("Player1-Position",
                        new AttributeValue().withN(player1Position.toString()))
                    .addItemEntry("Player2-Position",
                        new AttributeValue().withN(player2Position.toString()))
                    .addItemEntry("Status", new AttributeValue("IN_PROGRESS")));
            
            // Now move Player1 for game "abc" by 1,
            // as long as neither player has reached "20".
            UpdateItemResult result = dynamodb.updateItem(new UpdateItemRequest()
                .withTableName("Game")
                .withReturnValues(ReturnValue.ALL_NEW)
                .addKeyEntry("GameId", new AttributeValue("abc"))
                .addAttributeUpdatesEntry(
                     "Player1-Position", new AttributeValueUpdate()
                         .withValue(new AttributeValue().withN("1"))
                         .withAction(AttributeAction.ADD))
                .addExpectedEntry(
                     "Player1-Position", new ExpectedAttributeValue()
                         .withValue(new AttributeValue().withN("20"))
                         .withComparisonOperator(ComparisonOperator.LT))
                .addExpectedEntry(
                     "Player2-Position", new ExpectedAttributeValue()
                         .withValue(new AttributeValue().withN("20"))
                         .withComparisonOperator(ComparisonOperator.LT))
                .addExpectedEntry(
                     "Status", new ExpectedAttributeValue()
                         .withValue(new AttributeValue().withS("IN_PROGRESS"))
                         .withComparisonOperator(ComparisonOperator.EQ)));
            if ("20".equals(result.getAttributes().get("Player1-Position").getN())) {
                System.out.println("Player 1 wins!");
            } else {
                System.out.println("The game is still in progress: "
                    + result.getAttributes());
            }
        } catch (ConditionalCheckFailedException e) {
            System.out.println("Failed to move player 1 because the game is over");
        }
    }

With this algorithm, player movement now takes only one write operation to DynamoDB.  What would it have taken without improved conditions?  Using only equality conditions, the app would have needed to follow the read-modify-write pattern:

  1. Read each item, making note of each player’s position, and verify that neither player already reached the end of the race.
  2. Advance the player’s position by 1, with a condition that both players were still in the positions we read in step 1.

Notice that this algorithm requires two round-trips to DynamoDB, whereas with improved conditions, it can be done in only one round-trip.  This reduces both latency and cost.
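The single-round-trip move can be sketched as one check-and-increment (Python for illustration only; `move_player` and `ConditionalCheckFailed` are hypothetical names):

```python
# Sketch of the one-round-trip move using inequality conditions. In DynamoDB
# the whole check-and-increment is a single UpdateItem request:
# "+1, as long as the game is in progress and both players are < finish".

class ConditionalCheckFailed(Exception):
    pass

def move_player(game, player, finish=20):
    if game["Status"] != "IN_PROGRESS":
        raise ConditionalCheckFailed("Status")
    if game["Player1-Position"] >= finish or game["Player2-Position"] >= finish:
        raise ConditionalCheckFailed("a player has already finished")
    game[player] += 1

game = {"Status": "IN_PROGRESS", "Player1-Position": 19, "Player2-Position": 12}
move_player(game, "Player1-Position")
# Player 1 is now at 20; every later move fails the "< 20" condition,
# so no ties are possible.
```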

You can find more information about conditional writes in Amazon DynamoDB in the Developer Guide.

Performing Conditional Writes Using the Amazon DynamoDB Transaction Library

by Wade Matveyenko | in Java

Today we’re lucky to have another guest post by David Yanacek from the Amazon DynamoDB team. David is sharing his deep knowledge of the Amazon DynamoDB transaction library to help explain how to use it with the conditional writes feature of Amazon DynamoDB.


The DynamoDB transaction library provides a convenient way to perform atomic reads and writes across multiple DynamoDB items and tables. The library does all of the nuanced item locking, commits, applies, and rollbacks for you, so that you don’t have to worry about building your own state machines or other schemes to make sure that writes eventually happen across multiple items. In this post, we demonstrate how to use the read-modify-write pattern with the transaction library to accomplish the same atomic checks you were used to getting by using conditional writes with the vanilla DynamoDB API.

The transaction library exposes as much of the low-level Java API as possible, but it does not support conditional writes out of the box. Conditional writes are a way of asking DynamoDB to perform a write operation like PutItem, UpdateItem, or DeleteItem, but only if certain attributes of the item still have the values that you expect, right before the write goes through. Instead of exposing conditional writes directly, the transaction library enables the read-modify-write pattern—just like the pattern you’re used to with transactions in an RDBMS. The idea is to start a transaction, read items using that transaction, validate that those items contain the values you expect to start with, write your changes using that same transaction, and then commit the transaction.  If the commit() call succeeds, it means that the changes were written atomically, and none of the items in the transaction were modified by any other transaction in the meantime, starting from the time when each item was read by your transaction.

Transaction library recap

Let’s say you’re implementing a tic-tac-toe game. You have an Item in a DynamoDB table representing a single match of the game, with an attribute for each position in the board (Top-Left, Bottom-Right, etc.). Also, to make this into a multi-item transaction, let’s add two more items—one per player in the game, each with an attribute saying whether it is currently that player’s turn or not. The items might look something like this:

Games table item:

{
  "GameId": "cf3df",
  "Turn": "Bob",
  "Top-Right": "O"
}

Users table items:

{
  "UserId": "Alice",
  "IsMyTurn": 0
}

{
  "UserId": "Bob",
  "IsMyTurn": 1
}

Now when Bob plays his turn in the game, all three items need to be updated:

  1. The Bob record needs to be marked as "Not my turn anymore."
  2. The Alice record needs to be marked as "It’s my turn now."
  3. The Game record needs to be marked as "It’s Alice’s turn, and also the Top-Left has an X in it."

If you write your application so that it performs three UpdateItem operations in a row, a few problems could occur. For example, your application could crash after doing one of the writes, and now something else in your application would need to notice this and pick up where it left off before doing anything else in the game. Fortunately, the transaction library can make these three separate operations happen together in a transaction, where either all of the writes go through together, or if there is another transaction overlapping with yours at the same time, only one of those transactions happens.
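The all-or-nothing behavior can be sketched with an in-memory toy (Python purely for illustration; the real transaction library coordinates this across DynamoDB items with locking, commit records, and recovery):

```python
# Toy all-or-nothing "transaction" over in-memory tables. Illustration only;
# the class and method names here are hypothetical.
import copy

class Transaction:
    def __init__(self, tables):
        self.tables = tables
        self.pending = []

    def update_item(self, table, key, updates):
        # Writes are queued, not applied, until commit().
        self.pending.append((table, key, updates))

    def commit(self):
        snapshot = copy.deepcopy(self.tables)
        try:
            for table, key, updates in self.pending:
                self.tables[table][key].update(updates)
        except Exception:
            # Any failure rolls everything back: all writes apply, or none do.
            for name in snapshot:
                self.tables[name] = snapshot[name]
            raise

tables = {"Users": {"Alice": {"IsMyTurn": 0}, "Bob": {"IsMyTurn": 1}},
          "Games": {"cf3df": {"Turn": "Bob"}}}
t = Transaction(tables)
t.update_item("Users", "Alice", {"IsMyTurn": 1})
t.update_item("Users", "Bob", {"IsMyTurn": 0})
t.update_item("Games", "cf3df", {"Turn": "Alice", "Top-Left": "X"})
t.commit()  # all three items change together
```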

The code for doing this in a transaction looks like this:

// Start a new transaction
Transaction t = txManager.newTransaction();
 
// Update Alice's record to let her know that it is now her turn.
t.updateItem(
  new UpdateItemRequest()
    .withTableName("Users")
    .addKeyEntry("UserId", new AttributeValue("Alice"))
    .addAttributeUpdatesEntry("IsMyTurn",
            new AttributeValueUpdate(new AttributeValue("1"), AttributeAction.PUT)));
 
// Update Bob's record to let him know that it is not his turn anymore.
t.updateItem(
  new UpdateItemRequest()
    .withTableName("Users")
    .addKeyEntry("UserId", new AttributeValue("Bob"))
    .addAttributeUpdatesEntry("IsMyTurn",
            new AttributeValueUpdate(new AttributeValue("0"), AttributeAction.PUT)));
 
// Update the Game item to mark the spot that was played, and make it Alice's turn now.
t.updateItem(
  new UpdateItemRequest()
    .withTableName("Games")
    .addKeyEntry("GameId", new AttributeValue("cf3df"))
    .addAttributeUpdatesEntry("Top-Left", 
            new AttributeValueUpdate(new AttributeValue("X"), AttributeAction.PUT))
    .addAttributeUpdatesEntry("Turn",
            new AttributeValueUpdate(new AttributeValue("Alice"), AttributeAction.PUT)));
 
// If no exceptions are thrown by this line, it means that the transaction was committed.
t.commit();

What about conditional writes?

The preceding code makes sure that the writes go through atomically, but that’s not enough logic for making a move in the game. We need to make sure that, when the transaction goes through, there wasn’t a transaction right before it where Bob already played his turn. In other words, how do we make sure that Bob doesn’t play twice in a row—for example, by trying to sneak in two turns before Alice has a chance to move? If there was only a single item involved, say the "Games" item, we could accomplish this by using conditional writes (the Expected clause), like so:

// An example of a conditional update using the DynamoDB client (not the transaction library)
dynamodb.updateItem(
  new UpdateItemRequest()
    .withTableName("Games")
    .addKeyEntry("GameId", new AttributeValue("cf3df"))
    .addAttributeUpdatesEntry("Top-Left", 
    		new AttributeValueUpdate(new AttributeValue("X"), AttributeAction.PUT))
    .addAttributeUpdatesEntry("Turn",
    		new AttributeValueUpdate(new AttributeValue("Alice"), AttributeAction.PUT))
    .addExpectedEntry("Turn", new ExpectedAttributeValue(new AttributeValue("Bob"))) // A condition to ensure it's still Bob's turn
    .addExpectedEntry("Top-Left", new ExpectedAttributeValue(false)));               // A condition to ensure the Top-Left hasn't been played

This code now correctly updates the single Game item. However, conditional writes in DynamoDB can only refer to the single item the operation is updating, and our transaction contains three items that need to be updated together, only if the Game is still in the right state. Therefore, we need some way of mixing the original transaction code with these “conditional check” semantics.

Conditional writes with the transaction library

We started off with code for a transaction that coordinated the writes to all three items atomically, but it didn’t ensure that it was still Bob’s turn when it played Bob’s move. Fortunately, adding that check is easy: it’s simply a matter of adding a read to the transaction, and then performing the verification on the client-side. This is sometimes referred to as a "read-modify-write" pattern:

// Start a new transaction, just like before.
Transaction t = txManager.newTransaction();
 
// First, read the Game item.
Map<String, AttributeValue> game = t.getItem(
    new GetItemRequest()
        .withTableName("Games")
        .addKeyEntry("GameId", new AttributeValue("cf3df"))).getItem();
 
// Now check the Game item to ensure it's in the state you expect, and bail out if it's not.
// These checks serve as the "expected" clause.  
if (! "Bob".equals(game.get("Turn").getS())) {
    t.rollback();
    throw new ConditionalCheckFailedException("Bob can only play when it's Bob's turn!");
}
 
if (game.containsKey("Top-Left")) {
    t.rollback();
    throw new ConditionalCheckFailedException("Bob cannot play in the Top-Left because it has already been played.");
}
 
// Again, update Alice's record to let her know that it is now her turn.
t.updateItem(
    new UpdateItemRequest()
        .withTableName("Users")
        .addKeyEntry("UserId", new AttributeValue("Alice"))
        .addAttributeUpdatesEntry("IsMyTurn",
            new AttributeValueUpdate(new AttributeValue("1"), AttributeAction.PUT)));
 
// And again, update Bob's record to let him know that it is not his turn anymore.
t.updateItem(
    new UpdateItemRequest()
        .withTableName("Users")
        .addKeyEntry("UserId", new AttributeValue("Bob"))
        .addAttributeUpdatesEntry("IsMyTurn",
            new AttributeValueUpdate(new AttributeValue("0"), AttributeAction.PUT)));
 
// Finally, update the Game item to mark the spot that was played and make it Alice's turn now.
t.updateItem(
    new UpdateItemRequest()
        .withTableName("Games")
        .addKeyEntry("GameId", new AttributeValue("cf3df"))
        .addAttributeUpdatesEntry("Top-Left", 
            new AttributeValueUpdate(new AttributeValue("X"), AttributeAction.PUT))
        .addAttributeUpdatesEntry("Turn",
            new AttributeValueUpdate(new AttributeValue("Alice"), AttributeAction.PUT)));
 
// If no exceptions are thrown by this line, it means that the transaction was committed without interference from any other transactions.
try {
    t.commit();
} catch (TransactionRolledBackException e) {
    // If any of the items in the transaction were changed or read in the meantime by a different transaction, then this will be thrown.
    throw new RuntimeException("The game was changed while this transaction was happening. You probably want to refresh Bob's view of the game.", e);
}

There are two main differences with the first approach.

  • First, the code calls GetItem on the transaction and checks to make sure the item is in the state your application expects it to be in. If not, it rolls back the transaction and returns an error to the caller. This is done in the same transaction as the subsequent updates. When you read an item in a transaction, the transaction library locks the item in the same way as when you modify it in the transaction. Your application can still read an item without interfering with it while it is locked, but it must do so outside of a transaction, using one of the read isolation levels on the TransactionManager. More about read isolation levels is available in the design document for the transaction library.
  • Next, the code checks for TransactionRolledBackException. This check could have been done in the first example as well, but it’s called out in this example to show what will happen if another transaction either reads or writes any of the items involved in the transaction while yours was going on. When this happens, you might want to retry the whole transaction (start from the beginning—don’t skip any steps), or refresh your client’s view so that they can re-evaluate their move, since the state of the game may have changed.

While the preceding code doesn’t literally use the conditional writes API in DynamoDB (through the Expected parameter), it functionally does the same atomic validation—except with the added capability of performing that check and write atomically across multiple items.
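A minimal sketch of this read-check-write pattern (Python, illustration only; `play_move` is a hypothetical name, and a version counter stands in for the transaction library's item locking):

```python
# Sketch of the optimistic read-check-write pattern. A version number stands
# in for the library's locking; all names here are made up for illustration.

class TransactionRolledBack(Exception):
    pass

class ConditionalCheckFailed(Exception):
    pass

def play_move(store, game_id):
    game, version = store[game_id]            # read within the "transaction"
    # Client-side checks take the place of the Expected clause:
    if game.get("Turn") != "Bob":
        raise ConditionalCheckFailed("Bob can only play when it's Bob's turn!")
    if "Top-Left" in game:
        raise ConditionalCheckFailed("Top-Left has already been played.")
    # The commit succeeds only if the item is unchanged since it was read.
    if store[game_id][1] != version:
        raise TransactionRolledBack("game changed while the move was in flight")
    updated = dict(game)
    updated.update({"Turn": "Alice", "Top-Left": "X"})
    store[game_id] = (updated, version + 1)

store = {"cf3df": ({"Turn": "Bob"}, 0)}
play_move(store, "cf3df")
# A second play_move now fails the "Turn == Bob" check, so Bob cannot
# sneak in two turns before Alice moves.
```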

More info

You can find the DynamoDB transaction library in the AWS Labs repository on GitHub. You’ll also find a more detailed write-up describing the algorithms it uses. You can find more usage information about the transaction library in the blog post that announced the library. And if you want to see some working code that uses transactions, check out TransactionExamples.java in the same repo.

For a recap on conditional writes, see part of a talk called Amazon DynamoDB Design Patterns for Ultra-High Performance Apps from the 2013 AWS re:Invent conference. You may find the rest of the talk useful as well, but the segment on conditional writes is only five minutes long.

Using New Regions and Endpoints

by Jeremy Lindblom | in PHP

Last week, a customer asked us how they could configure the AWS SDK for PHP to use Amazon SES with the EU (Ireland) Region. SES had just released support for the EU Region, but there was no tagged version of the SDK that supported it yet.

Our typical process is to push new support for regions to the master branch of the AWS SDK for PHP repository as soon as possible after they are announced. In fact, at the time that the customer asked us about EU Region support in SES, we had already pushed out support for it. However, if you are using only tagged versions of the SDK, which you should do with production code, then you may have to wait 1 or 2 weeks until a new version of the SDK is released.

Configuring the base URL of your client

Fortunately, there is a way to use new regions and endpoints, even if the SDK does not yet support a new region for a service. You can manually configure the base_url of a client when you instantiate it. For example, to configure an SES client to use the EU Region, do the following:

$ses = Aws\Ses\SesClient::factory(array(
    'key'      => 'YOUR_AWS_ACCESS_KEY_ID',
    'secret'   => 'YOUR_AWS_SECRET_KEY',
    'region'   => 'eu-west-1',
    'base_url' => 'https://email.eu-west-1.amazonaws.com',
));

Remember, you only need to specify the base_url if the SDK doesn’t already support the region. For regions that the SDK does support, the endpoint is automatically determined.
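The behavior can be summed up in a few lines (Python, purely for illustration; `resolve_endpoint` is a made-up name, and not every AWS service follows this exact hostname pattern):

```python
# Illustration of what base_url overrides: with no override, the SDK derives
# the endpoint from the service and region; with one, it is used verbatim.
# (A simplification; real endpoint resolution handles per-service patterns.)

def resolve_endpoint(service, region, base_url=None):
    if base_url is not None:
        return base_url   # new regions, DynamoDB Local, mock servers, ...
    return "https://{0}.{1}.amazonaws.com".format(service, region)

ses_url = resolve_endpoint("email", "eu-west-1")
local_url = resolve_endpoint("dynamodb", "us-east-1",
                             base_url="http://localhost:8000")
```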

To find the correct URL to use for your desired service and region, see the Regions and Endpoints page of the AWS General Reference documentation.

Using the base_url for other reasons

The base_url option can be used for more than just accessing new regions. It can be used to allow the SDK to send requests to any endpoint compatible with the API of the service you are using (e.g., mock/test services, private beta endpoints).

An example of this is the DynamoDB Local tool that acts as a small client-side database and server that mimics Amazon DynamoDB. You can easily configure a DynamoDB client to work with DynamoDB Local by using the base_url option (assuming you have correctly installed and started DynamoDB Local).

$dynamodb = Aws\DynamoDb\DynamoDbClient::factory(array(
    'key'      => 'YOUR_AWS_ACCESS_KEY_ID',
    'secret'   => 'YOUR_AWS_SECRET_KEY',
    'region'   => 'us-east-1',
    'base_url' => 'http://localhost:8000',
));

For more information, see Setting a custom endpoint in the AWS SDK for PHP User Guide.

Using the latest SDK via Composer

If you are using Composer with the SDK, then you have another option for picking up new features, like newly supported regions, without modifying your code. If you need to use a new feature or bugfix that is not yet in a tagged release, you can do so by adjusting the SDK dependency in your composer.json file to use our development alias 2.5.x-dev.

{
    "require": {
        "aws/aws-sdk-php": "2.5.x-dev"
    }
}

Using the development alias, instead of dev-master, is ideal, because if you have other dependencies that require the SDK, version constraints like "2.5.*" will still resolve correctly. Remember that relying on a non-tagged version of the SDK is not recommended for production code.

Configuring DynamoDB Tables for Development and Production

by Norm Johanson | in .NET

The Object Persistence Model API in the SDK uses annotated classes to tell the SDK which table to store objects in. For example, the DynamoDBTable attribute on the Users class below tells the SDK to store instances of the Users class into the "Users" table.

[DynamoDBTable("Users")]
public class Users
{
    [DynamoDBHashKey]
    public string Id { get; set; }

    public string FirstName { get; set; }

    public string LastName { get; set; }
	
    ...
}

A common scenario is to have a different set of tables for production and development. To handle this scenario, the SDK supports setting a prefix in the application’s app.config file with the AWS.DynamoDBContext.TableNamePrefix app setting. This app.config file indicates that all the tables used by the Object Persistence Model should have the "Dev_" prefix.

<appSettings>
  ...
  <add key="AWSRegion" value="us-west-2" />
  <add key="AWS.DynamoDBContext.TableNamePrefix" value="Dev_"/>
  ...
</appSettings>

The prefix can also be modified at run time by setting either the global property AWSConfigs.DynamoDBContextTableNamePrefix or the TableNamePrefix property for the DynamoDBContextConfig used to store the objects.
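The effect of the setting is simple to state (Python sketch, illustration only; `physical_table_name` is a made-up helper, not an SDK API):

```python
# Sketch of what the prefix setting does: the logical table name from the
# [DynamoDBTable] attribute is mapped to a per-environment physical name.
# (Made-up helper for illustration; the real setting is
# AWS.DynamoDBContext.TableNamePrefix.)

def physical_table_name(logical_name, prefix=""):
    return prefix + logical_name

dev_table = physical_table_name("Users", prefix="Dev_")   # development
prod_table = physical_table_name("Users")                 # production
```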


DynamoDB Session Store for Rack Applications

by Loren Segal | in Ruby

Today we are announcing a new RubyGem that enables your Ruby on Rails or Rack-based applications to store session data inside of Amazon DynamoDB. The gem acts as a drop-in replacement for session stores inside of Rails and can also run as a Rack middleware for non-Rails apps. You can read more about how to install and configure the gem on the GitHub repository: aws/aws-sessionstore-dynamodb-ruby. If you want to get started right away, just add the gem to your Gemfile via:

gem 'aws-sessionstore-dynamodb', '~> 1.0'

For me, the best part of this gem is that it was the product of a summer internship project by one of our interns, Ruby Robinson. She did a great job ramping up on new skills and technologies, and ultimately managed to produce some super well-tested and idiomatic code in a very short period of time. Here’s Ruby in her own words:

Hello, my name is Ruby Robinson, and I was a summer intern with the AWS Ruby SDK team. My project was to create a RubyGem (aws-sessionstore-dynamodb) that allowed Rack applications to store sessions in Amazon DynamoDB.

I came into the internship knowing Java and, ironically, not knowing Ruby. It was an opportunity to learn something new and contribute to the community. After poring over a series of books, tutorials, and blogs on Ruby, Ruby on Rails, and Rack, the gem emerged, with help from Loren and Trevor.

Along with creating the gem, I got to experience the Amazon engineering culture. It largely involves taking ownership of projects, innovation, and scalability. I got to meet with engineers who were solving problems at scales I had only heard of. With an Amazon internship, you are not told what to do; you are asked what you are going to do. As my technical knowledge grew, I was able to take ownership of my project and drive it to completion.

In the end I produced a gem with some cool features! The gem is a drop-in replacement for the default session store that gives you the persistence and scale of Amazon DynamoDB. So, what are you waiting for? Check out the gem today!

The experience of bringing a developer from another language into Ruby taught me quite a bit about all of the great things that our ecosystem provides us, and also shined a light on some of the things that are more confusing to newbies. In the end, it was extremely rewarding to watch someone become productive in the language in such a short period of time. I would recommend that everyone take the opportunity to teach new Rubyists the language, if that opportunity ever arises. I think it’s also important that we encourage new developers to become active in the community and write more open source code, since that’s what makes our ecosystem so strong. So, if you know of a new Rubyist in your area, invite them out to your local Ruby meetup or hackfest and encourage them to contribute to some of the projects. You never know, in a few years these might be the people writing and maintaining the library code you depend on every day.

And with that said, please check out our new Amazon DynamoDB Session Store for Rack applications and let us know what you think, either here, or on GitHub!

Iterating through Amazon DynamoDB Results

by Jeremy Lindblom | in PHP

The AWS SDK for PHP has a feature called "iterators" that allows you to retrieve an entire result set without manually handling pagination tokens or markers. The iterators in the SDK implement PHP’s Iterator interface, which allows you to easily enumerate or iterate through resources from a result set with foreach.

The Amazon DynamoDB client has iterators available for all of the operations that return sets of resources, including Query, Scan, BatchGetItem, and ListTables. Let’s take a look at how we can use the iterators feature with the DynamoDB client in order to iterate through items in a result.

Specifically, let’s look at an example of how to create and use a Scan iterator. First, let’s create a client object to use throughout the rest of the example code.

<?php

require 'vendor/autoload.php';

use Aws\DynamoDb\DynamoDbClient;

$client = DynamoDbClient::factory(array(
    'key'    => '[aws access key]',
    'secret' => '[aws secret key]',
    'region' => '[aws region]' // (e.g., us-west-2)
));

Next, we’ll create a normal Scan operation without an iterator. A DynamoDB Scan operation is used to do a full table scan on a DynamoDB table. We want to iterate through all the items in the table, so we will just provide the TableName as a parameter to the operation without a ScanFilter.

$result = $client->scan(array(
    'TableName' => 'TheNameOfYourTable',
));

foreach ($result['Items'] as $item) {
    // Do something with the $item
}

The $result variable will contain a Guzzle\Service\Resource\Model object, which is an array-like object structured according to the description in the API documentation for the scan method. However, DynamoDB will only return up to 1 MB of results per Scan operation, so if your table is larger than 1 MB and you want to retrieve the entire result set, you will need to perform subsequent Scan operations that include the ExclusiveStartKey parameter. The following example shows how to do this:

$startKey = array();

do {
    $args = array('TableName' => 'TheNameOfYourTable') + $startKey;
    $result = $client->scan($args);

    foreach ($result['Items'] as $item) {
        // Do something with the $item
    }

    $startKey['ExclusiveStartKey'] = $result['LastEvaluatedKey'];
} while ($startKey['ExclusiveStartKey']);

Using an iterator to perform the Scan operation makes this much simpler.

$iterator = $client->getScanIterator(array(
    'TableName' => 'TheNameOfYourTable'
));

foreach ($iterator as $item) {
    // Do something with the $item
}

Using the iterator allows you to get the full result set, regardless of how many MB of data there are, and still be able to use a simple syntax to iterate through the results. The actual object returned by getScanIterator(), or any get*Iterator() method, is an instance of the Aws\Common\Iterator\AwsResourceIterator class.
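What the iterator does under the hood can be sketched as a generator (Python, illustration only; `scan_iterator` and `fake_scan` are made-up names standing in for the SDK and the service):

```python
# Sketch of a scan iterator: keep issuing Scan requests, feeding each
# LastEvaluatedKey back in as ExclusiveStartKey, and yield items one at a
# time. Made-up names, for illustration only.

def scan_iterator(scan, params, limit=None):
    start_key = None
    count = 0
    while True:
        request = dict(params)
        if start_key is not None:
            request["ExclusiveStartKey"] = start_key
        page = scan(request)
        for item in page["Items"]:
            if limit is not None and count >= limit:
                return
            yield item
            count += 1
        start_key = page.get("LastEvaluatedKey")
        if start_key is None:
            return   # no more pages

# A fake two-page table standing in for the service call:
pages = {None: {"Items": [1, 2, 3], "LastEvaluatedKey": "page2"},
         "page2": {"Items": [4, 5, 6]}}

def fake_scan(request):
    return pages[request.get("ExclusiveStartKey")]

items = list(scan_iterator(fake_scan, {"TableName": "Contacts"}))
```

The `limit` parameter mirrors the iterator option shown below: iteration stops after the requested number of items even if more pages remain.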

Warning: Doing a full table scan on a large table may consume a lot of provisioned throughput and, depending on the table’s size and throughput settings, can take time to complete. Please be cautious before running the examples from this post on your own tables.

Iterators also allow you to put a limit on the maximum number of items you want to iterate through.

$iterator = $client->getScanIterator(array(
    'TableName' => 'TheNameOfYourTable'
), array(
    'limit' => 20
));

$count = 0;
foreach ($iterator as $item) {
    $count++;
}
echo $count;
#> 20

Now that you know how iterators work, let’s work through another example. Let’s say you have a DynamoDB table named "Contacts" with the following simple schema:

  • Id (Number)
  • FirstName (String)
  • LastName (String)

You can display the full name of each contact with the following code:

$contacts = $client->getScanIterator(array(
    'TableName' => 'Contacts'
));

foreach ($contacts as $contact) {
    $firstName = $contact['FirstName']['S'];
    $lastName = $contact['LastName']['S'];
    echo "{$firstName} {$lastName}\n";
}

Item attribute values in your DynamoDB result are keyed by both the attribute name and attribute type. In many cases, especially when using a loosely typed language like PHP, the type of the item attribute may not be important, and a simple associative array might be more convenient. The SDK (as of version 2.4.1) includes the Aws\DynamoDb\Iterator\ItemIterator class which you can use to decorate a Scan, Query, or BatchGetItem iterator object in order to enumerate the items without the type information.

use Aws\DynamoDb\Iterator\ItemIterator;

$contacts = new ItemIterator($client->getScanIterator(array(
    'TableName' => 'Contacts'
)));

foreach ($contacts as $contact) {
    echo "{$contact['FirstName']} {$contact['LastName']}\n";
}

The ItemIterator also has two more features that can be useful for certain schemas.

  1. If you have attributes of the binary (B) or binary set (BS) type, the ItemIterator will automatically apply base64_decode() to the values for you.
  2. The item will actually be enumerated as a Guzzle\Common\Collection object. A Collection behaves like an array (i.e., it implements the ArrayAccess interface) and has some additional convenience methods. Additionally, it returns null instead of triggering notices for undefined indices. This is useful for working with items, since the NoSQL nature of DynamoDB does not restrict you to following a fixed schema with all of your items.
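The "untyping" step itself is easy to picture (Python sketch, illustration only; `untype_item` is a made-up name):

```python
# Sketch of the untyping an ItemIterator performs: collapse DynamoDB's
# {attribute: {type: value}} wire format into a plain mapping, base64-decoding
# binary (B) and binary-set (BS) values along the way. Illustration only.
import base64

def untype_item(item):
    plain = {}
    for name, typed in item.items():
        (dynamo_type, value), = typed.items()   # e.g. ("S", "Jim")
        if dynamo_type == "B":
            plain[name] = base64.b64decode(value)
        elif dynamo_type == "BS":
            plain[name] = [base64.b64decode(v) for v in value]
        else:
            plain[name] = value
    return plain

contact = {"FirstName": {"S": "Jim"}, "LastName": {"S": "Smith"}, "Id": {"N": "42"}}
plain = untype_item(contact)   # {"FirstName": "Jim", "LastName": "Smith", "Id": "42"}
```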

We hope that using iterators makes working with the AWS SDK for PHP easier and reduces the amount of code you have to write. You can use the ItemIterator class to get even easier access to the data in your Amazon DynamoDB tables.