Write and Read Multiple Objects in Amazon Cloud Directory by Using Batch Operations

Amazon Cloud Directory is a hierarchical data store that enables you to build flexible, cloud-native directories for organizing hierarchies of data along multiple dimensions. For example, you can create an organizational structure that you can navigate through multiple hierarchies for reporting structure, location, and cost center.

In this blog post, I demonstrate how you can use Cloud Directory APIs to write and read multiple objects by using batch operations. With batch write operations, you can execute a sequence of operations atomically—meaning that all of the write operations must occur, or none of them do. You also can make your application efficient by reducing the number of required round trips to read and write objects to your directory. I have used the AWS SDK for Java for all the sample code in this blog post, but you can use other language SDKs or the AWS CLI in a similar way.

Using batch write operations

To demonstrate batch write operations, let’s say that AnyCompany’s warehouses are organized to determine the fastest methods to ship orders to its customers. In North America, AnyCompany plans to open new warehouses regularly so that the company can keep up with customer demand while continuing to meet the delivery times to which they are committed.

The following diagram shows part of AnyCompany’s global network, including Asian and European warehouse networks.

Let’s take a look at how I can use batch write operations to add NorthAmerica to AnyCompany’s global network of warehouses, with the first three warehouses in New York City (NYC), Las Vegas (LAS), and Phoenix (PHX).

Adding NorthAmerica to the global network

To add NorthAmerica to the global network, I can use a batch write operation to create and link all the objects in the existing network.

First, I set up a helper method, which performs repetitive tasks, for the getBatchCreateOperation object. The following lines of code help me create an NA object for NorthAmerica and then attach the three city-related nodes: NYC, LAS, and PHX. Because AnyCompany is planning to grow its network, I add a suffix of _1 to each city code (such as PHX_1), which will be helpful hierarchically when the company adds more warehouses within a city.

    private BatchWriteOperation getBatchCreateOperation(
            String warehouseName,
            String directorySchemaARN,
            String parentReference,
            String linkName) {

        SchemaFacet warehouse_facet = new SchemaFacet()
            .withFacetName("warehouse")
            .withSchemaArn(directorySchemaARN);

        AttributeKeyAndValue kv = new AttributeKeyAndValue()
            .withKey(new AttributeKey()
                .withFacetName("warehouse")
                .withName("name")
                .withSchemaArn(directorySchemaARN))
            .withValue(new TypedAttributeValue()
                .withStringValue(warehouseName);

        List<SchemaFacet> facets = Lists.newArrayList(warehouse_facet);
        List<AttributeKeyAndValue> kvs = Lists.newArrayList(kv);

        BatchCreateObject createObject = new BatchCreateObject();

        createObject.withParentReference(new ObjectReference()
            .withSelector(parentReference));
        createObject.withLinkName(linkName);

        createObject.withBatchReferenceName(UUID.randomUUID().toString());
        createObject.withSchemaFacet(facets);
        createObject.withObjectAttributeList(kvs);

        return new BatchWriteOperation().withCreateObject
                                       (createObject);
    }

The parameters of this helper method include:

warehouseName – The name of the warehouse to create in the getBatchCreateOperation object.
directorySchemaARN – The Amazon Resource Name (ARN) of the schema applied to the directory.
parentReference – The object reference of the parent object.
linkName – The unique child path from the parent reference where the object should be attached.

I then use this helper method to set up multiple create operations for NorthAmerica, NewYork, Phoenix, and LasVegas. For the sake of simplicity, I use airport codes to stand for the cities (for example, NYC stands for NewYork).

   BatchWriteOperation createObjectNA = getBatchCreateOperation(
                      "NA",
                      directorySchemaARN,
                      "/",
                      "NorthAmerica");
   BatchWriteOperation createObjectNYC = getBatchCreateOperation(
                      "NYC_1",
                      directorySchemaARN,
                      "/NorthAmerica",
                      "NewYork");
   BatchWriteOperation createObjectPHX = getBatchCreateOperation(
                       "PHX_1",
                       directorySchemaARN,
                       "/NorthAmerica",
                       "Phoenix");
   BatchWriteOperation createObjectLAS = getBatchCreateOperation(
                      "LAS_1",
                      directorySchemaARN,
                      "/NorthAmerica",
                      "LasVegas");

   BatchWriteRequest request = new BatchWriteRequest();
   request.setDirectoryArn(directoryARN);
   request.setOperations(Lists.newArrayList(
       createObjectNA,
       createObjectNYC,
       createObjectPHX,
       createObjectLAS));

   client.batchWrite(request);

Running the preceding code results in a hierarchy for the network with NA added to the network, as shown in the following diagram.

Using batch read operations

Now, let’s say that after I add NorthAmerica to AnyCompany’s global network, an analyst wants to see the updated view of the NorthAmerica warehouse network as well as some information about the newly introduced warehouse configurations for the Phoenix warehouses. To do this, I can use batch read operations to get the network of warehouses for NorthAmerica as well as specifically request the attributes and configurations of the Phoenix warehouses.

To list the children of the NorthAmerica warehouses, I use the BatchListObjectChildren API to get all the children at the path, /NorthAmerica. Next, I want to view the attributes of the Phoenix object, so I use the BatchListObjectAttributes API to read all the attributes of the object at /NorthAmerica/Phoenix, as shown in the following code example.

    BatchListObjectChildren listObjectChildrenRequest = new BatchListObjectChildren()
        .withObjectReference(new ObjectReference().withSelector("/NorthAmerica"));
    BatchListObjectAttributes listObjectAttributesRequest = new BatchListObjectAttributes()
        .withObjectReference(new ObjectReference()
            .withSelector("/NorthAmerica/Phoenix"));
    BatchReadRequest batchRead = new BatchReadRequest()
        .withConsistencyLevel(ConsistencyLevel.EVENTUAL)
        .withDirectoryArn(directoryArn)
        .withOperations(Lists.newArrayList(listObjectChildrenRequest, listObjectAttributesRequest));

    BatchReadResult result = client.batchRead(batchRead);

Exception handling

Batch operations in Cloud Directory might sometimes fail, and it is important to know how to handle such failures, which differ for write operations and read operations.

Batch write operation failures

If a batch write operation fails, Cloud Directory fails the entire batch operation and returns an exception. The exception contains the index of the operation that failed along with the exception type and message. If you see RetryableConflictException, you can try again with exponential backoff. A simple way to do this is to double the amount of time you wait each time you get an exception or failure. For example, if your first batch write operation fails, wait 100 milliseconds and try the request again. If the second request fails, wait 200 milliseconds and try again. If the third request fails, wait 400 milliseconds and try again.

Batch read operation failures

If a batch read operation fails, the response contains either a successful response or an exception response. Individual batch read operation failures do not cause the entire batch read operation to fail—Cloud Directory returns individual success or failure responses for each operation.

Limits of batch operations

Batch operations are still constrained by the same Cloud Directory limits as other Cloud Directory APIs. A single batch operation does not limit the number of operations, but the total number of nodes or objects being written or edited in a single batch operation have enforced limits. For example, a total of 20 objects can be written in a single batch operation request to Cloud Directory, regardless of how many individual operations there are within that batch. Similarly, a total of 200 objects can be read in a single batch operation request to Cloud Directory. For more information, see limits on batch operations.

Summary

In this post, I have demonstrated how you can use batch operations to operate on multiple objects and simplify making complicated changes across hierarchies. In my next post, I will demonstrate how to use batch references within batch write operations. To learn more about batch operations, see Batches, BatchWrite, and BatchRead.

If you have comments about this post, submit them in the “Comments” section below. If you have implementation questions, start a new thread on the Directory Service forum.

– Vineeth