How to Easily Apply Amazon Cloud Directory Schema Changes with In-Place Schema Upgrades
Amazon Cloud Directory now makes it easier for you to apply schema changes across your directories with in-place schema upgrades. Your directory remains available while Cloud Directory applies backward-compatible schema changes such as the addition of new fields. You can upgrade your schemas without migrating data between directories or changing your application code. You can also view the history of your schema changes in Cloud Directory by using version identifiers, which help you track and audit schema versions across directories. If you have multiple instances of a directory with the same schema, you can view the version history of schema changes to manage your directory fleet and ensure that all directories are running the same schema version.
In this blog post, I demonstrate how to perform an in-place schema upgrade and use schema versions in Cloud Directory. I add attributes to an existing facet and add a new facet to a schema. I then publish the new schema and apply it to running directories, upgrading the schema in place. I also show how to view the version history of a directory schema, which helps me ensure that my directory fleet is running the same version of the schema and has the correct history of schema changes applied to it.
Note: I share Java code examples in this post. I assume that you are familiar with the AWS SDK and can use Java-based code to build a Cloud Directory code example. You can apply the concepts I cover in this post to other programming languages such as Python and Ruby.
Cloud Directory fundamentals
I will start by covering a few Cloud Directory fundamentals. If you are already familiar with the concepts behind Cloud Directory facets, schemas, and schema lifecycles, you can skip to the next section.
Facets: Groups of attributes. You use facets to define object types. For example, you can define a device schema by adding facets such as computers, phones, and tablets. A computer facet can track attributes such as serial number, make, and model. You can then use the facets to create computer objects, phone objects, and tablet objects in the directory to which the schema applies.
Schemas: Collections of facets. Schemas define which types of objects can be created in a directory (such as users, devices, and organizations) and enforce validation of data for each object class. All data within a directory must conform to the applied schema. As a result, the schema definition is essentially a blueprint to construct a directory with an applied schema.
Schema lifecycle: The four distinct states of a schema: Development, Published, Applied, and Deleted. Schemas in the Published and Applied states have version identifiers and cannot be changed. Schemas in the Applied state are used by directories for validation as applications insert or update data. You can change schemas in the Development state as many times as you need to. In-place schema upgrades allow you to apply schema changes to an existing Applied schema in a production directory without the need to export and import the data populated in the directory.
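The lifecycle rules above can be summarized in a small Java sketch. The `SchemaState` enum and its method names are hypothetical and are not part of the AWS SDK; the sketch only encodes which states allow edits and which carry version identifiers, as described in this section.

```java
/** Hypothetical model of the schema lifecycle states; not an AWS SDK type. */
enum SchemaState {
    DEVELOPMENT(true, false),
    PUBLISHED(false, true),
    APPLIED(false, true),
    DELETED(false, false);

    private final boolean editable;
    private final boolean versioned;

    SchemaState(boolean editable, boolean versioned) {
        this.editable = editable;
        this.versioned = versioned;
    }

    /** Only Development schemas can be changed freely. */
    boolean isEditable() {
        return editable;
    }

    /** Published and Applied schemas carry version identifiers. */
    boolean isVersioned() {
        return versioned;
    }
}
```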
How to add attributes to a computer inventory application schema and perform an in-place schema upgrade
To demonstrate how to set up schema versioning and perform an in-place schema upgrade, I will use an example of a computer inventory application that uses Cloud Directory to store relationship data. Let’s say that at my company, AnyCompany, we use this computer inventory application to track all computers we give to our employees for work use. I previously created a ComputerSchema and assigned its version identifier as 1. This schema contains one facet called ComputerInfo that includes attributes such as Model, as shown in the following schema details.
AnyCompany has offices in Seattle, Portland, and San Francisco. I have deployed the computer inventory application for each of these three locations. As shown in the lower left part of the following diagram, ComputerSchema is in the Published state with a version of 1. The Published schema is applied to SeattleDirectory, PortlandDirectory, and SanFranciscoDirectory for AnyCompany’s three locations. Implementing separate directories for different geographic locations when you don’t have any queries that cross location boundaries is a good data partitioning strategy and gives your application better response times with lower latency.
The following code example creates the schema in the Development state by using a JSON file, publishes the schema, and then creates directories for the Seattle, Portland, and San Francisco locations. For this example, I assume the schema has been defined in the JSON file. The createSchema API creates a schema Amazon Resource Name (ARN) with the name defined in the variable SCHEMA_NAME. I can use the putSchemaFromJson API to add specific schema definitions from the JSON file.
The following code example takes the schema that is currently in the Development state and publishes the schema, changing its state to Published.
The following code example creates a directory named SeattleDirectory and applies the published schema. The createDirectory API call creates a directory by using the published schema provided in the API parameters. Note that Cloud Directory stores a version of the schema in the directory in the Applied state. I will use similar code to create PortlandDirectory and SanFranciscoDirectory.
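The order of the calls described above (create, load JSON, publish, create directories) can be sketched as follows. This is a minimal sketch and not the AWS SDK: `DirectorySetupApi` is a hypothetical stand-in interface whose method names mirror the APIs named in this post, `RecordingSetupApi` is an in-memory fake so that the example runs without AWS access, and the returned strings are placeholders rather than real ARNs. In a real application, each call would go to the corresponding operation on the SDK's Cloud Directory client.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the four calls named above; this is not the AWS SDK client. */
interface DirectorySetupApi {
    String createSchema(String name);                             // returns a Development schema ARN
    void putSchemaFromJson(String schemaArn, String document);    // loads the facet definitions
    String publishSchema(String developmentArn, String version);  // returns a Published schema ARN
    String createDirectory(String name, String publishedArn);     // returns a directory ARN
}

/** In-memory fake that only records calls, so the sketch runs without AWS access. */
final class RecordingSetupApi implements DirectorySetupApi {
    final List<String> calls = new ArrayList<>();

    @Override
    public String createSchema(String name) {
        calls.add("createSchema");
        return "dev:" + name; // placeholder, not a real ARN
    }

    @Override
    public void putSchemaFromJson(String schemaArn, String document) {
        calls.add("putSchemaFromJson");
    }

    @Override
    public String publishSchema(String developmentArn, String version) {
        calls.add("publishSchema");
        return developmentArn.replace("dev:", "published:") + "/" + version;
    }

    @Override
    public String createDirectory(String name, String publishedArn) {
        calls.add("createDirectory:" + name);
        return "directory:" + name; // placeholder, not a real ARN
    }
}

final class DirectorySetup {
    static final String SCHEMA_NAME = "ComputerSchema";

    /** Creates and publishes the schema, then creates one directory per location. */
    static List<String> run(DirectorySetupApi api, String schemaJson) {
        String developmentArn = api.createSchema(SCHEMA_NAME);
        api.putSchemaFromJson(developmentArn, schemaJson);
        String publishedArn = api.publishSchema(developmentArn, "1");

        List<String> directoryArns = new ArrayList<>();
        String[] names = {"SeattleDirectory", "PortlandDirectory", "SanFranciscoDirectory"};
        for (String name : names) {
            directoryArns.add(api.createDirectory(name, publishedArn));
        }
        return directoryArns;
    }
}
```

The point of the sketch is the data flow: the Development schema ARN feeds the publish call, and the Published schema ARN feeds every directory creation.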
Revising a schema
Now let’s say my company, AnyCompany, wants to add more information for computers and to track which employees have been assigned a computer for work use. I modify the schema to add two attributes to the ComputerInfo facet: Description and OSVersion (operating system version). I make Description optional because it is not important for me to track this attribute for the computer objects I create. I make OSVersion mandatory because it is critical for me to track it for all computer objects so that I can make changes such as applying security patches or making upgrades. Because I make OSVersion mandatory, I must provide a default value that Cloud Directory will apply to objects that were created before the schema revision, in order to handle backward compatibility. Note that you can replace the value in any object with a different value.
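To make the role of the default value concrete, here is a toy Java model of how an object created before the revision appears to gain a value for a newly mandatory attribute. It is an illustration only and says nothing about how Cloud Directory implements defaults internally; the class name, the map-based object representation, and the sample values are all assumptions.

```java
import java.util.Map;
import java.util.Optional;

/** Toy model of default values for newly added attributes; not Cloud Directory's implementation. */
final class DefaultValueSketch {
    /**
     * Returns the value stored on the object if there is one, otherwise the
     * schema default, otherwise empty (an optional attribute that was never set).
     */
    static Optional<String> readAttribute(Map<String, String> object,
                                          String attributeName,
                                          Map<String, String> schemaDefaults) {
        String stored = object.get(attributeName);
        if (stored != null) {
            return Optional.of(stored);
        }
        return Optional.ofNullable(schemaDefaults.get(attributeName));
    }
}
```

In this model, an older computer object with no OSVersion reads back the default, an optional attribute such as Description reads back as empty, and writing an explicit OSVersion on the object overrides the default.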
I also add a new facet to track computer assignment information, shown in the following updated schema as the ComputerAssignment facet. This facet tracks these additional attributes: Name (the name of the person to whom the computer is assigned), Department, and department CostCenter. Note that Cloud Directory refers to the previously available version identifier as the Major Version. Because I can now add a minor version to a schema, I also denote the changed schema as Minor Version A.
The following diagram shows the changes that were made when I added another facet to the schema and attributes to the existing facet. The highlighted area of the diagram (bottom left) shows that the schema changes were published.
The following code example revises the existing Development schema by adding the new attributes to the ComputerInfo facet and by adding the ComputerAssignment facet. I use a new JSON file for the schema revision, and for the purposes of this example, I am assuming the JSON file has the full schema including planned revisions.
Upgrading the Published schema
The following code example performs an in-place schema upgrade of the Published schema with schema revisions (it adds new attributes to the existing facet and another facet to the schema). The upgradePublishedSchema API upgrades the Published schema with backward-compatible changes from the Development schema.
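As a rough sketch of this step, the following Java code wraps the upgrade in a validate-then-apply pattern. `PublishedSchemaUpgrader` is a hypothetical stand-in interface, not the AWS SDK client. The `dryRun` parameter reflects the optional dry-run flag that the service operation accepts for checking compatibility without applying the change; running a dry run first is a design choice of this sketch, not a step from the original walkthrough.

```java
/** Hypothetical stand-in for the upgradePublishedSchema call; not the AWS SDK client. */
interface PublishedSchemaUpgrader {
    /** Returns the upgraded Published schema ARN; expected to throw if the change is incompatible. */
    String upgradePublishedSchema(String developmentArn, String publishedArn,
                                  String minorVersion, boolean dryRun);
}

final class PublishedUpgradeSketch {
    /** Checks compatibility with a dry run, then performs the real upgrade. */
    static String upgrade(PublishedSchemaUpgrader api, String developmentArn,
                          String publishedArn, String minorVersion) {
        // An incompatible change surfaces here as an exception, before anything is modified.
        api.upgradePublishedSchema(developmentArn, publishedArn, minorVersion, true);
        return api.upgradePublishedSchema(developmentArn, publishedArn, minorVersion, false);
    }
}
```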
Upgrading the Applied schema
The following diagram shows the in-place schema upgrade for the SeattleDirectory directory. I am performing the schema upgrade so that I can reflect the new schemas in all three directories. As a reminder, I added new attributes to the ComputerInfo facet and also added the ComputerAssignment facet. After the schema and directory upgrade, I can create objects for the ComputerInfo and ComputerAssignment facets in the SeattleDirectory. Any objects that were created with the old facet definition for ComputerInfo will now use the default values for any additional attributes defined in the new schema.
I use the following code example to perform an in-place upgrade of the SeattleDirectory to a Major Version of 1 and a Minor Version of A. Note that you should change a Major Version identifier in a schema to make backward-incompatible changes such as changing the data type of an existing attribute or dropping a mandatory attribute from your schema. Backward-incompatible changes require directory data migration from a previous version to the new version. You should change a Minor Version identifier in a schema to make backward-compatible upgrades such as adding additional attributes or adding facets, which in turn may contain one or more attributes. The upgradeAppliedSchema API lets me upgrade an existing directory with a different version of a schema.
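The rule of thumb for choosing between a Major Version and a Minor Version change can be written down as a small decision helper. The `SchemaChange` enum below is hypothetical and lists only the four kinds of change mentioned in this post; it is a sketch of the guidance, not a check that Cloud Directory exposes.

```java
import java.util.Collection;

/** Hypothetical classification of the schema changes discussed in this post. */
enum SchemaChange {
    ADD_ATTRIBUTE(true),
    ADD_FACET(true),
    CHANGE_ATTRIBUTE_TYPE(false),
    DROP_MANDATORY_ATTRIBUTE(false);

    private final boolean backwardCompatible;

    SchemaChange(boolean backwardCompatible) {
        this.backwardCompatible = backwardCompatible;
    }

    boolean isBackwardCompatible() {
        return backwardCompatible;
    }

    /** Returns "minor" when every change is backward compatible, otherwise "major". */
    static String versionToChange(Collection<SchemaChange> changes) {
        for (SchemaChange change : changes) {
            if (!change.isBackwardCompatible()) {
                return "major";
            }
        }
        return "minor";
    }
}
```

For the revision in this post (new attributes plus a new facet), the helper reports that only the Minor Version needs to change.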
Note: Cloud Directory does not return the Minor Version identifier in the Applied schema ARN. This preserves backward compatibility and enables the application to work across older and newer versions of the directory.
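Because an Applied schema ARN may or may not end with a Minor Version, application code that inspects ARNs should treat the minor part as optional. The following Java sketch parses both forms. It assumes that an Applied schema ARN ends with `/schema/<name>/<major>` and optionally `/<minor>`; that layout, the class name, and the sample ARNs in the usage note are assumptions made for illustration.

```java
import java.util.Optional;

/** Parses the schema name and version parts from the tail of an Applied schema ARN. */
final class AppliedSchemaVersion {
    final String schemaName;
    final String majorVersion;
    final Optional<String> minorVersion;

    private AppliedSchemaVersion(String schemaName, String majorVersion, Optional<String> minorVersion) {
        this.schemaName = schemaName;
        this.majorVersion = majorVersion;
        this.minorVersion = minorVersion;
    }

    /** Assumes the ARN ends with /schema/NAME/MAJOR or /schema/NAME/MAJOR/MINOR. */
    static AppliedSchemaVersion parse(String appliedSchemaArn) {
        String marker = "/schema/";
        int start = appliedSchemaArn.indexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("Not an applied schema ARN: " + appliedSchemaArn);
        }
        String[] parts = appliedSchemaArn.substring(start + marker.length()).split("/");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Unexpected schema path in: " + appliedSchemaArn);
        }
        Optional<String> minor = parts.length == 3 ? Optional.of(parts[2]) : Optional.empty();
        return new AppliedSchemaVersion(parts[0], parts[1], minor);
    }
}
```

For example, parsing a made-up ARN such as `arn:aws:clouddirectory:us-west-2:111122223333:directory/EXAMPLE/schema/ComputerSchema/1` yields an empty minor version, while the same ARN with `/A` appended yields a minor version of `A`.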
The following diagram shows the changes that are made when I perform an in-place schema upgrade in the two remaining directories, PortlandDirectory and SanFranciscoDirectory. I make these calls sequentially, upgrading PortlandDirectory first and then upgrading SanFranciscoDirectory. I use the same code example that I used earlier to upgrade SeattleDirectory. Now, all my directories are running the most current version of the schema. Also, I made these schema changes without having to migrate data and while maintaining my application’s high availability.
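The sequential rollout across directories can be sketched as a simple loop. `AppliedSchemaUpgrader` is a hypothetical stand-in for the upgradeAppliedSchema call and is not the AWS SDK client; the directory and schema ARNs are whatever strings the caller passes in.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the upgradeAppliedSchema call; not the AWS SDK client. */
interface AppliedSchemaUpgrader {
    /** Returns the ARN of the upgraded Applied schema for the given directory. */
    String upgradeAppliedSchema(String directoryArn, String publishedSchemaArn);
}

final class FleetUpgradeSketch {
    /**
     * Upgrades each directory in the given order. An exception from any call
     * propagates immediately, so later directories are left untouched.
     */
    static List<String> upgradeAll(AppliedSchemaUpgrader api,
                                   List<String> directoryArns,
                                   String publishedSchemaArn) {
        List<String> upgradedSchemaArns = new ArrayList<>();
        for (String directoryArn : directoryArns) {
            upgradedSchemaArns.add(api.upgradeAppliedSchema(directoryArn, publishedSchemaArn));
        }
        return upgradedSchemaArns;
    }
}
```

Upgrading one directory at a time keeps a failed upgrade from touching the rest of the fleet, at the cost of a slower rollout.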
Schema revision history
I can now view the schema revision history for any of AnyCompany’s directories by using the listAppliedSchemaArns API. Cloud Directory maintains the five most recent versions of applied schema changes. Similarly, to inspect the current Minor Version that was applied to my schema, I use the getAppliedSchemaVersion API. The listAppliedSchemaArns API returns the schema ARNs based on the schema filter that I provide in the request.
I use the following code example to query an Applied schema for its version history. The listAppliedSchemaArns API returns the two ARNs as shown in the following output.
The following code example queries an Applied schema for its current Minor Version by using the getAppliedSchemaVersion API. The getAppliedSchemaVersion API returns the current Applied schema ARN with a Minor Version, as shown in the following output.
If you have a lot of directories, schema revision API calls can help you audit your directory fleet and ensure that all directories are running the same version of a schema. Such auditing can help you ensure high integrity of directories across your fleet.
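A fleet audit of this kind can be sketched in a few lines of Java. The sketch assumes that I have already collected one version string per directory (for example, the major and minor version taken from each directory's Applied schema ARN); the `FleetAudit` class and the `1/A` version format are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Compares the schema version reported by each directory in a fleet. */
final class FleetAudit {
    /** True when every directory reports the same schema version string. */
    static boolean isConsistent(Map<String, String> versionByDirectory) {
        return new HashSet<>(versionByDirectory.values()).size() <= 1;
    }

    /** Groups directory names by reported version so that outliers are easy to spot. */
    static Map<String, List<String>> groupByVersion(Map<String, String> versionByDirectory) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> entry : versionByDirectory.entrySet()) {
            groups.computeIfAbsent(entry.getValue(), version -> new ArrayList<>())
                  .add(entry.getKey());
        }
        return groups;
    }
}
```

If one directory still reports `1` while the others report `1/A`, `isConsistent` returns false and `groupByVersion` shows which directory was missed.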
You can use in-place schema upgrades to make changes to your directory schema as you evolve your data set to match the needs of your application. An in-place schema upgrade allows you to maintain high availability for your directory and applications while the upgrade takes place. For more information about in-place schema upgrades, see the in-place schema upgrade documentation.