AWS Security Blog
How to Easily Apply Amazon Cloud Directory Schema Changes with In-Place Schema Upgrades
Now, Amazon Cloud Directory makes it easier for you to apply schema changes across your directories with in-place schema upgrades. Your directory now remains available while Cloud Directory applies backward-compatible schema changes such as the addition of new fields. Without migrating data between directories or applying code changes to your applications, you can upgrade your schemas. You also can view the history of your schema changes in Cloud Directory by using version identifiers, which help you track and audit schema versions across directories. If you have multiple instances of a directory with the same schema, you can view the version history of schema changes to manage your directory fleet and ensure that all directories are running with the same schema version.
In this blog post, I demonstrate how to perform an in-place schema upgrade and use schema versions in Cloud Directory. I add attributes to an existing facet and add a new facet to a schema. I then publish the new schema and apply it to running directories, upgrading the schema in place. I also show how to view the version history of a directory schema, which helps me ensure that my directory fleet is running the same version of the schema and has the correct history of schema changes applied to it.
Note: I share Java code examples in this post. I assume that you are familiar with the AWS SDK and can use Java-based code to build a Cloud Directory code example. You can apply the concepts I cover in this post to other programming languages such as Python and Ruby.
Cloud Directory fundamentals
I will start by covering a few Cloud Directory fundamentals. If you are already familiar with the concepts behind Cloud Directory facets, schemas, and schema lifecycles, you can skip to the next section.
Facets: Groups of attributes. You use facets to define object types. For example, you can define a device schema by adding facets such as computers, phones, and tablets. A computer facet can track attributes such as serial number, make, and model. You can then use the facets to create computer objects, phone objects, and tablet objects in the directory to which the schema applies.
Schemas: Collections of facets. Schemas define which types of objects can be created in a directory (such as users, devices, and organizations) and enforce validation of data for each object class. All data within a directory must conform to the applied schema. As a result, the schema definition is essentially a blueprint to construct a directory with an applied schema.
Schema lifecycle: The four distinct states of a schema: Development, Published, Applied, and Deleted. Schemas in the Published and Applied states have version identifiers and cannot be changed. Schemas in the Applied state are used by directories for validation as applications insert or update data. You can change schemas in the Development state as many times as you need to. In-place schema upgrades allow you to apply schema changes to an existing Applied schema in a production directory without the need to export and import the data populated in the directory.
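To make the lifecycle rules concrete, here is a minimal sketch that encodes them as a Java enum. The enum and its method names are hypothetical and are not part of the AWS SDK. The flags for the Deleted state are an assumption, because the description above does not say more about that state.

```java
// Sketch of the four schema lifecycle states described above.
// This type is illustrative only; it is not an AWS SDK class.
enum SchemaLifecycleState {
    DEVELOPMENT(true, false),  // can be changed as often as needed
    PUBLISHED(false, true),    // versioned and immutable
    APPLIED(false, true),      // versioned, immutable, used for validation
    DELETED(false, false);     // assumption: neither editable nor versioned

    private final boolean editable;
    private final boolean versioned;

    SchemaLifecycleState(boolean editable, boolean versioned) {
        this.editable = editable;
        this.versioned = versioned;
    }

    /** True only for Development schemas. */
    boolean isEditable() {
        return editable;
    }

    /** True for Published and Applied schemas, which carry version identifiers. */
    boolean hasVersionIdentifier() {
        return versioned;
    }
}
```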
How to add attributes to a computer inventory application schema and perform an in-place schema upgrade
To demonstrate how to set up schema versioning and perform an in-place schema upgrade, I will use an example of a computer inventory application that uses Cloud Directory to store relationship data. Let’s say that at my company, AnyCompany, we use this computer inventory application to track all computers we give to our employees for work use. I previously created a ComputerSchema and assigned its version identifier as 1. This schema contains one facet called ComputerInfo that includes attributes for SerialNumber, Make, and Model, as shown in the following schema details.
AnyCompany has offices in Seattle, Portland, and San Francisco. I have deployed the computer inventory application for each of these three locations. As shown in the lower left part of the following diagram, ComputerSchema is in the Published state with a version of 1. The Published schema is applied to SeattleDirectory, PortlandDirectory, and SanFranciscoDirectory for AnyCompany’s three locations. Implementing separate directories for different geographic locations when you don’t have any queries that cross location boundaries is a good data partitioning strategy and gives your application better response times with lower latency.
The following code example creates the schema in the Development state by using a JSON file, publishes the schema, and then creates directories for the Seattle, Portland, and San Francisco locations. For this example, I assume the schema has been defined in the JSON file. The createSchema API creates a schema Amazon Resource Name (ARN) with the name defined in the variable, SCHEMA_NAME. I can use the putSchemaFromJson API to add specific schema definitions from the JSON file.
The following code example takes the schema that is currently in the Development state and publishes the schema, changing its state to Published.
The following code example creates a directory named SeattleDirectory and applies the published schema. The createDirectory API call creates a directory by using the published schema provided in the API parameters. Note that Cloud Directory stores a version of the schema in the directory in the Applied state. I will use similar code to create directories for PortlandDirectory and SanFranciscoDirectory.
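The three steps above (create and populate the schema, publish it, create the directories) can be sketched end to end as follows. This is a self-contained illustration rather than SDK code: InMemorySchemaService is a hypothetical in-memory stand-in for the Cloud Directory client so that the sketch runs without AWS credentials. Its method names mirror the APIs named above, while the account ID, Region, ARN layout, and the placeholder JSON string are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the Cloud Directory client.
class InMemorySchemaService {
    // Placeholder Region and account ID; not real values.
    private static final String PREFIX = "arn:aws:clouddirectory:us-west-2:111122223333:";
    private static final String PUBLISHED = "schema/published/";

    private final Map<String, String> developmentSchemas = new HashMap<>(); // ARN -> JSON
    private final Map<String, String> publishedSchemas = new HashMap<>();   // ARN -> JSON

    /** Creates an empty schema in the Development state and returns its ARN. */
    String createSchema(String name) {
        String arn = PREFIX + "schema/development/" + name;
        developmentSchemas.put(arn, "{}");
        return arn;
    }

    /** Loads a JSON definition into an existing Development schema. */
    void putSchemaFromJson(String developmentArn, String json) {
        if (!developmentSchemas.containsKey(developmentArn)) {
            throw new IllegalArgumentException("Unknown development schema: " + developmentArn);
        }
        developmentSchemas.put(developmentArn, json);
    }

    /** Copies a Development schema to the Published state under a major version. */
    String publishSchema(String developmentArn, String majorVersion) {
        String json = developmentSchemas.get(developmentArn);
        if (json == null) {
            throw new IllegalArgumentException("Unknown development schema: " + developmentArn);
        }
        String name = developmentArn.substring(developmentArn.lastIndexOf('/') + 1);
        String arn = PREFIX + PUBLISHED + name + "/" + majorVersion;
        publishedSchemas.put(arn, json);
        return arn;
    }

    /** Creates a directory and returns the ARN of its Applied schema copy. */
    String createDirectory(String directoryName, String publishedArn) {
        if (!publishedSchemas.containsKey(publishedArn)) {
            throw new IllegalArgumentException("Unknown published schema: " + publishedArn);
        }
        String nameAndVersion =
                publishedArn.substring(publishedArn.indexOf(PUBLISHED) + PUBLISHED.length());
        // A real directory ARN contains a generated ID; the name is used here for readability.
        return PREFIX + "directory/" + directoryName + "/schema/" + nameAndVersion;
    }
}

class CreateDirectoriesExample {
    /** Runs the full sequence and returns directory name -> applied schema ARN. */
    static Map<String, String> run() {
        InMemorySchemaService service = new InMemorySchemaService();
        String developmentArn = service.createSchema("ComputerSchema");
        // Placeholder document; the real schema JSON file is not reproduced here.
        service.putSchemaFromJson(developmentArn, "{\"note\": \"schema JSON goes here\"}");
        String publishedArn = service.publishSchema(developmentArn, "1");

        Map<String, String> applied = new LinkedHashMap<>();
        String[] names = {"SeattleDirectory", "PortlandDirectory", "SanFranciscoDirectory"};
        for (String name : names) {
            applied.put(name, service.createDirectory(name, publishedArn));
        }
        return applied;
    }
}
```

With the real SDK, each method call in run() would correspond to a request to the API of the same name.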
Revising a schema
Now let’s say my company, AnyCompany, wants to add more information for computers and to track which employees have been assigned a computer for work use. I modify the schema to add two attributes to the ComputerInfo facet: Description and OSVersion (operating system version). I make Description optional because it is not important for me to track this attribute for the computer objects I create. I make OSVersion mandatory because it is critical for me to track it for all computer objects so that I can make changes such as applying security patches or making upgrades. Because I make OSVersion mandatory, I must provide a default value that Cloud Directory will apply to objects that were created before the schema revision, in order to handle backward compatibility. Note that you can replace the value in any object with a different value.
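The effect of a default value on older objects can be illustrated with a small sketch. It models the behavior described above in plain Java maps and does not call Cloud Directory. The class name, the default value "Unknown", and the sample attribute values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of default-value fallback; not an AWS SDK class.
class AttributeDefaultsSketch {
    /**
     * Returns the value stored on the object, or the schema default when the
     * object has no value for the attribute. Returns null if neither exists.
     */
    static String resolve(Map<String, String> objectAttributes,
                          Map<String, String> schemaDefaults,
                          String attributeName) {
        String stored = objectAttributes.get(attributeName);
        return stored != null ? stored : schemaDefaults.get(attributeName);
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("OSVersion", "Unknown"); // hypothetical default value

        Map<String, String> oldComputer = new HashMap<>();
        oldComputer.put("SerialNumber", "SN-0001"); // created before the revision

        Map<String, String> newComputer = new HashMap<>(oldComputer);
        newComputer.put("OSVersion", "10.13"); // explicit value replaces the default

        System.out.println(resolve(oldComputer, defaults, "OSVersion")); // prints "Unknown"
        System.out.println(resolve(newComputer, defaults, "OSVersion")); // prints "10.13"
    }
}
```

An optional attribute such as Description has no default in this model, so an object that never set it simply has no value.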
I also add a new facet to track computer assignment information, shown in the following updated schema as the ComputerAssignment facet. This facet tracks these additional attributes: Name (the name of the person to whom the computer is assigned), EMail (the email address of the assignee), Department, and department CostCenter. Note that Cloud Directory refers to the previously available version identifier as the Major Version. Because I can now add a minor version to a schema, I also denote the changed schema as Minor Version A.
The following diagram shows the changes that were made when I added another facet to the schema and attributes to the existing facet. The highlighted area of the diagram (bottom left) shows that the schema changes were published.
The following code example revises the existing Development schema by adding the new attributes to the ComputerInfo facet and by adding the ComputerAssignment facet. I use a new JSON file for the schema revision. For the purposes of this example, I assume that the JSON file contains the full schema, including the planned revisions.
Upgrading the Published schema
The following code example performs an in-place schema upgrade of the Published schema with schema revisions (it adds new attributes to the existing facet and another facet to the schema). The upgradePublishedSchema API upgrades the Published schema with backward-compatible changes from the Development schema.
Upgrading the Applied schema
The following diagram shows the in-place schema upgrade for the SeattleDirectory directory. I am performing the schema upgrade so that I can reflect the new schemas in all three directories. As a reminder, I added new attributes to the ComputerInfo facet and also added the ComputerAssignment facet. After the schema and directory upgrade, I can create objects for the ComputerInfo and ComputerAssignment facets in the SeattleDirectory. Any objects that were created with the old facet definition for ComputerInfo will now use the default values for any additional attributes defined in the new schema.
I use the following code example to perform an in-place upgrade of the SeattleDirectory to a Major Version of 1 and a Minor Version of A. Note that you should change a Major Version identifier in a schema to make backward-incompatible changes such as changing the data type of an existing attribute or dropping a mandatory attribute from your schema. Backward-incompatible changes require directory data migration from a previous version to the new version. You should change a Minor Version identifier in a schema to make backward-compatible upgrades such as adding additional attributes or adding facets, which in turn may contain one or more attributes. The upgradeAppliedSchema API lets me upgrade an existing directory with a different version of a schema.
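The Major Version versus Minor Version rule can be expressed as a simple check over two schema revisions. The following sketch is a simplified model and does not reproduce Cloud Directory's own validation. The class, the Attr fields, the attribute types, and the required flags for version 1 are assumptions, and removing any attribute is treated conservatively as a Major Version change.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the compatibility rule; not an AWS SDK class.
class SchemaChangeClassifier {
    enum Bump { MINOR, MAJOR }

    /** Minimal attribute description; these field names belong to this sketch only. */
    static final class Attr {
        final String type;
        final boolean required;
        final String defaultValue; // null when no default is defined

        Attr(String type, boolean required, String defaultValue) {
            this.type = type;
            this.required = required;
            this.defaultValue = defaultValue;
        }
    }

    /** Keys are "Facet.Attribute" names, for example "ComputerInfo.OSVersion". */
    static Bump classify(Map<String, Attr> before, Map<String, Attr> after) {
        for (Map.Entry<String, Attr> entry : before.entrySet()) {
            Attr updated = after.get(entry.getKey());
            if (updated == null) {
                return Bump.MAJOR; // attribute removed (treated conservatively)
            }
            if (!updated.type.equals(entry.getValue().type)) {
                return Bump.MAJOR; // data type of an existing attribute changed
            }
        }
        for (Map.Entry<String, Attr> entry : after.entrySet()) {
            Attr added = entry.getValue();
            boolean isNew = !before.containsKey(entry.getKey());
            if (isNew && added.required && added.defaultValue == null) {
                return Bump.MAJOR; // existing objects would have no value
            }
        }
        return Bump.MINOR; // only backward-compatible additions
    }

    /** Version 1 of the example schema; types and required flags are assumed. */
    static Map<String, Attr> versionOne() {
        Map<String, Attr> attrs = new HashMap<>();
        attrs.put("ComputerInfo.SerialNumber", new Attr("STRING", true, null));
        attrs.put("ComputerInfo.Make", new Attr("STRING", false, null));
        attrs.put("ComputerInfo.Model", new Attr("STRING", false, null));
        return attrs;
    }

    /** The revision from this post: two new attributes plus a new facet. */
    static Map<String, Attr> revised() {
        Map<String, Attr> attrs = versionOne();
        attrs.put("ComputerInfo.Description", new Attr("STRING", false, null));
        attrs.put("ComputerInfo.OSVersion", new Attr("STRING", true, "Unknown"));
        attrs.put("ComputerAssignment.Name", new Attr("STRING", false, null));
        attrs.put("ComputerAssignment.EMail", new Attr("STRING", false, null));
        attrs.put("ComputerAssignment.Department", new Attr("STRING", false, null));
        attrs.put("ComputerAssignment.CostCenter", new Attr("STRING", false, null));
        return attrs;
    }
}
```

Under these assumptions, classify(versionOne(), revised()) returns MINOR, which matches the Minor Version A upgrade used in this post.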
Note: Cloud Directory does not return the Minor Version identifier in the Applied schema ARN. This preserves backward compatibility and enables the application to work across older and newer versions of the directory.
The following diagram shows the changes that are made when I perform an in-place schema upgrade in the two remaining directories, PortlandDirectory and SanFranciscoDirectory. I make these calls sequentially, upgrading PortlandDirectory first and then upgrading SanFranciscoDirectory. I use the same code example that I used earlier to upgrade SeattleDirectory. Now, all my directories are running the most current version of the schema. Also, I made these schema changes without having to migrate data and while maintaining my application’s high availability.
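The overall upgrade order (published schema first, then each directory in turn) can be sketched as follows. FleetUpgradeSketch is a hypothetical in-memory model, not SDK code. Its two methods stand in for the upgradePublishedSchema and upgradeAppliedSchema APIs, and versions are tracked as plain strings such as "1" and "1/A" purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the upgrade sequence; not an AWS SDK class.
class FleetUpgradeSketch {
    // directory name -> applied version, written as "major" or "major/minor"
    private final Map<String, String> appliedVersions = new LinkedHashMap<>();
    private String publishedVersion;

    FleetUpgradeSketch(String publishedVersion, List<String> directoryNames) {
        this.publishedVersion = publishedVersion;
        for (String name : directoryNames) {
            appliedVersions.put(name, publishedVersion);
        }
    }

    /** Stand-in for upgradePublishedSchema: attaches a minor version to the published schema. */
    String upgradePublishedSchema(String minorVersion) {
        String major = publishedVersion.split("/")[0];
        publishedVersion = major + "/" + minorVersion;
        return publishedVersion;
    }

    /** Stand-in for upgradeAppliedSchema: moves one directory to the published version. */
    void upgradeAppliedSchema(String directoryName) {
        if (!appliedVersions.containsKey(directoryName)) {
            throw new IllegalArgumentException("Unknown directory: " + directoryName);
        }
        appliedVersions.put(directoryName, publishedVersion);
    }

    Map<String, String> appliedVersions() {
        return appliedVersions;
    }

    /** Upgrades the published schema, then each directory sequentially. */
    static Map<String, String> run() {
        List<String> directories =
                Arrays.asList("SeattleDirectory", "PortlandDirectory", "SanFranciscoDirectory");
        FleetUpgradeSketch fleet = new FleetUpgradeSketch("1", directories);
        fleet.upgradePublishedSchema("A");
        for (String directory : directories) {
            fleet.upgradeAppliedSchema(directory); // one directory at a time
        }
        return fleet.appliedVersions();
    }
}
```

Directories that have not yet been upgraded keep validating against version 1 in this model, which mirrors how the three directories in this post are upgraded one after another.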
Schema revision history
I can now view the schema revision history for any of AnyCompany’s directories by using the listAppliedSchemaArns API. Cloud Directory maintains the five most recent versions of applied schema changes. Similarly, to inspect the current Minor Version that was applied to my schema, I use the getAppliedSchemaVersion API. The listAppliedSchemaArns API returns the schema ARNs based on my schema filter as defined in withSchemaArn.
I use the following code example to query an Applied schema for its version history.
The listAppliedSchemaArns API returns the two ARNs as shown in the following output.
The following code example queries an Applied schema for current Minor Version by using the getAppliedSchemaVersion API.
The getAppliedSchemaVersion API returns the current Applied schema ARN with a Minor Version, as shown in the following output.
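Once I have an Applied schema ARN, a small helper can pull out the schema name and version identifiers. The following sketch assumes that the ARN ends with a segment of the form /schema/SchemaName/Major or /schema/SchemaName/Major/Minor. That layout, the class name, and the sample ARNs are assumptions for illustration, so check them against the ARNs your own directories return.

```java
// Illustrative ARN helper; the assumed layout is ".../schema/<name>/<major>[/<minor>]".
class AppliedSchemaVersion {
    final String schemaName;
    final String majorVersion;
    final String minorVersion; // null when the ARN carries no minor version

    private AppliedSchemaVersion(String schemaName, String majorVersion, String minorVersion) {
        this.schemaName = schemaName;
        this.majorVersion = majorVersion;
        this.minorVersion = minorVersion;
    }

    /** Parses the trailing schema segment of an applied schema ARN. */
    static AppliedSchemaVersion parse(String arn) {
        String marker = "/schema/";
        int start = arn.lastIndexOf(marker);
        if (start < 0) {
            throw new IllegalArgumentException("No schema segment in ARN: " + arn);
        }
        String[] parts = arn.substring(start + marker.length()).split("/");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Unexpected schema segment in ARN: " + arn);
        }
        String minor = parts.length == 3 ? parts[2] : null;
        return new AppliedSchemaVersion(parts[0], parts[1], minor);
    }

    boolean hasMinorVersion() {
        return minorVersion != null;
    }
}
```

For example, parsing a placeholder ARN that ends in /schema/ComputerSchema/1/A yields the schema name ComputerSchema, Major Version 1, and Minor Version A.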
If you have a lot of directories, schema revision API calls can help you audit your directory fleet and ensure that all directories are running the same version of a schema. Such auditing can help you ensure high integrity of directories across your fleet.
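A fleet audit of this kind reduces to comparing each directory's applied version with the version I expect. The sketch below assumes I have already collected a map of directory names to applied versions, for example from getAppliedSchemaVersion results. The class name and the "1/A" version strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative audit helper; not an AWS SDK class.
class FleetAudit {
    /** Returns the sorted names of directories whose applied version differs from the expected one. */
    static List<String> outOfDate(Map<String, String> appliedVersionByDirectory,
                                  String expectedVersion) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String> entry : appliedVersionByDirectory.entrySet()) {
            if (!expectedVersion.equals(entry.getValue())) {
                stale.add(entry.getKey());
            }
        }
        Collections.sort(stale); // stable output regardless of map ordering
        return stale;
    }
}
```

An empty result means every directory in the map reports the expected schema version.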
Summary
You can use in-place schema upgrades to make changes to your directory schema as you evolve your data set to match the needs of your application. An in-place schema upgrade allows you to maintain high availability for your directory and applications while the upgrade takes place. For more information about in-place schema upgrades, see the in-place schema upgrade documentation.
If you have comments about this blog post, submit them in the “Comments” section below. If you have questions about implementing the solution in this post, start a new thread in the Directory Service forum or contact AWS Support.
– Mahendra