AWS AppSync Merged APIs Best Practices: Part 2 – Schema Composition

In the AWS AppSync Merged API – Best practices series, we cover important topics for developers, architects, and security engineers who are creating and managing AWS AppSync Merged and Source APIs. This multi-part series discusses best practices on schema composition, deployment and testing, security, and subscriptions for Merged APIs.

AWS AppSync offers Merged APIs, which allow multiple teams to operate independently, contribute to a unified primary GraphQL organizational data interface, and remove bottlenecks from existing manual processes when teams need to share development of a single AppSync API. You might already have multiple graphs in your organization that you want to combine to give your clients a single view of data, or you might have a graph monolith that you want to split into multiple sub-graphs for better management and efficiency. In both cases, designing the schema is a crucial step in GraphQL federation. A well-designed schema ensures that the API is intuitive, easy to use, and scalable. It allows developers to easily understand the structure of the data and how to access it.

In the first post of this Merged APIs best practices series, we walked through creation of cross-account Merged APIs. In this second post, we describe the best practices around schema design for Merged APIs, source APIs, collaboration and resolving conflicts between the API teams.

Designing a Merged Graph Schema

Designing the schema for a Merged API presents additional challenges requiring careful consideration. Whether you are separating a monolithic API into multiple sub-schemas or joining data across separate source schemas, the following are topics to consider:

Business use-cases that your client(s) aim to address with a single unified schema.
Determining team ownership for the Source APIs and collaboration among teams regarding type and field ownership.
Conflict resolution strategies between the teams.
Relationships between data in individual Source APIs.

1. Source APIs – Think of Types, Unique key(s), Teams, and Ownership

When designing the Source Graphs, think of each source graph as a microservice that handles a single domain, is implemented by domain experts, and can be developed independently. Source APIs encompass information about one or more related domains owned by a team. Each team owns and maintains its data and APIs following Domain-Driven Design.

Domain-Driven Design (DDD) is an approach to software development that emphasizes understanding and modeling the domain of the problem being solved. Here, the domain refers to the area of knowledge or business domain in which the software is being developed.

[Figure: All Source API types]

Source APIs should be designed to be independent and self-contained, providing a comprehensive solution for a specific use-case. Taking the example shown above, let’s consider the Books API team. Their responsibility is to handle all data related to each book in the site’s catalog. To achieve this, the team manages a dedicated source AppSync API. This API empowers users to query book metadata, as well as perform actions such as adding, updating, and deleting books from the catalog. Similarly, the Authors, Reviews, and Users API teams are responsible for independently managing their respective areas of focus.

It is crucial that when a source API changes an existing feature, customers can consume the change seamlessly, without any dependency on the Merged API. This ensures that customers can continue using the updated functionality without disruption or reliance on other APIs.

Defining types and Primary Key(s)

When designing the Source APIs, it is important to define schema types with primary key(s) that uniquely identify them. Each type should be owned primarily by one team. Ensure that each type has a well-known primary key field or fields that can be used to identify it; this also helps in establishing relationships with types in other source APIs after the merge. It is common to use an ID field to identify each type. Following the same example, for each Book type in the Books source API, id is the unique primary key that identifies the Book, and the same applies to the types in the Authors, Users and Reviews APIs.

Addition of new features

How does the API ecosystem handle the addition of new feature(s)? Let’s continue with the same example and explore the integration of the Recommendations feature into the ecosystem, enabling personalized book recommendations for users.

[Figure: Recommendation Type]

There are two approaches to consider: a) integration into an existing Source API, or b) creation of a new Source API. The decision depends on which team is responsible for owning and managing the Recommendations feature. If the team in charge of one of the existing source APIs also owns the Recommendations feature, it is advisable to incorporate it into that particular source API. However, if the Recommendations feature is managed by a separate team, it should be designed as a new Source API.

Both approaches have their advantages, and it should be a two-way door decision, allowing for flexibility. For instance, if we initially design the feature as part of an existing Source API, we can easily separate it into multiple Source APIs or combine it with an existing one at a later stage.

2. Establish Relationships between Source APIs – Getting ready for the Merge

The Source APIs themselves can provide the data for individual data elements independently. However, the true power of the Merged API architecture is the ability to join data across different APIs. In the example above, we have many opportunities to join the data in order to provide a more useful experience for clients. For example, when querying the Merged API endpoint, we may want to get the basic information about an author while also returning a paginated list of books in the catalog that the author wrote. Going a step further, we might also want to retrieve the list of reviews for each book in the list.

When designing the Source APIs, it is important to agree on a primary key or a composite primary key between the Source APIs in order to join data across these APIs. We must ensure that each type we are joining has a well-known primary key field or fields that can be used to reference this relation in another source API. It is common to use an ID field to identify each type. Following the same example, for each Book type in the Books source API, we store an authorId field which can be used to join data about the author from the Authors source API. The authorId acts as the key for retrieving data about an author.
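The key-based join described above can be sketched in a few lines of Python. This is a toy model only, not AppSync resolver code: the `AUTHORS` dictionary stands in for the Authors source API's data source, and the names `resolve_book_author`, `Book`, and `Author` and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Author:
    id: str
    name: str


@dataclass
class Book:
    id: str
    title: str
    authorId: str  # key agreed on with the Authors source API


# Hypothetical in-memory stand-in for the Authors source API's data.
AUTHORS: Dict[str, Author] = {
    "a1": Author(id="a1", name="Jane Doe"),
}


def resolve_book_author(book: Book) -> Optional[Author]:
    """Toy resolver for Book.author: look the author up by the shared key."""
    return AUTHORS.get(book.authorId)


book = Book(id="b1", title="Example Title", authorId="a1")
author = resolve_book_author(book)
orphan_author = resolve_book_author(Book(id="b2", title="Other", authorId="missing"))
```

The only thing the Books team needs to know about the Authors domain here is the key, which is what keeps the two source APIs independently deployable.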

[Figure: Book Type]

The relationships can be bi-directional, and there are two possible ways to implement them: either in the parent resolver or the child resolver. In the provided example, the Book type acts as the parent type with a reference to the Author type, which serves as the child. Alternatively, it is also possible to establish a reference from the child type (Author) to the parent type (Book). Both approaches can be effective, as long as the primary keys of the related types are properly defined and stored.
This relationship can take different forms: it can be a one-to-one relationship, as demonstrated in the example above with the Book type, or a one-to-many relationship, as illustrated in the example below with the Author type.
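As a rough illustration of the one-to-many direction, the sketch below returns a paginated list of books for an author by filtering on authorId. It is a simplified Python model under stated assumptions: the `BOOKS` list, the function name `resolve_author_books`, and the integer offset used as a pagination token are all hypothetical and do not reflect how AppSync or its data sources encode tokens.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the Books source API's data.
BOOKS: List[Dict[str, str]] = [
    {"id": "b1", "title": "First", "authorId": "a1"},
    {"id": "b2", "title": "Second", "authorId": "a1"},
    {"id": "b3", "title": "Third", "authorId": "a2"},
    {"id": "b4", "title": "Fourth", "authorId": "a1"},
]


def resolve_author_books(
    author_id: str, limit: int = 2, next_token: Optional[int] = None
) -> Tuple[List[Dict[str, str]], Optional[int]]:
    """Toy resolver for Author.books: one author maps to many books."""
    matches = [b for b in BOOKS if b["authorId"] == author_id]
    start = next_token or 0
    page = matches[start:start + limit]
    end = start + len(page)
    # Return a token only while more matching books remain.
    token = end if end < len(matches) else None
    return page, token


first_page, token = resolve_author_books("a1")
second_page, last_token = resolve_author_books("a1", next_token=token)
```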

Refer to the AWS AppSync Merged APIs launch blog for more examples on defining relationships and querying across Source APIs using parent-child resolvers.

Documenting the relationships

Another best practice to follow is to document the relationships through comments. Comments can be added to the types or fields in an AppSync schema definition. As part of defining the related types and keys, ensure you have proper comments added, so that it is intuitive for the developers to write the resolver logic.

[Figure: Book Type with Comments]

3. Merged APIs – Think of Organization-wide use cases

Merged APIs are specifically designed to address enterprise-wide use cases, encompassing scenarios where data needs to be accessed by multiple clients (such as mobile, web, and IoT) and where the data from various microservices can be combined within a unified graph of organization-wide data. While it may be tempting to perceive Merged APIs solely as a one-to-one mapping with a user interface or to focus on the data provided by source APIs, their scope extends far beyond these aspects.

Taking the above example, a Merged API could power the backend of a book review and recommendation website similar to Goodreads. With a Merged API, clients can list all the books with author and review information in a single API call, list all books with reviews of 4 stars and above, list the top 10 authors who have written books with 5-star reviews, and so on.

[Figure: Merged API]

Let’s take a look at our example again. This is what a query would look like in a federated architecture. This query is listing all the books, along with their author and review information.

[Figure: Merged API Query]

Each part of the query, highlighted in a different color, represents fields fulfilled by a separate service. Each chunk corresponds to the portion of the graph that one domain service is responsible for serving. The Merged API then binds these separate schemas or graphs into a single composed graph, and each service provides only the part of the schema it is responsible for. In the above query, the Books Source API provides the fields id, title, authorId, genre, publicationYear and publisherId, the Authors Source API provides the name, bio and nationality of the author, and the Reviews Source API provides the reviews and ratings for the books. The Authors and Reviews Source APIs need to know the Book id that an Author or Review belongs to in order to combine this data in a single query.
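To make the division of responsibility concrete, here is a minimal Python sketch that assembles a "list books" response from three separate data sets, one per source API. It is only a model of the idea, not how AppSync executes a merged query; the data, the `bookId` field on reviews, and the function names `list_books` and `books_with_min_rating` are assumptions made for illustration.

```python
from typing import Any, Dict, List

# Hypothetical stand-ins for the data behind each source API.
BOOKS = [
    {"id": "b1", "title": "First Book", "authorId": "a1"},
    {"id": "b2", "title": "Second Book", "authorId": "a2"},
]
AUTHORS = {
    "a1": {"name": "Jane Doe"},
    "a2": {"name": "John Roe"},
}
REVIEWS = [
    {"bookId": "b1", "rating": 5.0},
    {"bookId": "b1", "rating": 4.0},
    {"bookId": "b2", "rating": 3.0},
]


def list_books() -> List[Dict[str, Any]]:
    """Compose one response: each nested part comes from a different source."""
    merged = []
    for book in BOOKS:
        merged.append({
            "id": book["id"],                          # Books source
            "title": book["title"],                    # Books source
            "author": AUTHORS.get(book["authorId"]),   # Authors source
            "reviews": [                               # Reviews source
                {"rating": r["rating"]}
                for r in REVIEWS
                if r["bookId"] == book["id"]
            ],
        })
    return merged


def books_with_min_rating(min_rating: float) -> List[str]:
    """Titles of books that have at least one review at or above min_rating."""
    return [
        b["title"]
        for b in list_books()
        if any(r["rating"] >= min_rating for r in b["reviews"])
    ]


result = list_books()
top_titles = books_with_min_rating(4.0)
```

The client sees a single nested result, while each team continues to own only its own slice of the data.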

Hiding the related Ids in Merged API

To query the related types, we added the primary key(s) of the related type to the original type. For example, for each Book type in the Books source API, we added the authorId field, which can be used to join data about the author from the Authors source API. The authorId is only needed for resolving the data about an author; it does not need to be visible in the Merged API, and the clients calling the Merged API do not have to know about the related primary key(s). We use the @hidden directive to hide the field from the Merged API as shown below.
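The effect of hiding a key field can be modeled with a short Python sketch. This is a simplification for illustration, not AppSync's merge implementation: the `Field` tuple, its `hidden` flag, the field types, and the `merged_fields` helper are hypothetical stand-ins for a source schema in which authorId carries the @hidden directive.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    gql_type: str
    hidden: bool = False  # True models a field annotated with @hidden


# Hypothetical view of the Book type as defined in the Books source API.
BOOK_SOURCE_TYPE: Dict[str, Field] = {
    "id": Field("ID!"),
    "title": Field("String!"),
    "authorId": Field("ID!", hidden=True),  # join key, hidden from the merge
    "author": Field("Author"),
}


def merged_fields(source_type: Dict[str, Field]) -> List[str]:
    """Return the field names that would remain visible in the merged type."""
    return [name for name, field in source_type.items() if not field.hidden]


visible = merged_fields(BOOK_SOURCE_TYPE)
```

In this model the source API can still use authorId internally to resolve the author, while clients of the merged schema only see id, title, and author.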

[Figure: Book Type with hidden directive]

Do I really need a Merged API?

Do I really need a Merged API if I already have multiple AppSync GraphQL APIs in my ecosystem? The answer to this question depends on several factors. To determine whether a Merged API is recommended, consider the following questions:

Is the data between the Source APIs related?
Does the client need to make calls to two or more Source APIs to fulfill a particular use case?
Is there a need for a centralized authorization mechanism to manage access across the Source APIs?

If the answer to any of these questions is “Yes,” it is highly recommended to create a Merged API. This ensures that a comprehensive use case can be effectively addressed. By leveraging a Merged API, organizations can achieve a unified and streamlined approach to data integration, facilitating seamless communication between multiple APIs and providing a centralized point of access for clients.

4. Resolving Conflicts

In a greenfield scenario where you are designing the Merged API schema and your source schema in parallel, there is more flexibility in designing the Source schema. In this case, you can avoid defining conflicting types, field types, and operations in the source schemas. Here, each source API can own the types and provide the appropriate types, fields and operations needed by the Merged API schema.

In large organizations with multiple existing source schemas that you have to merge, there is always a possibility of conflicting types, fields, and operations. The best way to handle such conflicts is through team collaboration. Teams should agree to resolve the conflicts by renaming or redefining types and fields in the source schemas so that the merge operation succeeds. In this case, the types shared between Source schemas should be owned collaboratively, with an agreement between all the teams.
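Before teams sit down to resolve conflicts, it helps to know where they are. The following Python sketch compares simplified source schemas and reports fields that share a type name and field name but disagree on the field type. It is a toy check under stated assumptions, not AppSync's merge validation: schemas are modeled as nested dictionaries, and the API names and field types are hypothetical.

```python
from typing import Dict, List, Tuple

# type name -> field name -> GraphQL type, as a simplified schema model
Schema = Dict[str, Dict[str, str]]


def find_conflicts(
    schemas: Dict[str, Schema]
) -> List[Tuple[str, str, Dict[str, str]]]:
    """List (type, field, {api: field type}) entries whose types disagree."""
    seen: Dict[Tuple[str, str], Dict[str, str]] = {}
    for api_name, schema in schemas.items():
        for type_name, fields in schema.items():
            for field_name, field_type in fields.items():
                seen.setdefault((type_name, field_name), {})[api_name] = field_type

    conflicts = []
    for (type_name, field_name), by_api in sorted(seen.items()):
        if len(set(by_api.values())) > 1:
            conflicts.append((type_name, field_name, by_api))
    return conflicts


# Hypothetical source schemas: both define Review.rating, with different types.
sources = {
    "reviews_api": {"Review": {"id": "ID!", "bookId": "ID!", "rating": "Float"}},
    "ratings_api": {"Review": {"id": "ID!", "rating": "Int"}},
}
conflicts = find_conflicts(sources)
```

A report like this gives the teams a concrete list of fields to rename or redefine.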

AWS AppSync also offers several GraphQL directives that can be used to reduce or resolve conflicts across source APIs. Refer to the AppSync documentation on directives and on handling conflicts using directives.

Directives can serve as an interim step until all the schema conflicts are resolved between teams, i.e. they can be used to resolve conflicts during the migration phase to Merged APIs. Once migration is complete, source API teams can collaborate to make appropriate changes to the conflicting types and fields based on the merged schema.

Directives also have other practical uses:

1. Testing the Source APIs using mocks: We used field keys to join the data across different source APIs in the previous section. These keys form the “interface” of the source API, enabling us to test the Source APIs independently without needing to set up a Merged API. To test the source API end to end, we can use the @hidden directive and create mock resolvers to mock the key values. This way, when the Source API gets merged, the mock resolver is not visible to the users of the Merged API. For additional details, refer to the example from the Merged APIs launch blog.
2. Internal API as part of the Merged API: When you need to merge an internal Source API and a customer-facing public API, you can use the @hidden directive to hide the relevant queries/mutations of the internal API from the users.
3. Authoritative source of data: When two of the Source APIs in a merged graph have the same types and fields, you can designate one of them as authoritative, i.e. override the other type using the @canonical directive. This way, clients always get the correct data from the source you have chosen as authoritative for a specific data element. For example, suppose two source APIs both provide a Review for a Book but have a conflicting rating field.
   During the merge, you can designate one of the fields as @canonical to be the source of truth for the rating field. In this case, we designate the one which is of Float type as the field to be shown in the Merged API for the clients.
4. Migration: Another use case for the @canonical directive is avoiding conflicts during the migration phase. During this phase, you might want to designate one of the fields/types as @canonical in order to merge two source APIs successfully, and then resolve the conflicts later by working with the owners of the Source APIs.
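The authoritative-source idea from the list above can be sketched as a small Python model. It is illustrative only and does not reproduce AppSync's merge rules: the `FieldDef` tuple, its `canonical` flag, the API names, and the Int type of the second rating field are assumptions, with only the Float definition taken from the example.

```python
from typing import List, NamedTuple


class FieldDef(NamedTuple):
    api: str
    gql_type: str
    canonical: bool = False  # True models a definition marked @canonical


def pick_definition(definitions: List[FieldDef]) -> FieldDef:
    """Choose which definition of a shared field the merged schema exposes."""
    if len({d.gql_type for d in definitions}) == 1:
        return definitions[0]  # all sources agree, nothing to resolve
    canonical = [d for d in definitions if d.canonical]
    if len(canonical) != 1:
        raise ValueError("conflict needs exactly one canonical definition")
    return canonical[0]


# Two hypothetical source APIs define Review.rating with different types.
rating_definitions = [
    FieldDef(api="reviews_api", gql_type="Float", canonical=True),
    FieldDef(api="ratings_api", gql_type="Int"),
]
chosen = pick_definition(rating_definitions)

try:
    pick_definition([FieldDef("reviews_api", "Float"), FieldDef("ratings_api", "Int")])
    unresolved_error = None
except ValueError as exc:
    unresolved_error = str(exc)
```

In this model an unmarked conflict is an error rather than a silent choice, which mirrors the point that conflicts should be resolved deliberately by the owning teams.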

Collaboration between teams becomes crucial in this scenario due to the presence of shared types among the source APIs. It is essential to establish a Schema architect or a Schema working group, consisting of the Source API teams responsible for defining types and fields across the APIs. By focusing on the quality of the graph model, it becomes possible to minimize the need for future changes and reworking.

Conclusion

This is the second post in a five-part series on best practices for the design, development, testing and deployment of AWS AppSync Merged APIs. In this post, we explained recommended practices around Source API and Merged API schema design, handling collaboration between teams, implementing relationships between Source API types, and resolving conflicts across teams.

To learn more about AWS AppSync Merged APIs, refer to the AWS AppSync documentation or visit the product page for more general information on AWS AppSync. We can’t wait to see what you will build!

About the author

Venugopalan Vasudevan is a Senior Specialist Solutions Architect focusing on AWS Front-end Web & Mobile services. Venu helps customers build their front-end and mobile strategies on AWS, including maturing and enhancing their DevOps practices.