AWS Compute Blog

Automating event validation with Amazon EventBridge Schema Discovery

This post is written by Kurt Tometich, Senior Solutions Architect, and Giedrius Praspaliauskas, Senior Solutions Architect, Serverless

Event-driven architectures face challenges with event validation due to unique domains, varying event formats, frequencies, and governance levels. Events are constantly evolving, requiring a balanced approach between speed and governance. This blog post describes approaches to consumer and producer event validation, focusing on automated solutions for producer validation using Amazon EventBridge and Amazon API Gateway.

Consumer and Producer Event Validation

In an event-driven system, events should be validated by both producers and consumers to maintain data integrity. The producers’ job is to create and send valid events before they are routed to consumers. Failing to do so can lead to data inconsistencies, downstream errors in processing and unnecessary costs. As a consumer, even if events come from a trusted source, validation should still be applied. Producers may change data format over time, data may become corrupt, or interfaces between the producer and consumer may alter it.

A common way to manage and route events is through an event bus. EventBridge is a serverless event bus that can perform discovery, versioning and consumption of event schemas. When schema discovery is enabled on an event bus, new schema versions are generated when the event structure changes. These schemas can be used to perform validation on events.

The EventBridge Schema registry stores schemas in OpenAPI or JSONSchema formats. Schemas can be added to the registry automatically through schema discovery or by manually uploading your schema to the registry through the AWS console or programmatically. Schema discovery automates the process of finding schemas and adding them to your registry. Schemas for AWS events are automatically added to the registry.

Once a schema is added to the registry, you can generate a code binding for the schema. This allows you to represent the event as a strongly typed object in your code. Code bindings are available for Golang, Java, Python, or TypeScript programming languages. If preferred language-specific bindings are not available, schemas can be downloaded and validated using third-party schema validation libraries. For example, Ajv for JavaScript or the jsonschema library for Python.

If using code bindings, you can download them using the console, API, or within a supported IDE using the AWS Toolkit. Code bindings can be used like other code artifacts. If an AWS Lambda function is used as a consumer, add the code binding as a layer dependency. Bindings are not automatically synced to any artifact repositories, such as AWS CodeArtifact. The Lambda function code in this solution can be extended to automate binding uploads to your artifact repository.

The following diagram depicts a common producer (left) and consumer (right) event architecture on AWS. Producers send events through API Gateway or directly to an EventBridge event bus. It’s common to use API Gateway as a front door to provide authorization, validation and pre-processing of incoming events. Events going directly to EventBridge may also come from SaaS Partner Integrations (Salesforce, Jira, ServiceNow, etc.) or an application running in a private subnet using the AWS private network to connect to EventBridge. For these events, you can use third-party libraries to validate events prior to them arriving on EventBridge.

Image of Common Architecture for Producer and Consumer Event Validation.

Common Architecture for Producer and Consumer Event Validation

Workflow steps:

  1. Producers send events through API Gateway or directly to EventBridge. API Gateway provides request validation, parses and sends events to EventBridge if they pass validation. Invalid events that do not match the schema in API Gateway will be rejected before reaching EventBridge. Events going directly to EventBridge are validated using third party schema validation libraries (e.g. Ajv for JavaScript and jsonschema library for Python).
  2. With schema discovery enabled on a custom event bus, that bus will receive the event from an application and generate a new schema version in the registry. New schema versions are only created when the event structure changes. When new schema versions are created, a schema version created event is automatically emitted on the default EventBridge event bus. The default bus automatically receives AWS events. EventBridge rules can be configured to match all schema version changes or by filtering on schema name, type and other fields available on the event.
  3. Consumers define EventBridge rules to react to schema version change events. Consumers download the schema or code bindings from EventBridge and perform validation and parsing.
  4. Producers define EventBridge rules to react to schema version change events. The new schema is retrieved from the registry and either used in local development with third-party schema validation libraries, or a model in API Gateway is updated with the new schema directly. This step doesn’t exist as a native feature of EventBridge. The solution later in this post will demonstrate how to automate this step.

To scale this architecture to multiple event sources and API endpoints, you can create different models in API Gateway for each event schema. A model in API Gateway is a data schema that defines the structure and format of data for request and response payloads. Those models are then applied to different resources and methods defined on your APIs. The solutions below will demonstrate how event schemas can be automatically synced to models in API Gateway.

Solution Walkthrough

The following solutions use API Gateway to perform request validation and EventBridge schema discovery to automatically generate up-to-date schema versions. Both can be extended or modified to fit unique use cases. These solutions build upon the general producer and consumer validation architecture covered previously by incorporating automated solutions to downloading, processing and applying new schemas to API Gateway. Refer to the README.md file in the AWS Samples GitHub repository for pre-requisites, deployment instructions and testing.

Lambda Driven Schema Updater

The following architecture uses EventBridge schema discovery to generate new schema versions, download, process and post the schema to an API Gateway model for request validation. The Lambda schema updater function will trigger on schema version changes. The function trigger can be enabled or disabled by updating the rule in EventBridge console.

This solution is a good fit for quick updates with minimal processing. If complex testing and validation is required before updating a new schema, see the CI/CD driven schema updater solution covered later in this post. The rule in this solution triggers when a new schema version is added to the registry. To filter further, the rule can be modified or additional processing can be applied to the Lambda function. This provides flexibility in handling multiple domains or event types.

Image of Architecture for Lambda Driven Schema Updater.

Architecture for Lambda Driven Schema Updater

Workflow Steps:

  1. Producers send events to API Gateway endpoint or directly to EventBridge.
  2. API Gateway performs request validation on the body, modifies the event format and sends to EventBridge. If the event does not match the schema, API Gateway will reject the request.
  3. A custom event bus will receive the event and an optional rule based on source can log all events for tracking and troubleshooting.
  4. With schema discovery enabled on custom event bus, new event structures generate schema versions that are stored in the registry. If a new schema version is generated, consumers can download latest schema and code bindings from the registry.
  5. The schema version creation rule will invoke the Lambda function.
  6. The function will download, process and update the API Gateway model with the new schema. A new schema version is only generated if the structure of the event changes.

CI/CD Driven Schema Updater

The alternative approach uses a CI/CD pipeline to control schema changes. Instead of the Lambda function directly applying the new schema to the API Gateway model, it downloads, processes, and stores the schema in a repository. The CI/CD pipeline references the stored schema, performing additional tests and checks before the schema is promoted and enforced. This provides more control over the schema update process, though it introduces some additional complexity. The following diagram describes the CI/CD driven update process. The solution can be adapted to other artifact repositories and CI/CD systems.

Image of Architecture for CI/CD Driven Schema Updater

Architecture for CI/CD Driven Schema Updater

Workflow steps:

  1. Producers send events to API Gateway endpoint or directly to EventBridge.
  2. API Gateway will perform request validation against the body, modify the event format and send to EventBridge.
  3. A custom event bus will receive event and an optional rule based on source can log all events for tracking and troubleshooting.
  4. With discovery enabled on the custom event bus, schema versions are produced and stored in the registry.
  5. The schema version creation rule will invoke the Lambda function.
  6. The function will download, process and store the new schema in a repository of choice (i.e. S3, Git, Artifact Repository).
  7. The CI/CD pipeline updates the model in API Gateway and runs any necessary tests.
  8. The consumer downloads schema and code bindings from appropriate repositories.

Conclusion

Event validation can be challenging, but leveraging schema discovery and request validation minimizes custom logic and overhead. EventBridge can discover new schemas from events, while API Gateway validates incoming requests. This approach streamlines validation, improves data quality, and reduces the maintenance burden of manual validation.

For more information on event driven architectures, you can view additional resources on AWS Samples and Serverless Land.