AWS Open Source Blog
Leveraging Open Source at Barclays to Enable Lambda Event Filtering with AWS Glue Schema Registry
Barclays is a British, multinational universal bank. Its businesses include consumer banking and a top-tier global investment and corporate bank that deploys finance responsibly to support people and their businesses.
Engineering teams at Barclays strive to provide a best-in-class user experience to their customers, irrespective of their devices, platforms, or channels for different products. To meet changing customer expectations and business requirements, Barclays has partnered with AWS and adopted cloud native, event-driven systems that scale to millions of transactions per day. Barclays is leveraging modern serverless technologies like Amazon DynamoDB, AWS Lambda, Amazon Kinesis Data Streams, and others to build the solution from the ground up.
When operating at such a large scale in an event-driven system, messages pass through numerous internal and third-party systems, which increases the risk of data corruption and loss. As a result, ensuring message integrity becomes crucial, and implementing a schema to validate incoming and outgoing events becomes mandatory. Moreover, a centralized source of schemas should exist so that all systems can access and follow a consistent set of rules. Hence, Barclays engineering teams decided to adopt AWS Glue Schema Registry, which lets you centrally discover, control, and evolve schemas. Glue Schema Registry is a part of the AWS serverless data integration service called AWS Glue that makes it easy for analytics users to discover, prepare, move, and integrate data from multiple sources.
During the initial implementation of the AWS Glue Schema Registry integration, however, Barclays engineers observed that certain Lambda functions were discarding messages. On further analysis, engineers discovered that impacted AWS Lambda functions were using AWS Lambda event filtering, which currently doesn’t support an AWS Glue Schema Registry integrated payload. Nonetheless, AWS Lambda event filtering was a critical feature for Barclays in ensuring that AWS Lambda functions only listen to the relevant records, thereby decreasing unnecessary invocations and lowering overall costs.
Barclays reached out to AWS Enterprise Support for their guidance. While AWS Enterprise Support agreed that the use case was valid and could be added to the product roadmap, it would take time and impact Barclays’ adoption and overall time-to-market.
Fortunately, the AWS Glue Schema Registry integration library is open source. Barclays engineering teams decided to create a wrapper around the AWS library to customize their solution. In this post, we’ll dive into how Barclays achieved AWS Glue Schema Registry integration with AWS Lambda event filtering by leveraging the open source library.
Overview
For this blog post, we use the generic customer onboarding flow presented in Figure 1 as an example of an event-driven architecture.
Figure 1: Sample onboarding architecture.
Upon receiving the registration details (1), the “onboarding” AWS Lambda Function processes this request (2) and stores the customer’s data with an initial ‘draft’ status (3.1). Then, this onboarding AWS Lambda Function acts as a producer (an application that sends records to Amazon Kinesis Data Streams) and sends a ‘CustomerStatusChanged’ event on the stream (3.2).
The “identification and verification” AWS Lambda Function acts as a consumer (an application that processes all data from Amazon Kinesis Data Streams): it consumes this event (4), applies a set of rules, and saves the outcome.
After checks are completed, the “activation” AWS Lambda Function (5) marks the customer as ‘approved’ (6.1) and produces another ‘CustomerStatusChanged’ event on the stream (6.2) that will be consumed by the “downstream processor” AWS Lambda Function (7) to onboard the customer on different downstream platforms.
Let’s look at the structure of this ‘CustomerStatusChanged’ event.
Here is the sample event in the JSON format:
The event is divided into two parts: (1) metadata and (2) body. The metadata contains details related to this event, and the body contains relevant information about the business domain.
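As a minimal sketch of this two-part structure, the following Java class builds such an event as a JSON string. The class name and every field name (`eventId`, `eventName`, `customerId`, `status`) are assumptions made for illustration and are not the actual Barclays event contract.

```java
// Hypothetical sketch: the field names are illustrative assumptions,
// not the real event contract.
final class CustomerStatusChangedEvent {

    private CustomerStatusChangedEvent() { }

    /**
     * Builds the two-part event: "metadata" describes the event itself,
     * "body" carries the business-domain data.
     * Values are assumed to need no JSON escaping.
     */
    static String toJson(String eventId, String customerId, String status) {
        return "{"
            + "\"metadata\":{"
            +   "\"eventId\":\"" + eventId + "\","
            +   "\"eventName\":\"CustomerStatusChanged\""
            + "},"
            + "\"body\":{"
            +   "\"customerId\":\"" + customerId + "\","
            +   "\"status\":\"" + status + "\""
            + "}"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(toJson("evt-001", "cust-123", "DRAFT"));
    }
}
```

A real producer would normally build this payload with a JSON library rather than string concatenation; the concatenation here only keeps the sketch dependency-free.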
You can see that there are some processes which get activated when the status is ‘draft’ (like identification and verification) and there are others which get activated when the status is ‘approved’ (like downstream systems).
AWS Lambda-based consumers use event filtering to ensure that the consumer Lambda Functions only get triggered when the customer is in the applicable state. This is shown in the diagram presented in Figure 2.
Figure 2. AWS Lambda based consumers with AWS Lambda based filtering.
An AWS Lambda event filter for consumers acting on the status of DRAFT is represented here.
Consumers acting on the status of APPROVED are represented here.
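The two filters can be sketched as follows, assuming the status lives at a hypothetical `body.status` path in the event. For a Kinesis event source, Lambda matches the filter pattern against the decoded record under the `data` key; the class and method names are illustrative only.

```java
// Sketch only: "body.status" is an assumed path in the event payload.
final class StatusFilterPatterns {

    private StatusFilterPatterns() { }

    /**
     * Builds a Lambda event filter pattern for a Kinesis event source.
     * The record payload is matched under the top-level "data" key,
     * and the allowed values are listed in a JSON array.
     */
    static String forStatus(String status) {
        return "{\"data\":{\"body\":{\"status\":[\"" + status + "\"]}}}";
    }

    public static void main(String[] args) {
        System.out.println(forStatus("DRAFT"));    // identification and verification
        System.out.println(forStatus("APPROVED")); // downstream processing
    }
}
```

Each resulting string would be supplied as the `Pattern` of a filter in the `FilterCriteria` of the corresponding event source mapping.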
Challenge
As shown in Figure 1, the messages in the Amazon Kinesis Data Streams traverse through multiple systems. To ensure data integrity, every producer must validate the outgoing message before publishing it to the Amazon Kinesis Data Stream, while every consumer service must ensure that the incoming message has been validated.
Addressing this challenge requires the adoption of schemas on a large scale with AWS Glue Schema Registry. Introducing AWS Glue Schema Registry doesn’t require architectural changes (Figure 3). However, the adoption process faced a roadblock when internal AWS Lambda-based consumers started dropping the payload when event filtering was enabled. Engineers discovered that AWS Lambda event filtering doesn’t support an AWS Glue Schema Registry integrated payload.
Figure 3. AWS Lambda based filtering with AWS Glue Schema Registry
On the producer side, the open source library fetches the schema from the AWS Glue Schema Registry and validates the outgoing message. Before putting the message on the stream, the producer encodes the message by prepending the schema version id bytes of the schema deployed in the registry. The consumer uses the same schema version id prefix to retrieve the producer schema and deserialize the message.
In this process, the message becomes a non-JSON payload. The message fails the Lambda filtering on the consumer side (if we are using the Amazon Kinesis Stream filtering criteria) and gets dropped by the consumer.
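To illustrate why a prefixed message stops being JSON, here is a simplified sketch of prefix-style encoding. The header layout (a version marker byte, a compression marker byte, then the 16-byte schema version id) and the constant values are assumptions made for illustration, and `looksLikeJsonObject` is a crude stand-in for a real JSON validity check, not Lambda’s actual logic.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Simplified sketch of prefix-style encoding; header constants are assumptions.
final class PrefixEncodingSketch {

    static final byte HEADER_VERSION = 3; // assumed header version marker
    static final byte NO_COMPRESSION = 0; // assumed "no compression" marker

    private PrefixEncodingSketch() { }

    /** Prepends the two header bytes and the 16-byte schema version id to the payload. */
    static byte[] encode(UUID schemaVersionId, byte[] payload) {
        ByteBuffer buffer = ByteBuffer.allocate(2 + 16 + payload.length);
        buffer.put(HEADER_VERSION);
        buffer.put(NO_COMPRESSION);
        buffer.putLong(schemaVersionId.getMostSignificantBits());
        buffer.putLong(schemaVersionId.getLeastSignificantBits());
        buffer.put(payload);
        return buffer.array();
    }

    /** Crude check: is the first non-whitespace byte an opening brace? */
    static boolean looksLikeJsonObject(byte[] data) {
        for (byte b : data) {
            if (b == ' ' || b == '\n' || b == '\r' || b == '\t') {
                continue;
            }
            return b == '{';
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] raw = "{\"body\":{\"status\":\"DRAFT\"}}".getBytes(StandardCharsets.UTF_8);
        byte[] encoded = encode(UUID.randomUUID(), raw);
        System.out.println("raw looks like JSON: " + looksLikeJsonObject(raw));
        System.out.println("encoded looks like JSON: " + looksLikeJsonObject(encoded));
    }
}
```

Because the encoded record starts with header bytes instead of an opening brace, a JSON-based filter has nothing it can parse.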
You can see an example of the payload before serialization here:
And then after serialization here:
Based on the documentation for properly filtering Kinesis and DynamoDB messages, this was an expected behaviour. As highlighted in the below table, when the message is a non-JSON payload, the filtering action is to drop the record.
data:image/s3,"s3://crabby-images/7b5ee/7b5ee349d4a60fefcdbc789ac4e22ac2d89e5050" alt="data format table"
Solution
Barclays engineering teams support both JSON and Apache Avro as acceptable message formats for their event-driven systems. However, JSON is more ubiquitous. It’s easy for developers to work with and is one of the most common data serialization formats used on the web. JSON is readable by humans and allows fast adoption. One added advantage of using a JSON-based format is Lambda event filtering for messaging payloads, which lets Barclays reduce traffic and overall costs. To maintain integration with Lambda event filtering, Barclays needed to ensure that the payload followed the JSON structure while integrating with AWS Glue Schema Registry. The best option Barclays found was to add the schema version id as a JSON node rather than as a prefix.
This design served the following purposes:
- On the consumer side, Barclays can still identify the producer schema with which the original message was validated.
- The message is still JSON compliant. This ensures that AWS Lambda consumers can still use content-based message filtering while continuing to adopt the AWS Glue Schema Registry.
It was immensely helpful that AWS provided the Glue Schema Registry integration library as open source. Barclays engineering teams decided to create a wrapper around the AWS library to support this customization. The customized version of serialization and deserialization was added within this wrapper library. Barclays extended key classes to add schema version id as a separate JSON node on the producer side and retrieval/deserialization of the payload on the consumer side.
Because the schema id was meta information for the payload, it was logical to add it as a child of the metadata node. You can see the resulting payload here:
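A minimal sketch of this design follows: it places a `schemaVersionId` node inside the `metadata` object of a JSON payload. The class name is hypothetical, and the regex-based insertion is a simplification that assumes a single top-level `metadata` object; production code would more likely use a JSON library, and this is not the actual Barclays implementation.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: adds the schema version id as a child of "metadata".
final class SchemaVersionIdNode {

    // Matches the opening of the metadata object, tolerating whitespace.
    private static final Pattern METADATA_OPEN =
        Pattern.compile("\"metadata\"\\s*:\\s*\\{\\s*");

    private SchemaVersionIdNode() { }

    /** Returns a copy of the JSON with "schemaVersionId" as the first metadata field. */
    static String addTo(String json, UUID schemaVersionId) {
        Matcher matcher = METADATA_OPEN.matcher(json);
        if (!matcher.find()) {
            throw new IllegalArgumentException("payload has no metadata object");
        }
        int insertAt = matcher.end();
        // An empty metadata object needs no trailing comma after the new node.
        boolean metadataEmpty = insertAt < json.length() && json.charAt(insertAt) == '}';
        String node = "\"schemaVersionId\":\"" + schemaVersionId + "\""
            + (metadataEmpty ? "" : ",");
        return json.substring(0, insertAt) + node + json.substring(insertAt);
    }

    public static void main(String[] args) {
        String event = "{\"metadata\":{\"eventName\":\"CustomerStatusChanged\"},"
            + "\"body\":{\"status\":\"DRAFT\"}}";
        System.out.println(addTo(event, UUID.randomUUID()));
    }
}
```

The output remains a valid JSON object, so a filter on the body (for example, on a status field) keeps working unchanged.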
Implementation guide
Let’s dive into how Barclays achieved AWS Glue Schema Registry integration with AWS Lambda event filtering. In the below examples, Barclays is using Java as the programming language. However, a similar implementation would be applicable to other languages as well.
Serialization
Figure 4: Serializer class diagram
Barclays decided to override GlueSchemaRegistrySerializer.java on the producer side. The default serialization process is responsible for adding schemaVersionId bytes as a prefix to the payload.
The default implementation of GlueSchemaRegistrySerializerImpl, as shown here, delegates a call to a facade. The facade is responsible for serializing the payload to add the schemaVersionId as a prefix in payload bytes.
This default facade, GlueSchemaRegistrySerializationFacade, shown here, is the main class where, in the encode() method, the schemaVersionId is fetched and the payload is serialized using SerializationDataEncoder.write().
The SerializationDataEncoder class, shown here, converts the schema version id to bytes and prepends it to the payload.
To change this default behaviour and add the schemaVersionId as a separate node, Barclays added a custom implementation of GlueSchemaRegistrySerializer, as shown here.
Barclays extended the default serialization facade with their own version, as shown here.
Deserialization
Figure 5. Deserializer class diagram
Barclays followed the same approach for deserialization as for serialization. They provided a custom implementation of the GlueSchemaRegistryDeserializer.java interface on the consumer side that identifies the schema version id, retrieves the schema from the registry, and deserializes the incoming message.
The custom implementation of GlueSchemaRegistryDeserializer, shown in the first code block here, can similarly delegate calls to a custom facade implementation, presented in the second code block.
The GlueSchemaRegistryDeserializationFacade class contains the default implementation, which deserializes the payload and removes the schemaVersionId prefix from the payload.
Barclays extended the default deserialization facade to provide their own version of the implementation. It uses the JSON node to identify whether the incoming message is integrated with AWS Glue Schema Registry. CustomJsonDeserializer, shown here, retrieves the schemaVersionId added by the producer in the payload.
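The consumer-side lookup can be sketched as follows. `SchemaVersionIdReader` is a hypothetical helper, not the CustomJsonDeserializer itself, and the regex extraction is a simplification; a real deserializer would more likely parse the payload with a JSON library and then fetch the schema from the registry using the returned id.

```java
import java.util.Optional;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: reads the schema version id node from a JSON payload.
final class SchemaVersionIdReader {

    // Captures a 36-character UUID string stored under "schemaVersionId".
    private static final Pattern NODE =
        Pattern.compile("\"schemaVersionId\"\\s*:\\s*\"([0-9a-fA-F-]{36})\"");

    private SchemaVersionIdReader() { }

    /**
     * Returns the schema version id if the payload carries one.
     * An empty result means the message is not registry-integrated
     * (or the value is not a well-formed UUID).
     */
    static Optional<UUID> read(String json) {
        Matcher matcher = NODE.matcher(json);
        if (!matcher.find()) {
            return Optional.empty();
        }
        try {
            return Optional.of(UUID.fromString(matcher.group(1)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String payload = "{\"metadata\":{\"schemaVersionId\":"
            + "\"123e4567-e89b-12d3-a456-426614174000\"},\"body\":{}}";
        System.out.println(read(payload).isPresent());
    }
}
```

Returning an `Optional` lets the caller branch between registry-integrated messages and plain JSON messages without relying on exceptions.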
Conclusion
With the implemented solution, the payload is still formatted in JSON and AWS Lambda filtering works appropriately. It allows only applicable events to be passed from the producers to the consumers. Using the schemaVersionId node, a consumer can identify that the incoming payload is integrated with AWS Glue Schema Registry and use its value to load the producer-side schema to deserialize the payload.
With AWS open source libraries, Barclays managed to address the challenge with AWS Lambda and Amazon Kinesis integration when they adopted AWS Glue Schema Registry. By customizing an open source library, Barclays kept their business delivery on track safely and continued their schema registry adoption without impacting AWS Lambda-based functions that use event filtering.
The code samples in this blog are for educational purposes only and may not reflect the actual production code. If you’d like to learn more about how Barclays manages tech and innovation, please check their innovation page.
This article was written by Barclays consumer banking team in collaboration with AWS.