How Yaala Labs Built a Cloud Native Exchange Platform on AWS

In this blog we discuss why Yaala Labs built their matching engine solution, P8, on AWS and explore the architectural decisions they made to achieve their performance and business objectives. We also share some of the benefits that Yaala Labs and their customers gained by running on AWS.

Who is Yaala Labs?

Yaala Labs was founded in 2018 by a team of seasoned capital markets professionals who have been building and operating critical trading technology for close to two decades. From our time at MillenniumIT, where we implemented the matching technology for the London Stock Exchange and dozens of other regulated markets, we gained deep knowledge of the challenges of building and managing highly performant, reliable, and functionally rich systems. The cloud presented a great opportunity to reinvent exchange technology, accelerate time to market, increase flexibility, and lower the cost of ownership. So, we built P8, a platform that enables innovators to launch the next generation of marketplaces in the cloud.

Why Cloud Native?

Cloud native platforms are best positioned to meet the market’s increasing need for speed and agility. Systems and the institutions using them are evolving at an increasingly rapid pace. Technologies, like cloud computing, and development methods, like CI/CD pipelines and DevSecOps, have accelerated innovation. Development that once took years now happens in weeks. It’s imperative to get ideas to market quickly. Modern, cloud-native systems can be deployed globally in seconds, can elastically scale to adjust for days when volatility drives higher volumes, and are built on an underlying platform that constantly improves, and doesn’t become obsolete after a few years.

By building P8 from the ground up to be native to the cloud, Yaala Labs enables customers to provision the exact resources needed in a given moment, and dynamically scale up and down to meet the changing needs of their business. This reduces cost and improves an exchange’s ability to adjust to market conditions and meet their users’ demands with agility. This is in contrast to legacy matching engine technologies that require firms to rent space in a co-location facility and then purchase racks and stack physical servers to run their systems – a process that can take many months.

We also deployed on the cloud to take advantage of AWS services such as Amazon Managed Streaming for Apache Kafka (Amazon MSK), and to use Infrastructure as Code (IaC) and serverless computing, such as AWS Lambda, to minimize the number of manual processes required to provision and run our systems. We found that these design decisions maximize scalability, fault tolerance, and security at an economical price. Lastly, we ran P8 on AWS because of its industry-leading stability, advanced features, and depth of functionality, as well as its widespread adoption within the financial services industry.

Objectives for P8 in the Cloud

We had eight high-level system objectives for P8. Our primary focus was to achieve industry-leading performance SLAs for a cloud-based matching engine while offering the highest resiliency possible. It was also essential that our platform use a best-in-class security architecture, take advantage of self-healing features where possible, and require zero human intervention for deployments and feature roll-outs. Lastly, we wanted P8 to minimize the total cost of ownership for our customers, accelerate their time to market, and support dynamic scalability so that they can handle market volatility more easily.

To do this, we started with a ‘clean slate’. Our team had extensive experience in on-premises, ultra-high-performance distributed systems. The challenge was to apply this experience to build a modern, cloud-native architecture that achieves the goals of our target market. P8 consists of multiple modules and components, which place a diverse set of demands on the infrastructure.

With these design approaches in mind, we started out by creating a reference architecture in AWS that relied heavily on serverless design patterns. We then enhanced this architecture by adding new capabilities to support key needs and constraints, such as low-latency matching and data caching. As we experimented with the platform, we would iteratively review and test our model against our goals, focusing heavily on scalability, fault-tolerance, security, and performance. We would often perform rapid POCs in certain areas, such as AWS Lambda latency measurements, to fail fast and iterate. To optimize the costs of our research and development process, we ensured that we never had dedicated or static testing environments. Everything was torn down and rebuilt from scratch, weekly. Once we landed on a target design for P8, we tested this design on a second cloud provider. We built solely on AWS, however, because the platform was most suitable for our needs and enabled us to focus all our resources on mastering this cloud platform.

How We Built P8 on AWS

Our use of AWS services was dictated by a number of technology and business constraints, the most important of which was ensuring that P8’s scalability, fault tolerance, and security capabilities met the robust requirements of powering a mission-critical trading system. Given the high-performance, high-throughput, and low-latency requirements of matching engines, as well as the critical role these systems play in powering systemically important marketplaces, P8 had to be flexible, dynamic, resilient, and performant.

One of the first challenges we solved was scalability. A matching engine needs to handle unpredictable periods of high market volatility and elevated order activity while honoring a few key constraints. Namely, all order messages must be sequenced and processed in the order they were received, and all trade messages must be returned in the order they were processed. In addition, the system must be able to process risk and exposure checks before trades can be matched. Unique foundational criteria, like user suspension, or events, like market open/close, must be processed before the matching engine can process order messages. Lastly, the system needs to process atomically linked events such as “one cancels other” orders or multi-legged strategy execution. These conditions restrict the use of parallel processing (horizontal scalability) for the core processing components and message dissemination interfaces.
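The ordering constraints above can be illustrated with a minimal sketch of a single-writer sequencer. This is not P8's implementation: the `Sequencer` class, the message kinds, and the outcome strings are hypothetical names chosen for illustration. It shows only why a control event such as a user suspension has to sit in the same ordered stream as the orders it affects.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Sequencer:
    """Hypothetical single-writer sequencer: each message gets the next number."""

    next_seq: int = 1
    suspended_users: Set[str] = field(default_factory=set)
    # Journal entries are (sequence number, kind, user, outcome).
    journal: List[Tuple[int, str, str, str]] = field(default_factory=list)

    def submit(self, kind: str, user: str) -> Tuple[int, str]:
        # Validate first so a bad message never consumes a sequence number.
        if kind not in ("order", "suspend"):
            raise ValueError(f"unknown message kind: {kind}")
        seq = self.next_seq
        self.next_seq += 1
        if kind == "suspend":
            # Control events take effect for every message sequenced after them.
            self.suspended_users.add(user)
            outcome = "applied"
        else:
            outcome = "rejected" if user in self.suspended_users else "accepted"
        self.journal.append((seq, kind, user, outcome))
        return seq, outcome
```

In this sketch, an order from a user is accepted if it is sequenced before that user's suspension and rejected if it is sequenced after it. Because the outcome depends on a single total order, the core path cannot simply be fanned out across parallel workers.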

To solve for scalability, we made extensive use of serverless, event-driven compute services, such as AWS Lambda, to underpin auto scaling functions across our platform. A good example of AWS Lambda in action is how we disseminate market data to our GUI-based traders. Because multicast cannot be used over the internet and each subscriber receives a customized feed (based on their subscription), the obvious choice is point-to-point (P2P) dissemination. We had to ensure that instrument-level events were processed in the order they were received, at a scale where P8 was handling thousands of concurrent users subscribing to multiple instruments and the system could generate numerous market data events per instrument. To achieve this, we used AWS Lambda to underpin an auto scaler architecture that delivers market data to the UIs. We implemented the model in an “orchestrator – worker” pattern that scales to as many orchestrators and workers as needed to sustain a high level of message egress. To keep performance and latency within acceptable limits, we used a combination of AWS Lambda provisioned concurrency and a custom solution to keep Lambda functions hot (AWS did not have a facility to ensure that a Lambda function remains hot). We also had to overcome the challenge of coordinating Lambda functions to ensure event ordering, so we created a distributed locking mechanism using Amazon ElastiCache for Redis to control their synchronization. Our engineering and use of AWS Lambda allows our platform to auto scale with demand increases faster than traditional auto scalers, and enables us to deliver messages to GUIs in under 100 ms.
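As a rough sketch of the per-instrument locking idea described above, the following code shows one common way to build a lock on Redis-style primitives. It is an assumption about the general technique, not P8's actual mechanism. `InMemoryStore` is a hypothetical stand-in that mimics two behaviors: set-if-absent with an expiry (which Redis exposes as `SET key value NX PX ttl`) and delete-only-if-the-value-matches (usually done with a small Lua script). The `lock:` key prefix and all function names are also illustrative.

```python
import time
import uuid
from typing import Dict, Optional, Tuple


class InMemoryStore:
    """Hypothetical stand-in for the two Redis behaviors the lock relies on."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)

    def set_if_absent(self, key: str, value: str, ttl_ms: int, now: float) -> bool:
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False  # another holder owns an unexpired lock
        self._data[key] = (value, now + ttl_ms / 1000.0)
        return True

    def delete_if_equal(self, key: str, value: str, now: float) -> bool:
        current = self._data.get(key)
        if current is None or current[1] <= now or current[0] != value:
            return False  # expired, missing, or owned by someone else
        del self._data[key]
        return True


def acquire_lock(store: InMemoryStore, instrument: str, ttl_ms: int,
                 now: Optional[float] = None) -> Optional[str]:
    """Try to lock one instrument; return an ownership token, or None if busy."""
    now = time.monotonic() if now is None else now
    token = uuid.uuid4().hex  # unique per holder so only the owner can release
    if store.set_if_absent(f"lock:{instrument}", token, ttl_ms, now):
        return token
    return None


def release_lock(store: InMemoryStore, instrument: str, token: str,
                 now: Optional[float] = None) -> bool:
    """Release the lock only if this token still owns it."""
    now = time.monotonic() if now is None else now
    return store.delete_if_equal(f"lock:{instrument}", token, now)
```

A worker would acquire the lock for an instrument before publishing that instrument's next event and release it afterwards, so two workers never emit events for the same instrument concurrently, while different instruments proceed in parallel. The expiry keeps a crashed worker from holding a lock forever.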

[Figure: high-level illustration of the orchestration scheme]

Resiliency and redundancy were also key factors that influenced our design approach. We wanted P8 to achieve an optimal level of fault tolerance while keeping costs low and minimizing architectural and operational complexity. One of the first decisions we needed to make was whether to operate across AWS Regions or across Availability Zones (AZs). P8 had to operate in multiple locations because matching engines need synchronous replication of trades so that exchange operators can smoothly restart trading in another location, intra-day, after a failure event. We studied both paths, along with past cloud failure patterns, and deployed P8’s components in at least two Availability Zones but not across Regions. Given the low probability of an entire AWS Region failing, coupled with the additional complexity and cost of building synchronous replication in the application layer (currently, certain AWS databases don’t provide synchronous cross-Region replication), we determined that a multi-Region deployment was not optimal for us.

In addition, we found that the high-speed replication across Availability Zones built into AWS services provides higher availability (HA) than traditional data center deployments and was more than adequate for our needs. For example, by using Amazon EKS managed node groups, we could run stateful, latency-critical workloads across AZs, and stateless, load-sharing workloads autonomously without AZ selection. This ensured that if a given AZ failed, P8’s components would be automatically restarted in another AZ, and that P8 would achieve sub-second fail-over times for high-specced nodes. We achieved similar fault tolerance at the data layer, where we use Amazon DocumentDB, Amazon DynamoDB, and Amazon Elastic File System (Amazon EFS) to store various system data, including transactions. The multi-AZ resiliency inherent to these services meant that their availability, consistency, and recovery capabilities exceeded our RTO, RPO, and performance requirements. Lastly, for additional redundancy, P8 can also keep a lazily synced copy of the data in another Region if a client wants one.

Security is another key focal point for us and our clients. We used AWS’s built-in encryption at rest and made extensive use of AWS services to secure and monitor the P8 infrastructure. We took advantage of services and features such as AWS WAF, AWS Shield, network ACLs (NACLs), security groups, AWS PrivateLink, NAT gateways, AWS Key Management Service (AWS KMS), and AWS CloudTrail to secure, isolate, and monitor P8’s production environments. For example, we deploy P8’s components in many individual subnets and, by using NACLs and security groups, strictly control ingress and egress to each subnet down to the specific port. This enhances P8’s security by limiting the impact of an intrusion. In addition, all internal traffic to the core subnets is encrypted and routed through AWS PrivateLink, which does point-to-point data encryption and prevents unauthorized intrusions and packet sniffing. Data is also encrypted at rest with customer managed keys managed through AWS Key Management Service, and all external traffic is encrypted at Amazon API Gateway.

To separate duties and control system access, every AWS resource is permissioned through IAM policies, and all AWS Lambda functions are individually permissioned, by role. All resource provisioning is governed by our Infrastructure as Code (IaC) repositories. The IaC code is not accessible to application developers, and any requested changes go through a strict change management process. We use AWS CloudTrail to monitor all changes to the infrastructure and, to further eliminate opportunities for log tampering, we use AWS Landing Zones to separate log and audit accounts.
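To make the idea of individually permissioned Lambda functions concrete, here is a minimal sketch that builds a narrowly scoped IAM policy document for one function's role. The helper name `least_privilege_policy`, the table name, and the account ID and Region in the ARN are all hypothetical placeholders. The source does not describe the actual policies P8 uses, so treat this as one plausible shape rather than the real configuration.

```python
import json
from typing import Dict, List


def least_privilege_policy(actions: List[str], resource_arns: List[str]) -> Dict:
    """Build an IAM policy document that allows only the listed actions/resources."""
    if not actions or not resource_arns:
        raise ValueError("at least one action and one resource ARN are required")
    # Reject wildcards so each function's role stays narrowly scoped.
    if any("*" in item for item in actions + resource_arns):
        raise ValueError("wildcards are not allowed in this sketch")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(actions),
                "Resource": sorted(resource_arns),
            }
        ],
    }


# Hypothetical example: a market data worker that may only read one table.
worker_policy = least_privilege_policy(
    actions=["dynamodb:Query", "dynamodb:GetItem"],
    resource_arns=[
        "arn:aws:dynamodb:eu-west-1:123456789012:table/example-subscriptions"
    ],
)
policy_json = json.dumps(worker_policy, indent=2)
```

Generating such documents from the IaC repository, rather than editing roles by hand, fits the change management process described above, because every permission change then shows up as a reviewable code change.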

Results

By building P8 on AWS, we launched offerings for digital asset exchanges, carbon exchanges, cryptocurrency exchanges, and NFT marketplaces within a 12-month period, while also launching a corporate bond marketplace that operates a distributed order book hosted on a private blockchain. The key differentiator was AWS’s services, which helped us to innovate faster, reducing both cost and time to market. Many of our customers cite these benefits when working with us. For example, Carbonplace chose Yaala Labs to provide the core technology platform for their voluntary carbon market. Yaala Labs’ speed and P8’s quality were key to the decision, leading Scott Eaton, Carbonplace’s CEO, to say, “In such a dynamic sector, time to market is key and Yaala Labs implemented our solution in a matter of months, demonstrating the versatility of their platform and how quickly they can launch new markets”.

In addition, we have achieved all of our performance goals with AWS. The current version of the platform can scale from 100 msg/sec to 50,000 msg/sec with modest cost increments, and we are adding more horizontal scalability capabilities to the product. In resiliency tests, we simulated full AZ failures and the system recovered to full service levels in a few minutes.

As we continue to adjust, tune, and enhance our platform to provide industry-leading flexibility, reliability, performance, time to market, and cost of ownership, we look forward to sharing more updates with the AWS community.

AWS for Industries