AWS Partner Network (APN) Blog

BJSS Demonstrates the Viability of Building Low-Latency Trading Platforms on AWS

By Ryan Prins, Principal AWS Platform Consultant – BJSS
By Mohamed Zamzam, Global AWS Partnership Leader – BJSS

Nasdaq turned heads when it announced a collaboration with Amazon Web Services (AWS) to migrate its U.S. options market to cloud-based infrastructure using AWS Outposts. This is a big bet on the feasibility of low-latency cloud-based solutions.

Traditionally, trading systems have been built with mammoth on-premises installations of servers and networking infrastructure. These have relied on physical proximity to achieve minimum transaction latency to provide traders with real-time opportunities and experience.

Financial institutions are bullish on the resiliency, scalability, and accessibility of cloud infrastructure offered by AWS.

BJSS is an AWS Advanced Tier Services Partner and leading software consulting group with extensive experience designing and implementing low-latency trading systems for financial institutions. BJSS has worked with some of the largest banks in the world and built some of the most widely used trading platforms today.

As a technology leader in financial services, BJSS set out to demonstrate the feasibility of hosting a low-latency trading system in the cloud. In 2018, BJSS built a simple foreign exchange (FX) trading system and deployed it to a cloud environment, applying high-performance tuning and robust controls for recording network latency and overall system latency.

Given the recent acceleration of interest from financial institutions in adopting cloud infrastructure, BJSS decided to re-run its performance testing to see how far AWS support for accelerated networking in low-latency trading has progressed.

System Under Test

BJSS designed its system under test according to a few basic principles:

  • Simple architecture: The system models a foreign-exchange immediate-or-cancel (IOC) transaction, isolating the network performance with minimal processing. The system is designed to be brokerless, reducing the number of points along the roundtrip transaction path.
  • Industry standard: The system is written in Java, an industry-standard language for enterprise trading systems, using standard industry messaging patterns including the LMAX Disruptor pattern for publishing and consuming messages.
  • Lowest latency: The system is designed to minimize network and process latency by applying various configurations to optimize networking performance and eliminate context switching and thread contention.
  • Accurate measurement: The system is designed with measurement points that isolate network latency from process latency and measure each accurately, without affecting the performance of the main process thread. In addition, BJSS eliminated clock skew as a source of measurement error by measuring only timestamp differentials that begin and end on the same host.
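
To make the messaging principle above more concrete, here is a minimal sketch of the idea behind the Disruptor pattern: a preallocated ring of slots coordinated by sequence counters rather than locks. This is not the LMAX Disruptor library or its API, and it is not the BJSS implementation; the class name `SpscRingBuffer` and its methods are hypothetical, and the sketch assumes exactly one producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative single-producer, single-consumer ring buffer (hypothetical, not the LMAX API). */
final class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;
    // Next sequence the producer will write, and next sequence the consumer will read.
    private final AtomicLong producerSeq = new AtomicLong(0);
    private final AtomicLong consumerSeq = new AtomicLong(0);

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity]; // preallocated once, reused for every message
        mask = capacity - 1;
    }

    /** Producer side: returns false instead of blocking when the ring is full. */
    boolean offer(T value) {
        long seq = producerSeq.get();
        if (seq - consumerSeq.get() == slots.length) {
            return false;
        }
        slots[(int) (seq & mask)] = value;
        producerSeq.lazySet(seq + 1); // ordered store publishes the slot write
        return true;
    }

    /** Consumer side: returns null when nothing new has been published. */
    @SuppressWarnings("unchecked")
    T poll() {
        long seq = consumerSeq.get();
        if (seq == producerSeq.get()) {
            return null;
        }
        int index = (int) (seq & mask);
        T value = (T) slots[index];
        slots[index] = null;          // release the reference before freeing the slot
        consumerSeq.lazySet(seq + 1); // ordered store hands the slot back to the producer
        return value;
    }
}
```

A consumer pinned to an isolated core would typically call `poll()` in a busy-spin loop rather than block, trading CPU time for the absence of context switches.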

Resulting architecture:

Figure 1 – System architecture under test.

The Order Generator (OG) exists to publish Order messages to the Order Book (OB). This moves the processing time required to create an order off the OB itself, which minimizes the process time measured and further reduces the roundtrip latency measured.

After the OB has consumed the order, it sends the order to a Matching Engine (ME) server, which immediately responds by returning the order with a “Cancel” status. This simulates the most common action on an FX exchange.
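
The roundtrip described above can be timed with a same-host timestamp differential. The following is a minimal sketch under stated assumptions: the class names `FxOrder`, `RoundtripRecorder`, and `RoundtripDemo` are hypothetical, the Matching Engine is replaced by an in-process stand-in rather than a network hop, and latencies are written to a preallocated array so that recording does not allocate on the hot path. It is not the BJSS source code.

```java
/** Hypothetical order message; field names are illustrative. */
final class FxOrder {
    enum Status { NEW, CANCELLED }

    final long id;
    final long sentNanos; // taken on the Order Book host when the order leaves
    Status status = Status.NEW;

    FxOrder(long id, long sentNanos) {
        this.id = id;
        this.sentNanos = sentNanos;
    }
}

/** Stores roundtrip latencies in a preallocated array. */
final class RoundtripRecorder {
    private final long[] latenciesNanos;
    private int count;

    RoundtripRecorder(int capacity) {
        latenciesNanos = new long[capacity];
    }

    /** Called on the Order Book host when the response arrives; both timestamps share one clock. */
    void onResponse(FxOrder order, long receivedNanos) {
        if (count < latenciesNanos.length) {
            latenciesNanos[count++] = receivedNanos - order.sentNanos;
        }
    }

    int count() { return count; }

    long latencyNanos(int i) { return latenciesNanos[i]; }
}

final class RoundtripDemo {
    /** Stand-in for the Matching Engine: immediately returns the order as cancelled. */
    static FxOrder matchingEngine(FxOrder order) {
        order.status = FxOrder.Status.CANCELLED;
        return order;
    }

    public static void main(String[] args) {
        RoundtripRecorder recorder = new RoundtripRecorder(1_000);
        for (long id = 0; id < 1_000; id++) {
            FxOrder order = new FxOrder(id, System.nanoTime());
            FxOrder response = matchingEngine(order); // a network hop in the real system
            recorder.onResponse(response, System.nanoTime());
        }
        System.out.println("recorded " + recorder.count() + " roundtrips");
    }
}
```

Because `System.nanoTime()` is read twice on the same host, the difference does not depend on clocks being synchronized between the Order Book and Matching Engine servers.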

Figure 2 – AWS architecture.

Configuration

BJSS applied specific configuration to support latency goals at three levels:

  • Infrastructure: BJSS chose Amazon Elastic Compute Cloud (Amazon EC2) instance families and sizes to optimize configuration for low-latency networking and rapid processing and ran instances in a Cluster Placement Group to minimize network hops between instances.

    The original 2018 configuration used the m5.12xlarge instance type, while the 2022 configuration used the c5n.9xlarge instance type, optimized for compute and networking. The m5.12xlarge instance type supports bandwidth up to 25 Gbps in a placement group using the Elastic Network Adapter (ENA), while the c5n.9xlarge is rated up to 50 Gbps.

    Notably, the configuration does not use the Elastic Fabric Adapter (EFA), which could result in even higher network performance. The source code used for the initial test in 2018 was written to support a TCP socket interface, and the same code was reused for an equivalent comparison.

                            2018          2022
    Instance type           m5.12xlarge   c5n.9xlarge
    vCPU (physical cores)   48 (24)       36 (18)
    Memory                  192 GiB       96 GiB
    Network bandwidth       25* Gbps      50 Gbps

  • Virtual machine (VM) host: The host itself ran Red Hat Enterprise Linux (RHEL) 7.4, and BJSS applied a low-latency tuning profile, including:
    • Disable hyperthreading
    • Disable unnecessary system processes
    • Run remaining system processes on isolated cores
    • Set ethernet thread affinity
    • Enable busy polling and busy read for ethernet
    • Enable TCP fast open and low latency settings
  • Java runtime:
    • Set CPU taskset to avoid conflict with system processes
    • Set heap memory size and page size
    • Optimize garbage collection settings; BJSS tested both the Z Garbage Collector (ZGC) and Shenandoah to minimize pause time
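
The post does not list the exact JVM options BJSS used. As an illustration only, heap size is commonly fixed with the standard HotSpot flags `-Xms` and `-Xmx`, and the collector is selected with `-XX:+UseZGC` or `-XX:+UseShenandoahGC` (Shenandoah availability depends on the JDK build). A small sketch such as the following, with the hypothetical class name `RuntimeSettingsCheck`, can confirm at startup which settings actually took effect:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical startup check that reports the JVM settings in effect. */
final class RuntimeSettingsCheck {
    /** Names of the garbage collectors the running JVM has registered. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("collectors: " + collectorNames());
        System.out.println("max heap (MiB): " + maxHeapMiB());
        // When the process is started under taskset on Linux, this count
        // is expected to reflect the restricted CPU set.
        System.out.println("available processors: " + Runtime.getRuntime().availableProcessors());
    }
}
```

Logging these values alongside each test run makes it easier to rule out a misapplied flag when two runs differ.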

Test Parameters

In 2018, BJSS conducted tests ranging in duration from five minutes to seven days, finding little difference in median, mean, and 99th percentile latency.

Therefore, for the 2022 update BJSS limited tests to one hour and measured network and process latency at the same points.

Baseline Results

BJSS found significant improvement in latency from the 2018 benchmark at every measurement.

                            2018           2022
Instance type               m5.12xlarge    c5n.9xlarge
Message rate                10,000 msg/s   10,000 msg/s
Latency (min)               82 μs          42 μs
Latency (median)            109 μs         51 μs
Latency (mean)              235 μs         51 μs
Latency (99th percentile)   279 μs         62 μs

In the BJSS team’s experience, a 99th percentile latency of under 500 μs is considered acceptable performance for trading in many asset classes. These results demonstrate that even under the conditions of the 2018 performance testing, cloud-native infrastructure was capable of supporting a stable and performant trading platform.
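
For readers who want to reproduce summary figures like those in the table above from raw samples, here is a minimal sketch. It assumes latencies are collected as whole microseconds and uses the nearest-rank definition of a percentile; the post does not say which percentile method BJSS used, and the class name `LatencyStats` is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper that summarizes a set of latency samples. */
final class LatencyStats {
    private final long[] sorted;

    LatencyStats(long[] samplesMicros) {
        if (samplesMicros.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        sorted = samplesMicros.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
    }

    long min() { return sorted[0]; }

    double mean() {
        double sum = 0;
        for (long sample : sorted) {
            sum += sample;
        }
        return sum / sorted.length;
    }

    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    long percentile(double p) {
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    long median() { return percentile(50); }
}
```

Comparing `mean()` against `median()` is a quick way to spot a heavy tail: in the 2018 column the mean sits well above the median, while in the 2022 column the two coincide.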

Figure 3 – Total system latency (2022 test baseline).

Subsequent advances in both networking and compute optimization, together with improvements in heap management, have further reduced network jitter and unpredictable spikes in processing time. In particular, the 99th percentile latency fell by roughly 78% (from 279 μs to 62 μs), which strengthens the case for migrating faster trading systems to AWS.

24-Hour Results

BJSS performed a 24-hour test at a 1,000 msg/s rate to test the consistency of system performance under varying network conditions.

Figure 4 – 24-hour test (2022) at 1k msg/s.

Despite a slightly higher baseline latency, the performance over a full day was comparable to the performance of the one-hour test.

                              1-hr           24-hr
Instance type                 c5n.9xlarge    c5n.9xlarge
Message rate                  10,000 msg/s   1,000 msg/s
Latency (min)                 42 μs          55 μs
Latency (median)              51 μs          65 μs
Latency (mean)                51 μs          66 μs
Latency (99th percentile)     62 μs          72 μs
Latency (99.9th percentile)   78 μs          81 μs

Noisy Neighbors

BJSS also observed variance in the performance of the system, which the team suspected was related to “noisy neighbors” or other unrelated processes sharing the same physical host.

On several occasions, BJSS experienced suboptimal performance for an individual test, which was resolved by stopping (deallocating) and starting the virtual instances hosting the system under test.

Figure 5 – Total system latency (noisy neighbor).

The affected tests showed both a decrease in baseline system performance and intermittent performance penalties of 25% or higher.

                            2022 baseline   2022 noisy neighbor
Instance type               c5n.9xlarge     c5n.9xlarge
Latency (min)               42 μs           240 μs
Latency (median)            51 μs           249 μs
Latency (mean)              51 μs           252 μs
Latency (99th percentile)   62 μs           297 μs
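
One way to act on this observation is to compare a run's median against a known-good baseline and recycle the instances when the slowdown crosses a threshold. The sketch below is an assumption about how such a check might look, not a description of BJSS tooling: the class name `NoisyNeighborCheck`, its methods, and the choice of 25% as the threshold are all illustrative.

```java
/** Hypothetical check for deciding whether to stop and start the instances under test. */
final class NoisyNeighborCheck {
    /** Fractional slowdown of an observed median relative to the baseline median. */
    static double penalty(double baselineMedianMicros, double observedMedianMicros) {
        if (baselineMedianMicros <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (observedMedianMicros - baselineMedianMicros) / baselineMedianMicros;
    }

    /** True when the slowdown meets or exceeds the threshold (for example, 0.25 for 25%). */
    static boolean shouldRecycle(double baselineMedianMicros, double observedMedianMicros,
                                 double threshold) {
        return penalty(baselineMedianMicros, observedMedianMicros) >= threshold;
    }

    public static void main(String[] args) {
        // Median latencies from the table above: 51 μs baseline, 249 μs noisy neighbor.
        double slowdown = penalty(51, 249);
        System.out.printf("penalty: %.0f%%%n", slowdown * 100);
        System.out.println("recycle: " + shouldRecycle(51, 249, 0.25));
    }
}
```

Running a short calibration test after each instance start and applying a check like this before the main test would catch a degraded placement early.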

Conclusion

As BJSS has demonstrated in this post, hosting a trading platform for a major market in the cloud has long been technically feasible, and the last four years have only solidified the argument for migration of trading systems to cloud infrastructure.

The study BJSS performed should be used as confirmation of cloud capabilities by those weighing the cost and opportunity of a transition. Early adopters of cloud services have benefited from the advantages of improved resiliency, scalability, security, and flexibility. Those who choose today to build solutions in the cloud stand to achieve competitive performance as well.

As an experienced designer of low-latency trading platforms, BJSS offers digital architecture as well as roadmap and delivery process advisory to banking clients. BJSS can help accelerate the adoption of cloud-native solutions and help organizations take advantage of the flexibility, scalability, and resiliency of cloud-hosted infrastructure.

BJSS supports clients through every phase of their journey into running trading systems in the cloud. From discovery to implementation and delivery of trading solutions, BJSS aims to deliver a resilient trading solution that can scale elastically to meet market demands.

Learn more about BJSS offerings available in AWS Marketplace:

The content and opinions in this blog are those of the third-party author and AWS is not responsible for the content or accuracy of this post.


BJSS – AWS Partner Spotlight

BJSS is an AWS Advanced Tier Services Partner and leading software consulting group with extensive experience designing and implementing low-latency trading systems for financial institutions.