Learn how LambdaTest transformed its global hybrid network using AWS

LambdaTest, a leading omnichannel test orchestration and execution cloud platform, was looking to scale their multi-Region and hybrid networks. LambdaTest’s existing hybrid global network used AWS Site-to-Site VPN to connect their locations and Amazon Virtual Private Clouds (VPCs) across multiple AWS Regions. LambdaTest is growing rapidly: it helps over 2 million developers run over 500 million tests and supports over 10,000 customers across more than 130 countries. To manage this growth, LambdaTest needed a long-term solution that could support its expanding global footprint of more than ten AWS Regions and more than ten locations.

As with most enterprises, there are several reasons for operating in multiple AWS Regions:

● Reducing Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) as part of a multi-Region disaster recovery (DR) plan;

● Expanding to a global audience by creating a better end-user experience through higher network performance and lower latency; and

● Routing traffic over the AWS Global Network by connecting to the nearest AWS Region or AWS point of presence, which could be AWS Global Accelerator, AWS Direct Connect, or Amazon CloudFront.

Here’s how they did it.

Who is LambdaTest?

The LambdaTest platform provides secure, scalable, and insightful test orchestration for customers at different points in their DevOps (CI/CD) lifecycle. Browser & App Testing allows users to run both manual and automated tests of web and mobile apps across 3,000+ different browsers, real devices, and operating system environments. HyperExecute, an AI-powered test orchestration cloud, helps customers run and orchestrate test grids in the cloud for any framework and programming language. Its speed cuts down on test time, helping developers build software faster.

LambdaTest is a customer-obsessed organization, and this customer obsession drives continuous innovation. This culture of innovation led to an initiative to ensure optimal response times for their global user base.

Global WAN architecture

A few options were considered for creating the global WAN connectivity on AWS and extending it to LambdaTest locations using Site-to-Site VPN. Here’s some quick need-to-know information.

Amazon Virtual Private Cloud (Amazon VPC) allows you to launch resources in private networks in different AWS Regions and provides private routing between those AWS Regions across multiple accounts with VPC peering. These resources can communicate using private IP addresses and do not require an Internet Gateway (IGW), VPN, or separate network appliances.

VPC peering works well for networks with a small number of VPCs and few peering connections. However, transitive routing (sending traffic from VPC A to VPC B and then to VPC C) is not allowed. As the number of peered VPCs increases, the mesh of peered connections can become difficult to manage and troubleshoot.
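To make the scaling concern concrete, here is a minimal Python sketch that compares the number of connections a full mesh of peered VPCs needs with the number of attachments in a hub-and-spoke layout. The function names are illustrative and not part of any AWS SDK. The mesh count follows from the fact that, without transitive routing, every pair of VPCs needs its own peering connection.

```python
def full_mesh_connections(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC.

    VPC peering is not transitive, so each pair of VPCs needs its own
    connection: n * (n - 1) / 2.
    """
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count * (vpc_count - 1) // 2


def hub_and_spoke_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC attaches once to a central hub."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count


if __name__ == "__main__":
    for n in (3, 10, 25):
        print(n, full_mesh_connections(n), hub_and_spoke_attachments(n))
```

For 10 VPCs the mesh needs 45 peering connections against 10 hub attachments. For 25 VPCs it is 300 against 25.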

AWS Transit Gateway reduces these difficulties by creating a cloud-scale virtual router that connects multiple VPCs and on-premises networks. Transit Gateway’s routing capabilities can expand to additional AWS Regions with Transit Gateway inter-Region peering to create a globally distributed, private network for your resources.

LambdaTest implemented AWS Transit Gateway for multi-Region connectivity to achieve these key objectives:

  1. Avoid a mesh topology for connecting multiple AWS Regions and on-premises locations.
  2. Use the AWS global network for consistent and optimized network performance.
  3. Reduce onboarding time for new customers.
  4. Create scalable and reliable network connectivity across multiple AWS Regions.
  5. Reduce the number of Site-to-Site VPNs between AWS and appliances present in LambdaTest locations to optimize costs.

Migrating from mesh to hub and spoke topology with AWS Transit Gateway

LambdaTest serves its global customer base using co-location facilities with servers and a modernized control plane on AWS, which consists of:

  1. Core Regions: AWS Regions that run all services and are nearest to the on-premises locations.
  2. Supporting Regions: AWS Regions that have some microservices but don’t have direct connections to on-premises locations. They are there to balance the workload in Core Regions.
  3. Proxy Regions: These are the nearest touchpoints for end customers on the LambdaTest cloud on AWS. The AWS Global Network is used to pass customer requests to Core/Supporting Regions, while an immediate response is sent from the Proxy Region.

LambdaTest’s core application requires connectivity across all AWS Regions, including the Core and Supporting Regions, as well as to their on-premises locations. LambdaTest was using VPC peering for multi-Region connectivity, and connectivity to on-premises locations was provided using IPsec Site-to-Site VPN from each VPC through a virtual private gateway (VGW).

With their rapid business growth, LambdaTest saw not only enterprise customer adoption of their services in AWS but also a similar growth pattern for the on-premises side of their application. Because they run a separate hardware platform for each customer at various locations, they ended up creating close to 50 IPsec Site-to-Site VPNs from each AWS Region to their on-premises locations.

This exhausted the limit on IPsec VPN connections per virtual private gateway, and they started observing other operational and performance challenges, such as constraints on aggregate bandwidth for connectivity to on-premises networks. Creating new VPNs for customers in all AWS Regions was also increasing the cost of running their platform. This architecture is shown in the following diagram (Figure 1).

Extending connectivity from Co-location/Data Center to AWS on VGW

Figure 1: VPNs from each co-location terminating to individual VGWs for each AWS Region

To solve these challenges, LambdaTest used Transit Gateway to connect their VPCs within each AWS Region, to provide inter-Region connectivity, and to connect to on-premises networks. Before the migration, there were VPNs from each location to all Supporting and Core Regions. After the migration, VPNs were only required for connections to the Transit Gateways in each Core Region. Connectivity to Supporting and Proxy Regions uses inter-Region Transit Gateway peering. For high availability, they extended VPNs from each on-premises location to multiple Core Regions, so that an issue in one Core Region does not affect application connectivity from on-premises locations. This architecture is shown in the following diagram (Figure 2).

Using the AWS Global Network to optimize network performance and security

Extending connectivity from Co-location/Data Center to AWS on TGW

Figure 2: VPNs from each co-location terminating to only the Core Region’s Transit Gateway
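As a rough way to see the effect of the change between Figure 1 and Figure 2, the following Python sketch counts Site-to-Site VPN connections under both designs. It assumes the simplified model described in the text: before the migration every location had a VPN to every Core and Supporting Region, and afterwards each location only needs VPNs to the Core Regions it connects to. The function names and the input numbers are illustrative and are not LambdaTest’s actual figures.

```python
def vpn_count_before(locations: int, core_regions: int, supporting_regions: int) -> int:
    """Every location terminates a VPN in every Core and Supporting Region."""
    return locations * (core_regions + supporting_regions)


def vpn_count_after(locations: int, core_regions_per_location: int) -> int:
    """Each location terminates VPNs only on Core Region Transit Gateways.

    core_regions_per_location is greater than 1 when a location connects to
    several Core Regions for high availability.
    """
    return locations * core_regions_per_location


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the arithmetic.
    before = vpn_count_before(locations=10, core_regions=2, supporting_regions=8)
    after = vpn_count_after(locations=10, core_regions_per_location=2)
    print(before, after)
```

With these made-up inputs the count drops from 100 VPNs to 20, while reachability to Supporting Regions is provided by Transit Gateway peering instead of additional VPNs.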

The AWS Global Cloud Infrastructure is the most secure, extensive, and reliable cloud network, offering over 200 fully featured services from data centers globally. Whether you need to deploy your application workloads across the globe in a single click, or you want to build and deploy specific applications closer to your end-users with single-digit millisecond latency, AWS provides the cloud infrastructure where and when you need it.

LambdaTest chose to build on the AWS network backbone because of its high speed and lower latency compared to the internet. Using the AWS Regions that are closest to their end users, they send data over the AWS Global Network backbone and route the traffic to their Core Regions. This increased application performance and delivered a better user experience.

Simpler and faster onboarding of new AWS Regions and on-premises co-locations

LambdaTest has now expanded its workloads to additional AWS Regions. By using Transit Gateway inter-Region peering, they can extend their Core or Supporting Regions and establish connectivity between AWS Regions in minutes. LambdaTest no longer has to create new IPsec Site-to-Site VPNs with every Supporting Region expansion. All they need to do is create a Transit Gateway peering connection for inter-Region connectivity. They automate this process using Terraform scripts, which makes the team more agile. This architecture is shown in the following diagram (Figure 3).

On-boarding of new co-location and extending connectivity from Co-location/Data Center to AWS on TGW

Figure 3: Expansion of Core Regions, Supporting Regions, and co-locations
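LambdaTest automates the peering step with Terraform, and those scripts are not shown in this post. As a stand-in, the following Python sketch shows what the same step looks like against the EC2 API, using the operation names exposed by a boto3 EC2 client. The helper names `request_tgw_peering` and `accept_tgw_peering` are hypothetical, and the client is passed in as an argument so that the sketch does not assume any particular account, Region, or credentials.

```python
from typing import Any


def request_tgw_peering(
    ec2: Any,
    transit_gateway_id: str,
    peer_transit_gateway_id: str,
    peer_account_id: str,
    peer_region: str,
) -> str:
    """Request a peering attachment from a local Transit Gateway to a peer.

    `ec2` is expected to behave like a boto3 EC2 client created in the
    requester's Region. Returns the ID of the new peering attachment.
    """
    response = ec2.create_transit_gateway_peering_attachment(
        TransitGatewayId=transit_gateway_id,
        PeerTransitGatewayId=peer_transit_gateway_id,
        PeerAccountId=peer_account_id,
        PeerRegion=peer_region,
    )
    return response["TransitGatewayPeeringAttachment"]["TransitGatewayAttachmentId"]


def accept_tgw_peering(peer_ec2: Any, attachment_id: str) -> None:
    """Accept a pending peering attachment with a client in the peer Region."""
    peer_ec2.accept_transit_gateway_peering_attachment(
        TransitGatewayAttachmentId=attachment_id
    )
```

With boto3 installed, `ec2` could be created with `boto3.client("ec2", region_name=...)` for the requesting Region and `peer_ec2` the same way for the peer Region. Traffic flows only after routes that point at the peering attachment are added to the Transit Gateway route tables on both sides, which this sketch leaves out.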

Similarly, expanding to new on-premises locations and establishing connectivity to Core and Supporting Regions is easier with this new architecture. Whenever Core Region connectivity needs to be extended to a new on-premises location, it is done using an IPsec Site-to-Site VPN. The same VPN connection also provides connectivity to Supporting Regions. There is no need to create additional VPNs for Supporting Regions, because all Core Regions are connected to the Supporting Regions through the Transit Gateway peering network. This reduces the number of VPNs and the overall cost of the solution. LambdaTest also saves time by reducing the number of VPN configurations needed to establish connectivity with multiple AWS Regions.
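The reachability argument above can be sketched in a few lines of Python. This is a simplified model, not a description of LambdaTest’s routing configuration: a location reaches a Core Region if it has a VPN to that Region’s Transit Gateway, and it reaches any Region that is directly peered with that Core Region. The names `reachable_regions`, `colo-1`, `core-a`, `core-b`, and `supporting-x` are placeholders.

```python
from __future__ import annotations


def reachable_regions(
    location: str,
    vpns: dict[str, set[str]],
    peerings: set[tuple[str, str]],
    failed: frozenset[str] = frozenset(),
) -> set[str]:
    """Regions a location can reach through its VPNs plus one peering hop.

    vpns maps each location to the Core Regions where its VPNs terminate.
    peerings lists pairs of Regions whose Transit Gateways are peered.
    failed lists Regions that are currently unavailable.
    """
    reachable: set[str] = set()
    for core in vpns.get(location, set()):
        if core in failed:
            continue  # a VPN into an unavailable Core Region is unusable
        reachable.add(core)
        for a, b in peerings:
            if core in (a, b):
                other = b if a == core else a
                if other not in failed:
                    reachable.add(other)
    return reachable


if __name__ == "__main__":
    vpns = {"colo-1": {"core-a", "core-b"}}
    peerings = {
        ("core-a", "core-b"),
        ("core-a", "supporting-x"),
        ("core-b", "supporting-x"),
    }
    print(sorted(reachable_regions("colo-1", vpns, peerings)))
    print(sorted(reachable_regions("colo-1", vpns, peerings, frozenset({"core-a"}))))
```

In this toy topology the location reaches the Supporting Region without a dedicated VPN, and it still reaches it through `core-b` when `core-a` is marked as failed, which is the reason for terminating VPNs in more than one Core Region.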

Conclusion

Using Transit Gateway, LambdaTest has established connectivity to multiple AWS Regions and their customers’ on-premises locations at scale. Their data is automatically encrypted as it travels between AWS Regions and never traverses the public internet when using Transit Gateway peering. Limits that were previously a challenge with VPNs and VPC peering have been removed, and Transit Gateway scales as the number of network connections grows. They have reduced the need for multiple VPNs to Supporting Regions, which saves 25% of their effort during deployment.

The content and opinions in this post include those of the third-party author and AWS is not responsible for the content or accuracy of this post.

About the authors

Shahid Ali Khan

Shahid Ali Khan is a Lead Technical Staff member at LambdaTest, where he has driven DevOps excellence since 2017 to enable seamless software releases. A passionate technologist, he leads transformative projects, most recently spearheading DevOps for LambdaTest’s HyperExecute intelligent test orchestration platform. An innovative leader, he applies his dedication beyond work as an avid traveler, vocalist, and guitarist.

Vibhu Pareek

Vibhu Pareek is a Solutions Architect at Amazon Web Services. Since joining AWS in 2016, he has specialized in guiding customers on adopting cloud through implementing well-architected, repeatable patterns and solutions that drive innovation. Vibhu has a strong interest in open source databases including PostgreSQL. Outside of work, he enjoys football.

Avanish Yadav

Avanish Yadav is a Senior Networking Solutions Architect at Amazon Web Services. With a passion for networking technologies, he enjoys innovating and helping customers solve complex technical challenges by creating secure, scalable cloud architectures. When he’s not collaborating with clients to provide expert solutions to their needs, he can often be found playing cricket.