Airbnb Uses Amazon EFS to Scale CI/CD Pipeline for Expanding Online Marketplace

Online travel marketplace Airbnb supports hundreds of critical services on its platform, making it essential to maintain a reliable source control infrastructure. The company uses GitHub Enterprise for both source control and management of its continuous integration/continuous delivery (CI/CD) processes. Airbnb has more than 1,000 engineers, who execute more than 100,000 continuous integration jobs on an average working day. GitHub Enterprise provides the engineers with a single source of truth for all code repositories. 

However, the source control infrastructure had become an operational headache because of the system’s scaling issues. In the previous system, each mirror instance pulled changes from GitHub Enterprise, which was challenging to maintain because the mirrors could get out of sync with each other. The system did not scale with Airbnb's increasing Git traffic, and it kept the team from focusing on higher-level problem solving and implementing new features.

Airbnb sought a solution it could use to re-architect the source code infrastructure with a simpler storage layer. The system needed to update in seconds, and read traffic needed to scale.

“We had serious conversations about how to engineer and scale our source control infrastructure. Now, using Amazon EFS and Amazon SQS, we no longer worry about that. We know we can scale to match our growth.”

– Daniel Low, Software Engineer, Airbnb


  • About Airbnb
    • The mission of Airbnb is to create a world where anyone can belong anywhere. The company’s marketplace offers access to over 7 million unique accommodations worldwide, and its Experiences connect travelers to more than 40,000 unique, handcrafted activities.

  • Benefits
    • Uses single file system to sync GitHub repositories
    • Keeps Git mirrors in sync by using a shared file system that allows scaling of CI/CD processes
    • Ensures no repository changes are lost during syncing using event-driven queuing approach
    • Allows engineers to focus on building system features instead of worrying about scaling

Single File System Simplifies Git Mirror Instances

Airbnb turned to Amazon Web Services (AWS) to help achieve its goals. “We worked with AWS for many months to consider different options that could solve the challenges with our source control infrastructure,” says Joel Snyder, software engineer for Airbnb. “They took time to listen to what we required and gave us advice about how we could best integrate with their services.”

While exploring how to solve its challenges, Airbnb realized it could use Amazon Elastic File System (Amazon EFS), a simple, scalable file system for Linux-based workloads that works with AWS Cloud services and on-premises resources. “Through our research, we discovered Amazon EFS, which we could use to share a single file system mounted on all Git mirror instances,” explains Snyder. “When one Git mirror changed, every other Git mirror was guaranteed to have the same update.” In this way, Airbnb used Amazon EFS to back the real-time image of its GitHub repository data. Mirrors were continuously in sync with the production repository at the scale required for the CI automation pipeline.
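
As a rough illustration of the shared-file-system idea, the sketch below shows how a single sync worker might address a bare mirror that lives on the shared mount, so every mirror instance that mounts the same path sees the same refs. The mount point `/mnt/efs/git-mirrors`, the `org/repo.git` directory layout, and the helper names are assumptions made for this example; the article does not describe Airbnb's actual layout or commands.

```python
from pathlib import PurePosixPath

# Hypothetical path where every mirror host mounts the same EFS file system.
EFS_MOUNT = PurePosixPath("/mnt/efs/git-mirrors")


def mirror_path(full_name: str) -> PurePosixPath:
    """Map 'org/repo' to its bare mirror directory on the shared mount."""
    org, repo = full_name.split("/", 1)
    return EFS_MOUNT / org / f"{repo}.git"


def fetch_command(full_name: str) -> list[str]:
    """Build the git command a sync worker could run to refresh that mirror."""
    return ["git", f"--git-dir={mirror_path(full_name)}", "fetch", "--prune", "origin"]


if __name__ == "__main__":
    # Only one worker needs to run this; the other hosts read the same directory.
    print(" ".join(fetch_command("example-org/example-repo")))
```

Because the repository data lives in one place, there is only one copy to update, which is what removes the mirror-to-mirror drift described earlier.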

Queuing Service Helps Sync Repositories

Airbnb then needed a solution that could keep the GitHub Enterprise repository's file system in sync with the mirrored repository's file system. The team chose an event-driven approach, meaning only actual code changes to a repository prompted the repository syncing process. Airbnb turned to Amazon Simple Queue Service (Amazon SQS), a fully managed message queuing service, to avoid malformed or bad-data messages being committed to the production GitHub repository. “We used Amazon SQS as a queuing mechanism to buffer events from the GitHub primary to our syncing service,” says Daniel Low, software engineer for Airbnb.
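
The event-driven flow can be sketched in a few lines. The example below assumes the events arrive as GitHub-style push payloads (with `repository.full_name`, `ref`, and `after` fields) and uses an in-memory `deque` as a stand-in for the Amazon SQS queue; the function names and the message shape are hypothetical and are not taken from Airbnb's syncing service.

```python
import json
from collections import deque
from typing import Optional


def to_sync_message(event_type: str, payload: dict) -> Optional[str]:
    """Return a queue message body for push events, or None for anything else."""
    if event_type != "push":
        return None
    return json.dumps({
        "repository": payload["repository"]["full_name"],
        "ref": payload["ref"],
        "after": payload["after"],
    }, sort_keys=True)


def drain(queue: deque, sync) -> int:
    """Consume buffered messages in arrival order and call sync(repo) for each."""
    handled = 0
    while queue:
        message = json.loads(queue.popleft())
        sync(message["repository"])
        handled += 1
    return handled


if __name__ == "__main__":
    buffer = deque()  # in-memory stand-in for the SQS queue
    events = [
        ("push", {"repository": {"full_name": "example-org/api"},
                  "ref": "refs/heads/main", "after": "a1b2c3"}),
        ("issues", {"action": "opened"}),  # not a code change, so never enqueued
    ]
    for event_type, payload in events:
        body = to_sync_message(event_type, payload)
        if body is not None:
            buffer.append(body)
    synced = []
    drain(buffer, synced.append)
    print(synced)
```

The queue decouples the producer from the consumer: events can accumulate while the syncing service is busy, and only code changes ever reach it.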

Most of the time, repositories sync automatically. However, if a message can’t be handled automatically, Amazon SQS ensures no data is lost by redelivering the message until the repository is successfully synced. Each day, the system processes approximately 10,000 messages about code changes and repository syncing, with a 99.5 percent success rate.

“When a bad message enters our system, we don’t constantly retry,” explains Low. “After five attempts, we move messages that can’t be successfully processed into a dead-letter queue.” Airbnb sets alerts for the dead-letter queue, so it can quickly discover the reason for the failed message and find a solution. “Once the issue is fixed, we replay the contents of the dead-letter queue back to Amazon SQS,” he says. “We use that process to verify that the issue was resolved.”
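
The retry, dead-letter, and replay cycle Low describes can be modeled with a small simulation. This is a sketch under stated assumptions, not Airbnb's implementation: the two `deque` objects stand in for the main queue and the dead-letter queue, and `process`, `replay`, and `redrive_policy` are hypothetical helper names. The one real SQS detail is the `RedrivePolicy` queue attribute, whose `deadLetterTargetArn` and `maxReceiveCount` fields control when SQS moves a message to a dead-letter queue.

```python
import json
from collections import deque

MAX_RECEIVES = 5  # mirrors the "five attempts" quoted in the text


def redrive_policy(dead_letter_queue_arn: str) -> dict:
    """Queue attributes asking SQS to dead-letter a message after MAX_RECEIVES receives."""
    return {"RedrivePolicy": json.dumps({
        "deadLetterTargetArn": dead_letter_queue_arn,
        "maxReceiveCount": MAX_RECEIVES,
    })}


def process(main: deque, dead_letter: deque, handler) -> None:
    """Deliver each message until the handler succeeds or MAX_RECEIVES is reached."""
    while main:
        body, receives = main.popleft()
        receives += 1
        try:
            handler(body)
        except Exception:
            if receives >= MAX_RECEIVES:
                dead_letter.append(body)  # stop retrying and park the message
            else:
                main.append((body, receives))  # redeliver later


def replay(dead_letter: deque, main: deque) -> None:
    """Move parked messages back to the main queue with a fresh receive count."""
    while dead_letter:
        main.append((dead_letter.popleft(), 0))


if __name__ == "__main__":
    fixed = False
    attempts = []

    def handler(body):
        attempts.append(body)
        if body == "bad" and not fixed:
            raise ValueError("cannot sync")

    main, dlq = deque([("good", 0), ("bad", 0)]), deque()
    process(main, dlq, handler)
    print(list(dlq), attempts.count("bad"))
    fixed = True  # the underlying issue has been resolved
    replay(dlq, main)
    process(main, dlq, handler)
    print(list(dlq))
```

With a real queue, the dictionary returned by `redrive_policy` is the kind of value that would be passed as `Attributes` to the SQS `set_queue_attributes` call. An empty dead-letter queue after the replay is the signal that the fix worked.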

Engineers Focus on Building Features

“Before this migration, we had serious conversations about how to engineer and scale our source control infrastructure,” says Low. “Now, using Amazon EFS and Amazon SQS, we no longer worry about that. We know we can scale to match our growth.”

By using managed services such as Amazon EFS and Amazon SQS, Airbnb has significantly reduced the operational load required to maintain its source control infrastructure as its workload scales to tens of thousands of commits per day. “Our team can now focus on building new features and bringing value to our internal customers, the other engineers at Airbnb,” says Snyder. “Reliable source control is a crucial foundation to support Airbnb’s infrastructure and to enable Airbnb’s engineers to be productive.”


Learn More

To learn more, visit aws.amazon.com/efs.