AWS Startups Blog

How Mapbox Created a Flexible Artifact Building System Using AWS CodeBuild

Guest post by Devin Boyer, Software Engineer, Mapbox

Mapbox is a live location platform that enables developers to integrate maps, turn-by-turn navigation, and state-of-the-art features like augmented reality driver assistance into their own applications. We process terabytes of data daily to keep our maps up to date and run high-performance API services in AWS regions around the world to ensure developers and end users always get the best experience.

To run these data processing pipelines and API backends, Mapbox developers need to be able to quickly and reliably deploy their applications. The deployable artifacts include Docker images for APIs, AWS Lambda archives (or “bundles”) for serverless data processing, and packaged applications for Big Data analysis using Apache Spark. Over time, teams at Mapbox developed specific build tools for each type of artifact. Each of these tools worked slightly differently, provided limited configuration options, and was challenging to debug when artifact creation failed.

Last year, the Mapbox Platform team decided that, to better serve developers at the company, we should combine all of our existing artifact-creation tools into a single, unified system. This system, which we named artifacts, provides an easy-to-use, extensible platform for creating any type of build artifact. We chose AWS CodeBuild to power artifacts because of its deep integration with the rest of the AWS infrastructure, its easy-to-use APIs, and its ability to customize the build environment on the fly.

The high-level architecture has four parts: a GitHub application that makes requests to a webhook to trigger builds, a Lambda-backed Amazon API Gateway that receives and processes the webhooks, the various CodeBuild projects, and a polling system that sends status checks back to GitHub when builds complete or fail.
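The webhook-to-build step of this flow can be sketched roughly as follows. This is a minimal illustration, not the artifacts source code: the function names, the `artifacts-<type>` project naming scheme, the `GIT_REF` variable, and the assumption that API Gateway passes the raw body and the `X-Hub-Signature-256` header through unchanged are all hypothetical. The only external call is `start_build` on a boto3 CodeBuild client, which is passed in so the logic can be exercised with a stub.

```python
import hashlib
import hmac
import json


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Check GitHub's X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes, so odd header contents cannot raise.
    return hmac.compare_digest(expected.encode("utf-8"), (header or "").encode("utf-8"))


def build_request(push_event: dict, artifact_type: str) -> dict:
    """Translate a GitHub push event into keyword arguments for CodeBuild StartBuild."""
    return {
        "projectName": f"artifacts-{artifact_type}",  # hypothetical naming scheme
        "sourceVersion": push_event["after"],  # SHA of the pushed commit
        "environmentVariablesOverride": [
            {"name": "GIT_REF", "value": push_event["ref"], "type": "PLAINTEXT"},
        ],
    }


def handle_webhook(event: dict, secret: bytes, codebuild, artifact_type: str = "docker-image") -> dict:
    """Router logic: verify the webhook, then start one build.

    `codebuild` is expected to behave like boto3.client("codebuild").
    """
    body = event["body"].encode("utf-8")
    signature = event["headers"].get("X-Hub-Signature-256", "")
    if not signature_is_valid(secret, body, signature):
        return {"statusCode": 401, "body": "invalid signature"}
    response = codebuild.start_build(**build_request(json.loads(body), artifact_type))
    return {"statusCode": 202, "body": response["build"]["id"]}
```

In a deployed function, the client would come from `boto3.client("codebuild")` and the secret from configuration; the polling side that reports status back to GitHub is left out of this sketch.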

Building Artifacts Using AWS CodeBuild

This architecture makes artifacts a cost-effective solution. Because we use serverless platforms like CodeBuild and Lambda with API Gateway, we don’t need services that run constantly and consume compute resources while waiting for build requests. For a service whose usage is irregular (on weekends, for instance), we pay only for what we use.

We deploy the artifacts project to our AWS account using an AWS CloudFormation template that launches several nested stacks. Each nested stack creates the resources necessary to build a particular artifact type, while the primary stack launches resources common to the overall project, like the router Lambda function. We make heavy use of our open-source project cloudfriend to create these nested templates. Our router API Gateway system is also created using a cloudfriend feature called “shortcuts”, which generates a CloudFormation template snippet containing the various resources required to launch a Lambda-backed API Gateway.
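To make the primary-stack and nested-stack layout concrete, here is a rough sketch that assembles such a template as a plain Python dictionary. It does not use cloudfriend (a JavaScript library) or reproduce its API; the artifact type names, parameter names, template URL layout, and runtime are assumptions chosen for illustration. Only the CloudFormation resource types `AWS::CloudFormation::Stack` and `AWS::Lambda::Function` and their property names are real.

```python
import json


def logical_id(artifact_type: str) -> str:
    """Turn a name such as 'docker-image' into a logical ID such as 'DockerImageStack'."""
    return "".join(part.capitalize() for part in artifact_type.split("-")) + "Stack"


def primary_template(template_base_url: str, artifact_types: list) -> dict:
    """Build a primary template: shared resources plus one nested stack per artifact type."""
    resources = {
        # Shared resource owned by the primary stack (all property values are hypothetical).
        "RouterFunction": {
            "Type": "AWS::Lambda::Function",
            "Properties": {
                "Handler": "index.handler",
                "Runtime": "python3.12",
                "Role": {"Ref": "RouterRoleArn"},
                "Code": {"S3Bucket": {"Ref": "CodeBucket"}, "S3Key": "router.zip"},
            },
        }
    }
    for artifact_type in artifact_types:
        # One nested stack per artifact type, each pointing at its own template file.
        resources[logical_id(artifact_type)] = {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {
                "TemplateURL": f"{template_base_url}/{artifact_type}.template.json",
                "Parameters": {
                    "RouterFunctionArn": {"Fn::GetAtt": ["RouterFunction", "Arn"]},
                },
            },
        }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "RouterRoleArn": {"Type": "String"},
            "CodeBucket": {"Type": "String"},
        },
        "Resources": resources,
    }


if __name__ == "__main__":
    template = primary_template("https://example-bucket.s3.amazonaws.com/templates", ["docker-image", "ami"])
    print(json.dumps(template, indent=2))
```

Generating the nested stack entries in a loop means a new artifact type needs only one more name in the list and one more child template, which matches the goal of keeping each artifact type self-contained.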

Using Artifacts

A key part of the artifacts platform is the per-repo configuration file. As with many other Continuous Integration systems, including CodeBuild itself, developers define which artifacts should be built, and how, by checking an .artifacts.yml file into a GitHub repository. Why build this level of abstraction when CodeBuild already supports customizable buildspec.yml files? By using our own custom config files, we can define sensible defaults and ensure artifacts are created in a uniform manner across Mapbox repositories. For a simple application, a Docker image can be created and pushed to Amazon Elastic Container Registry (ECR) with a configuration file as short as this:

version: v1
defaults:
- docker-image

We automatically install the artifacts GitHub Application on every repository created in our GitHub organization, which means creating deployable build artifacts in a new repository can be as simple as checking a three-line .artifacts.yml file into GitHub and pushing a commit.
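One way to picture how a three-line file can be enough is to imagine the platform expanding each named default into a full build definition. The sketch below assumes the file has already been parsed into a dictionary; the table of defaults, its keys (`dockerfile`, `push_to`, `runtime`), and the `lambda-bundle` name are hypothetical and stand in for whatever the real system stores.

```python
from copy import deepcopy

# Hypothetical platform-wide defaults, keyed by artifact type.
KNOWN_DEFAULTS = {
    "docker-image": {"dockerfile": "Dockerfile", "push_to": "ecr"},
    "lambda-bundle": {"runtime": "nodejs8.10"},
}


def expand_defaults(config: dict) -> list:
    """Expand the `defaults` list of a parsed .artifacts.yml into full build definitions."""
    if config.get("version") != "v1":
        raise ValueError(f"unsupported config version: {config.get('version')!r}")
    builds = []
    for name in config.get("defaults", []):
        if name not in KNOWN_DEFAULTS:
            raise ValueError(f"unknown artifact type: {name!r}")
        # Copy so callers can modify a build without touching the shared table.
        builds.append({"type": name, **deepcopy(KNOWN_DEFAULTS[name])})
    return builds


if __name__ == "__main__":
    # Equivalent to the three-line configuration file shown above.
    print(expand_defaults({"version": "v1", "defaults": ["docker-image"]}))
```

Keeping the defaults on the platform side is what lets every repository get the same behavior without repeating it in each configuration file.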

If the default configurations are not suitable for their project, developers can specify a variety of overrides for building artifacts. They can customize details about how the artifacts are built, such as targeting a different Lambda runtime like Python 3.6 instead of Node 8, or, for more complex projects, specifying multiple Dockerfiles that should be built simultaneously. Different build configurations can be applied based on various repository conditions as well. In addition to the default of creating artifacts for every commit, artifacts currently supports building only when a certain branch is pushed to or when a commit message matches a certain pattern.
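The three triggering modes described above (every commit, a specific branch, or a commit message pattern) could be evaluated along these lines. This is only a sketch: the rule keys `branch` and `commit_message_pattern` are invented for the example and are not the real .artifacts.yml schema.

```python
import re


def should_build(rule: dict, ref: str, commit_message: str) -> bool:
    """Decide whether a build rule applies to a pushed commit.

    A rule with no conditions matches every commit, mirroring the default behavior.
    """
    branch = rule.get("branch")
    if branch is not None and ref != f"refs/heads/{branch}":
        return False
    pattern = rule.get("commit_message_pattern")
    if pattern is not None and re.search(pattern, commit_message) is None:
        return False
    return True


if __name__ == "__main__":
    rules = [
        {"type": "docker-image"},  # built on every commit
        {"type": "ami", "branch": "main"},  # built only for pushes to main
        {"type": "lambda-bundle", "commit_message_pattern": r"\[build lambda\]"},
    ]
    ref, message = "refs/heads/feature-x", "Fix typo [build lambda]"
    print([rule["type"] for rule in rules if should_build(rule, ref, message)])
```

When both conditions are present on one rule, this sketch requires both to hold; a real system might combine them differently.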

We’ve built functionality into artifacts to perform some build operations that go beyond basic archiving of application code. An AMI artifact type uses HashiCorp’s Packer tool to create customized AMIs. Artifacts can also be sent to our AWS China region by including a single additional option in the configuration file. When this option is enabled, completed build artifacts are copied to China over an AWS Direct Connect network connection. This means our developers can push a commit and deploy their change anywhere in the world within minutes.

Conclusion

Artifacts sees widespread usage at Mapbox, with about 6,000 builds run each week. The architecture makes it very easy for other teams to extend artifacts to fit their own custom needs. Over 20 Mapbox engineers have contributed code to our artifacts project, adding new features and entirely new artifact types along the way.

We plan to continue to develop our artifacts platform to provide even more functionality in the future. Features on the artifacts roadmap include supporting mobile SDK builds and adding automated security and best practice checks into the build process. We’re confident that AWS CodeBuild will provide a consistent but customizable framework to continue to build on top of.

Interested in high-performance infrastructure projects? Join Us.

Do you enjoy the challenge of building tools that will be used by engineers across a growing company? Do you enjoy using a variety of AWS services, from ECS and Lambda to Spot and SageMaker, and sharing this knowledge with others? Mapbox is currently hiring platform engineers. Consider joining us!