Customizing triggers for AWS CodePipeline with AWS Lambda and Amazon CloudWatch Events

AWS CodePipeline is a fully managed continuous delivery service that helps automate the build, test, and deploy processes of your application. Application owners use CodePipeline to manage releases by configuring “pipeline,” workflow constructs that describe the steps, from source code to deployed application, through which an application progresses as it is released. If you are new to CodePipeline, check out Getting Started with CodePipeline to get familiar with the core concepts and terminology.

Overview

In a default setup, a pipeline is kicked-off whenever a change in the configured pipeline source is detected. CodePipeline currently supports sourcing from AWS CodeCommit, GitHub, Amazon ECR, and Amazon S3. When using CodeCommit, Amazon ECR, or Amazon S3 as the source for a pipeline, CodePipeline uses an Amazon CloudWatch Event to detect changes in the source and immediately kick off a pipeline. When using GitHub as the source for a pipeline, CodePipeline uses a webhook to detect changes in a remote branch and kick off the pipeline. Note that CodePipeline also supports beginning pipeline executions based on periodic checks, although this is not a recommended pattern.

CodePipeline supports adding a number of custom actions and manual approvals to ensure that pipeline functionality is flexible and code releases are deliberate; however, without further customization, pipelines will still be kicked-off for every change in the pipeline source. To customize the logic that controls pipeline executions in the event of a source change, you can introduce a custom CloudWatch Event, which can result in the following benefits:

Multiple pipelines with a single source: Trigger select pipelines when multiple pipelines are listening to a single source. This can be useful if your organization is using monorepos, or is using a single repository to host configuration files for multiple instances of identical stacks.
Avoid reacting to unimportant files: Avoid triggering a pipeline when changing files that do not affect the application functionality (e.g. documentation files, readme files, and .gitignore files).
Conditionally kickoff pipelines based on environmental conditions: Use custom code to evaluate whether a pipeline should be triggered. This allows for further customization beyond polling a source repository or relying on a push event. For example, you could create custom logic to automatically reschedule deployments on holidays to the next available workday.

This post explores and demonstrates how to customize the actions that invoke a pipeline by modifying the default CloudWatch Events configuration that is used for CodeCommit, ECR, or S3 sources. To illustrate this customization, we will walk through two examples: prevent updates to documentation files from triggering a pipeline, and manage execution of multiple pipelines monitoring a single source repository.

The key concepts behind customizing pipeline invocations extend to GitHub sources and webhooks as well; however, creating a custom webhook is outside the scope of this post.

Sample Architecture

This post is only interested in controlling the execution of the pipeline (as opposed to the deploy, test, or approval stages), so it uses simple source and pipeline configurations. The sample architecture considers a simple CodePipeline with only two stages: source and build.

Example CodePipeline Architecture with Custom CloudWatch Event Configuration

The sample CodeCommit repository consists only of buildspec.yml, readme.md, and script.py files.

Normally, after you create a pipeline, it automatically triggers a pipeline execution to release the latest version of your source code. From then on, every time you make a change to your source location, a new pipeline execution is triggered. In addition, you can manually re-run the last revision through a pipeline using the “Release Change” button in the console. This architecture uses a custom CloudWatch Event and AWS Lambda function to avoid commits that change only the readme.md file from initiating an execution of the pipeline.

Creating a custom CloudWatch Event

When we create a CodePipeline that monitors a CodeCommit (or other) source, a default CloudWatch Events rule is created to trigger our pipeline for every change to the CodeCommit repository. This CloudWatch Events rule monitors the CodeCommit repository for changes, and triggers the pipeline for events matching the referenceCreated or referenceUpdated CodeCommit Event (refer to CodeCommit Event Types for more information).

Default CloudWatch Events Rule to Trigger CodePipeline

To introduce custom logic and control the events that kickoff the pipeline, this example configures the default CloudWatch Events rule to detect changes in the source and trigger a Lambda function rather than invoke the pipeline directly. The example uses a CodeCommit source, but the same principle applies to Amazon S3 and Amazon ECR sources as well, as these both use CloudWatch Events rules to notify CodePipeline of changes.

Custom CloudWatch Events Rule to Trigger CodePipeline

When a change is introduced to the CodeCommit repository, the configured Lambda function receives an event from CloudWatch signaling that there has been a source change.

{
   "version":"0",
   "id":"2f9a75be-88f6-6827-729d-34495072e5a1",
   "detail-type":"CodeCommit Repository State Change",
   "source":"aws.codecommit",
   "account":"accountNumber",
   "time":"2019-11-12T04:56:47Z",
   "region":"us-east-1",
   "resources":[
      "arn:aws:codecommit:us-east-1:accountNumber:codepipeline-customization-sandbox-repo"
   ],
   "detail":{
      "callerUserArn":"arn:aws:sts::accountNumber:assumed-role/admin/roleName ",
      "commitId":"92e953e268345c77dd93cec860f7f91f3fd13b44",
      "event":"referenceUpdated",
      "oldCommitId":"5a058542a8dfa0dacf39f3c1e53b88b0f991695e",
      "referenceFullName":"refs/heads/master",
      "referenceName":"master",
      "referenceType":"branch",
      "repositoryId":"658045f1-c468-40c3-93de-5de2c000d84a",
      "repositoryName":"codepipeline-customization-sandbox-repo"
   }
}

The Lambda function is responsible for determining whether a source change necessitates kicking-off the pipeline, which in the example is necessary if the change contains modifications to files other than readme.md. To implement this, the Lambda function uses the commitId and oldCommitId fields provided in the body of the CloudWatch event message to determine which files have changed. If the function determines that a change has occurred to a “non-ignored” file, then the function programmatically executes the pipeline. Note that for S3 sources, it may be necessary to process an entire file zip archive, or to retrieve past versions of an artifact.

import boto3

files_to_ignore = [ "readme.md" ]

codecommit_client = boto3.client('codecommit')
codepipeline_client = boto3.client('codepipeline')

def lambda_handler(event, context):
    # Extract commits
    old_commit_id = event["detail"]["oldCommitId"]
    new_commit_id = event["detail"]["commitId"]
    # Get commit differences
    codecommit_response = codecommit_client.get_differences(
        repositoryName="codepipeline-customization-sandbox-repo",
        beforeCommitSpecifier=str(old_commit_id),
        afterCommitSpecifier=str(new_commit_id)
    )
    # Search commit differences for files to ignore
    for difference in codecommit_response["differences"]:
        file_name = difference["afterBlob"]["path"].lower()
        # If non-ignored file is present, kickoff pipeline
        if file_name not in files_to_ignore:
            codepipeline_response = codepipeline_client.start_pipeline_execution(
                name="codepipeline-customization-sandbox-pipeline"
                )
            # Break to avoid executing the pipeline twice
            break

Multiple pipelines sourcing from a single repository

Architectures that use a single-source repository monitored by multiple pipelines can add custom logic to control the types of events that trigger a specific pipeline to execute. Without customization, any change to the source repository would trigger all pipelines.

Consider the following example:

A CodeCommit repository contains a number of config files (for example, config_1.json and config_2.json).
Multiple pipelines (for example, codepipeline-customization-sandbox-pipeline-1 and codepipeline-customization-sandbox-pipeline-2) source from this CodeCommit repository.
Whenever a config file is updated, a custom CloudWatch Event triggers a Lambda function that is used to determine which config files changed, and therefore which pipelines should be executed.

Example CodePipeline Architecture for Monorepos with Custom CloudWatch Event Configuration

This example follows the same pattern of creating a custom CloudWatch Event and Lambda function shown in the preceding example. However, in this scenario, the Lambda function is responsible for determining which files changed and which pipelines should be kicked off as a result. To execute this logic, the Lambda function uses the config_file_mapping variable to map files to corresponding pipelines. Pipelines are only executed if their designated config file has changed.

Note that the config_file_mapping can be exported to Amazon S3 or Amazon DynamoDB for more complex use cases.

import boto3

# Map config files to pipelines
config_file_mapping = {
        "config_1.json" : "codepipeline-customization-sandbox-pipeline-1",
        "config_2.json" : "codepipeline-customization-sandbox-pipeline-2"
        }
        
codecommit_client = boto3.client('codecommit')
codepipeline_client = boto3.client('codepipeline')

def lambda_handler(event, context):
    # Extract commits
    old_commit_id = event["detail"]["oldCommitId"]
    new_commit_id = event["detail"]["commitId"]
    # Get commit differences
    codecommit_response = codecommit_client.get_differences(
        repositoryName="codepipeline-customization-sandbox-repo",
        beforeCommitSpecifier=str(old_commit_id),
        afterCommitSpecifier=str(new_commit_id)
    )
    # Search commit differences for files that trigger executions
    for difference in codecommit_response["differences"]:
        file_name = difference["afterBlob"]["path"].lower()
        # If file corresponds to pipeline, execute pipeline
        if file_name in config_file_mapping:
            codepipeline_response = codepipeline_client.start_pipeline_execution(
                name=config_file_mapping["file_name"]
                )

Results

For the first example, updates affecting only the readme.md file are completely ignored by the pipeline, while updates affecting other files begin a normal pipeline execution. For the second example, the two pipelines monitor the same source repository; however, codepipeline-customization-sandbox-pipeline-1 is executed only when config_1.json is updated and codepipeline-customization-sandbox-pipeline-2 is executed only when config_2.json is updated.

These CloudWatch Event and Lambda function combinations serve as a good general examples of the introduction of custom logic to pipeline kickoffs, and can be expanded to account for variously complex processing logic.

Cleanup

To avoid additional infrastructure costs from the examples described in this post, be sure to delete all CodeCommit repositories, CodePipeline pipelines, Lambda functions, and CodeBuild projects. When you delete a CodePipeline, the CloudWatch Events rule that was created automatically is deleted, even if the rule has been customized.

Conclusion

For scenarios which need you to define additional custom logic to control the execution of one or multiple pipelines, configuring a CloudWatch Event to trigger a Lambda function allows you to customize the conditions and types of events that can kick-off your pipeline.

AWS DevOps Blog