Building Windows containers with AWS CodePipeline and custom actions

Dmitry Kolomiets, DevOps Consultant, Professional Services

AWS CodePipeline and AWS CodeBuild are the primary AWS services for building CI/CD pipelines. AWS CodeBuild supports a wide range of build scenarios thanks to various built-in Docker images. It also allows you to bring in your own custom image in order to use different tools and environment configurations. However, there are some limitations in using custom images.

Considerations for custom Docker images:

AWS CodeBuild has to download a new copy of the Docker image for each build job, which may take longer time for large Docker images.
AWS CodeBuild provides a limited set of instance types to run the builds. You might have to use a custom image if the build job requires higher memory, CPU, graphical subsystems, or any other functionality that is not part of the out-of-the-box provided Docker image.

Windows-specific limitations

AWS CodeBuild supports Windows builds only in a limited number of AWS regions at this time.
AWS CodeBuild executes Windows Server containers using Windows Server 2016 hosts, which means that build containers are huge—it is not uncommon to have an image size of 15 GB or more (with .NET Framework SDK installed). Windows Server 2019 containers, which are almost half as small, cannot be used due to host-container mismatch.
AWS CodeBuild runs build jobs inside Docker containers. You should enable privileged mode in order to build and publish Linux Docker images as part of your build job. However, DIND is not supported on Windows and, therefore, AWS CodeBuild cannot be used to build Windows Server container images.

The last point is the critical one for microservice type of applications based on Microsoft stacks (.NET Framework, Web API, IIS). The usual workflow for this kind of applications is to build a Docker image, push it to ECR and update ECS / EKS cluster deployment.

Here is what I cover in this post:

How to address the limitations stated above by implementing AWS CodePipeline custom actions (applicable for both Linux and Windows environments).
How to use the created custom action to define a CI/CD pipeline for Windows Server containers.

CodePipeline custom actions

By using Amazon EC2 instances, you can address the limitations with Windows Server containers and enable Windows build jobs in the regions where AWS CodeBuild does not provide native Windows build environments. To accommodate the specific needs of a build job, you can pick one of the many Amazon EC2 instance types available.

The downside of this approach is additional management burden—neither AWS CodeBuild nor AWS CodePipeline support Amazon EC2 instances directly. There are ways to set up a Jenkins build cluster on AWS and integrate it with CodeBuild and CodeDeploy, but these options are too “heavy” for the simple task of building a Docker image.

There is a different way to tackle this problem: AWS CodePipeline provides APIs that allow you to extend a build action though custom actions. This example demonstrates how to add a custom action to offload a build job to an Amazon EC2 instance.

Here is the generic sequence of steps that the custom action performs:

Acquire EC2 instance (see the Notes on Amazon EC2 build instances section).
Download AWS CodePipeline artifacts from Amazon S3.
Execute the build command and capture any errors.
Upload output artifacts to be consumed by subsequent AWS CodePipeline actions.
Update the status of the action in AWS CodePipeline.
Release the Amazon EC2 instance.

Notice that most of these steps are the same regardless of the actual build job being executed. However, the following parameters will differ between CI/CD pipelines and, therefore, have to be configurable:

Instance type (t2.micro, t3.2xlarge, etc.)
AMI (builds could have different prerequisites in terms of OS configuration, software installed, Docker images downloaded, etc.)
Build command line(s) to execute (MSBuild script, bash, Docker, etc.)
Build job timeout

Serverless custom action architecture

CodePipeline custom build action can be implemented as an agent component installed on an Amazon EC2 instance. The agent polls CodePipeline for build jobs and executes them on the Amazon EC2 instance. There is an example of such an agent on GitHub, but this approach requires installation and configuration of the agent on all Amazon EC2 instances that carry out the build jobs.

Instead, I want to introduce an architecture that enables any Amazon EC2 instance to be a build agent without additional software and configuration required. The architecture diagram looks as follows:

Serverless custom action architecture

There are multiple components involved:

An Amazon CloudWatch Event triggers an AWS Lambda function when a custom CodePipeline action is to be executed.
The Lambda function retrieves the action’s build properties (AMI, instance type, etc.) from CodePipeline, along with location of the input artifacts in the Amazon S3 bucket.
The Lambda function starts a Step Functions state machine that carries out the build job execution, passing all the gathered information as input payload.
The Step Functions flow acquires an Amazon EC2 instance according to the provided properties, waits until the instance is up and running, and starts an AWS Systems Manager command. The Step Functions flow is also responsible for handling all the errors during build job execution and releasing the Amazon EC2 instance once the Systems Manager command execution is complete.
The Systems Manager command runs on an Amazon EC2 instance, downloads CodePipeline input artifacts from the Amazon S3 bucket, unzips them, executes the build script, and uploads any output artifacts to the CodePipeline-provided Amazon S3 bucket.
Polling Lambda updates the state of the custom action in CodePipeline once it detects that the Step Function flow is completed.

The whole architecture is serverless and requires no maintenance in terms of software installed on Amazon EC2 instances thanks to the Systems Manager command, which is essential for this solution. All the code, AWS CloudFormation templates, and installation instructions are available on the GitHub project. The following sections provide further details on the mentioned components.

Custom Build Action

The custom action type is defined as an AWS::CodePipeline::CustomActionType resource as follows:

  Ec2BuildActionType: 
    Type: AWS::CodePipeline::CustomActionType
    Properties: 
      Category: !Ref CustomActionProviderCategory
      Provider: !Ref CustomActionProviderName
      Version: !Ref CustomActionProviderVersion
      ConfigurationProperties: 
        - Name: ImageId 
          Description: AMI to use for EC2 build instances.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String
        - Name: InstanceType
          Description: Instance type for EC2 build instances.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String
        - Name: Command
          Description: Command(s) to execute.
          Key: true 
          Required: true
          Secret: false
          Queryable: false
          Type: String 
        - Name: WorkingDirectory 
          Description: Working directory for the command to execute.
          Key: true 
          Required: false
          Secret: false
          Queryable: false
          Type: String 
        - Name: OutputArtifactPath 
          Description: Path of the file(-s) or directory(-es) to use as custom action output artifact.
          Key: true 
          Required: false
          Secret: false
          Queryable: false
          Type: String 
      InputArtifactDetails: 
        MaximumCount: 1
        MinimumCount: 0
      OutputArtifactDetails: 
        MaximumCount: 1
        MinimumCount: 0 
      Settings: 
        EntityUrlTemplate: !Sub "https://${AWS::Region}.console.aws.amazon.com/systems-manager/documents/${RunBuildJobOnEc2Instance}"
        ExecutionUrlTemplate: !Sub "https://${AWS::Region}.console.aws.amazon.com/states/home#/executions/details/{ExternalExecutionId}"

The custom action type is uniquely identified by Category, Provider name, and Version.

Category defines the stage of the pipeline in which the custom action can be used, such as build, test, or deploy. Check the AWS documentation for the full list of allowed values.

Provider name and Version are the values used to identify the custom action type in the CodePipeline console or AWS CloudFormation templates. Once the custom action type is installed, you can add it to the pipeline, as shown in the following screenshot:

Adding custom action to the pipeline

The custom action type also defines a list of user-configurable properties—these are the properties identified above as specific for different CI/CD pipelines:

AMI Image ID
Instance Type
Command
Working Directory
Output artifacts

The properties are configurable in the CodePipeline console, as shown in the following screenshot:

Custom action properties

Note the last two settings in the Custom Action Type AWS CloudFormation definition: EntityUrlTemplate and ExecutionUrlTemplate.

EntityUrlTemplate defines the link to the AWS Systems Manager document that carries over the build actions. The link is visible in AWS CodePipeline console as shown in the following screenshot:

Custom action's EntityUrlTemplate link

ExecutionUrlTemplate defines the link to additional information related to a specific execution of the custom action. The link is also visible in the CodePipeline console, as shown in the following screenshot:

Custom action's ExecutionUrlTemplate link

This URL is defined as a link to the Step Functions execution details page, which provides high-level information about the custom build step execution, as shown in the following screenshot:

Custom build step execution

This page is a convenient visual representation of the custom action execution flow and may be useful for troubleshooting purposes as it gives an immediate access to error messages and logs.

The polling Lambda function

The Lambda function polls CodePipeline for custom actions when it is triggered by the following CloudWatch event:

  source: 
    - "aws.codepipeline"
  detail-type: 
    - "CodePipeline Action Execution State Change"
  detail: 
    state: 
      - "STARTED"

The event is triggered for every CodePipeline action started, so the Lambda function should verify if, indeed, there is a custom action to be processed.

The rest of the lambda function is trivial and relies on the following APIs to retrieve or update CodePipeline actions and deal with instances of Step Functions state machines:

CodePipeline API

AWS Step Functions API

You can find the complete source of the Lambda function on GitHub.

Step Functions state machine

The following diagram shows complete Step Functions state machine. There are three main blocks on the diagram:

Acquiring an Amazon EC2 instance and waiting while the instance is registered with Systems Manager
Running a Systems Manager command on the instance
Releasing the Amazon EC2 instance

Note that it is necessary to release the Amazon EC2 instance in case of error or exception during Systems Manager command execution, relying on Fallback States to guarantee that.

You can find the complete definition of the Step Function state machine on GitHub.

Step Functions state machine

Systems Manager Document

The AWS Systems Manager Run Command does all the magic. The Systems Manager agent is pre-installed on AWS Windows and Linux AMIs, so no additional software is required. The Systems Manager run command executes the following steps to carry out the build job:

Download input artifacts from Amazon S3.
Unzip artifacts in the working folder.
Run the command.
Upload output artifacts to Amazon S3, if any; this makes them available for the following CodePipeline stages.

The preceding steps are operating-system agnostic, and both Linux and Windows instances are supported. The following code snippet shows the Windows-specific steps.

You can find the complete definition of the Systems Manager document on GitHub.

mainSteps:
  - name: win_enable_docker
    action: aws:configureDocker
    inputs:
      action: Install

  # Windows steps
  - name: windows_script
    precondition:
      StringEquals: [platformType, Windows]
    action: aws:runPowerShellScript
    inputs:
      runCommand:
        # Ensure that if a command fails the script does not proceed to the following commands
        - "$ErrorActionPreference = \"Stop\""

        - "$jobDirectory = \"{{ workingDirectory }}\""
        # Create temporary folder for build artifacts, if not provided
        - "if ([string]::IsNullOrEmpty($jobDirectory)) {"
        - "    $parent = [System.IO.Path]::GetTempPath()"
        - "    [string] $name = [System.Guid]::NewGuid()"
        - "    $jobDirectory = (Join-Path $parent $name)"
        - "    New-Item -ItemType Directory -Path $jobDirectory"
                # Set current location to the new folder
        - "    Set-Location -Path $jobDirectory"
        - "}"

        # Download/unzip input artifact
        - "Read-S3Object -BucketName {{ inputBucketName }} -Key {{ inputObjectKey }} -File artifact.zip"
        - "Expand-Archive -Path artifact.zip -DestinationPath ."

        # Run the build commands
        - "$directory = Convert-Path ."
        - "$env:PATH += \";$directory\""
        - "{{ commands }}"
        # We need to check exit code explicitly here
        - "if (-not ($?)) { exit $LASTEXITCODE }"

        # Compress output artifacts, if specified
        - "$outputArtifactPath  = \"{{ outputArtifactPath }}\""
        - "if ($outputArtifactPath) {"
        - "    Compress-Archive -Path $outputArtifactPath -DestinationPath output-artifact.zip"
                # Upload compressed artifact to S3
        - "    $bucketName = \"{{ outputBucketName }}\""
        - "    $objectKey = \"{{ outputObjectKey }}\""
        - "    if ($bucketName -and $objectKey) {"
                    # Don't forget to encrypt the artifact - CodePipeline bucket has a policy to enforce this
        - "        Write-S3Object -BucketName $bucketName -Key $objectKey -File output-artifact.zip -ServerSideEncryption aws:kms"
        - "    }"
        - "}"
      workingDirectory: "{{ workingDirectory }}"
      timeoutSeconds: "{{ executionTimeout }}"

CI/CD pipeline for Windows Server containers

Once you have a custom action that offloads the build job to the Amazon EC2 instance, you may approach the problem stated at the beginning of this blog post: how to build and publish Windows Server containers on AWS.

With the custom action installed, the solution is quite straightforward. To build a Windows Server container image, you need to provide the value for Windows Server with Containers AMI, the instance type to use, and the command line to execute, as shown in the following screenshot:

Windows Server container custom action properties

This example executes the Docker build command on a Windows instance with the specified AMI and instance type, using the provided source artifact. In real life, you may want to keep the build script along with the source code and push the built image to a container registry. The following is a PowerShell script example that not only produces a Docker image but also pushes it to AWS ECR:

# Authenticate with ECR
Invoke-Expression -Command (Get-ECRLoginCommand).Command

# Build and push the image
docker build -t <ecr-repository-url>:latest .
docker push <ecr-repository-url>:latest

return $LASTEXITCODE

You can find a complete example of the pipeline that produces the Windows Server container image and pushes it to Amazon ECR on GitHub.

Notes on Amazon EC2 build instances

There are a few ways to get Amazon EC2 instances for custom build actions. Let’s take a look at a couple of them below.

Start new EC2 instance per job and terminate it at the end

This is a reasonable default strategy that is implemented in this GitHub project. Each time the pipeline needs to process a custom action, you start a new Amazon EC2 instance, carry out the build job, and terminate the instance afterwards.

This approach is easy to implement. It works well for scenarios in which you don’t have many builds and/or builds take some time to complete (tens of minutes). In this case, the time required to provision an instance is amortized. Conversely, if the builds are fast, instance provisioning time could be actually longer than the time required to carry out the build job.

Use a pool of running Amazon EC2 instances

There are cases when it is required to keep builder instances “warm”, either due to complex initialization or merely to reduce the build duration. To support this scenario, you could maintain a pool of always-running instances. The “acquisition” phase takes a warm instance from the pool and the “release” phase returns it back without terminating or stopping the instance. A DynamoDB table can be used as a registry to keep track of “busy” instances and provide waiting or scaling capabilities to handle high demand.

This approach works well for scenarios in which there are many builds and demand is predictable (e.g. during work hours).

Use a pool of stopped Amazon EC2 instances

This is an interesting approach, especially for Windows builds. All AWS Windows AMIs are generalized using a sysprep tool. The important implication of this is that the first start time for Windows EC2 instances is quite long: it could easily take more than 5 minutes. This is generally unacceptable for short-living build jobs (if your build takes just a minute, it is annoying to wait 5 minutes to start the instance).

Interestingly, once the Windows instance is initialized, subsequent starts take less than a minute. To utilize this, you could create a pool of initialized and stopped Amazon EC2 instances. In this case, for the acquisition phase, you start the instance, and when you need to release it, you stop or hibernate it.

This approach provides substantial improvements in terms of build start-up time.

The downside is that you reuse the same Amazon EC2 instance between the builds—it is not completely clean environment. Build jobs have to be designed to expect the presence of artifacts from the previous executions on the build instance.

Using an Amazon EC2 fleet with spot instances

Another variation of the previous strategies is to use Amazon EC2 Fleet to make use of cost-efficient spot instances for your build jobs.

Amazon EC2 Fleet makes it possible to combine on-demand instances with spot instances to deliver cost-efficient solution for your build jobs. On-demand instances can provide the minimum required capacity and spot instances provide a cost-efficient way to improve performance of your build fleet.

Note that since spot instances could be terminated at any time, the Step Functions workflow has to support Amazon EC2 instance termination and restart the build on a different instance transparently for CodePipeline.

Limits and Cost

The following are a few final thoughts.

Custom action timeouts

The default maximum execution time for CodePipeline custom actions is 24 hours. If your build jobs require more than that, you need to request a limit increase for custom actions.

Cost of running EC2 build instances

Custom Amazon EC2 instances could be even more cost effective than CodeBuild for many scenarios. However, it is difficult to compare the total cost of ownership of a custom-built fleet with CodeBuild. CodeBuild is a fully managed build service and you pay for each minute of using the service. In contrast, with Amazon EC2 instances you pay for the instance either per hour or per second (depending on instance type and operating system), EBS volumes, Lambda, and Step Functions. Please use the AWS Simple Monthly Calculator to get the total cost of your projected build solution.

Cleanup

If you are running the above steps as a part of workshop / testing, then you may delete the resources to avoid any further charges to be incurred. All resources are deployed as part of CloudFormation stack, so go to the Services, CloudFormation, select the specific stack and click delete to remove the stack.

Conclusion

The CodePipeline custom action is a simple way to utilize Amazon EC2 instances for your build jobs and address a number of CodePipeline limitations.

With AWS CloudFormation template available on GitHub you can import the CodePipeline custom action with a simple Start/Terminate instance strategy into your account and start using the custom action in your pipelines right away.

The CodePipeline custom action with a simple Start/Terminate instance strategy is available on GitHub as an AWS CloudFormation stack. You could import the stack to your account and start using the custom action in your pipelines right away.

An example of the pipeline that produces Windows Server containers and pushes them to Amazon ECR can also be found on GitHub.

I invite you to clone the repositories to play with the custom action, and to make any changes to the action definition, Lambda functions, or Step Functions flow.

Feel free to ask any questions or comments below, or file issues or PRs on GitHub to continue the discussion.

AWS DevOps Blog

Building Windows containers with AWS CodePipeline and custom actions

Considerations for custom Docker images:

Windows-specific limitations

CodePipeline custom actions

Serverless custom action architecture

Custom Build Action

The polling Lambda function

Step Functions state machine

Systems Manager Document

CI/CD pipeline for Windows Server containers

Notes on Amazon EC2 build instances

Start new EC2 instance per job and terminate it at the end

Use a pool of running Amazon EC2 instances

Use a pool of stopped Amazon EC2 instances

Using an Amazon EC2 fleet with spot instances

Limits and Cost

Custom action timeouts

Cost of running EC2 build instances

Cleanup

Conclusion

Resources

Follow