How to automate your software-composition analysis on AWS
To keep pace with innovation on Amazon Web Services (AWS), many customer application teams experiment with publicly available software packages, and those packages may carry known vulnerabilities that expose your environments to threats. In this blog post, we discuss an automated process that reduces the risk of downloading new packages from public repositories.
There are three common implementations when working with public repositories:
- Using a public repository directly.
- Using a hybrid repository that contains internal packages and a proxy to a public repository that acts as an extension.
- Using an isolated, internal repository.
Using a public repository directly or through a hybrid setup can result in downloading vulnerable software packages, whereas an internal repository gives you full control over the software packages you allow. Manual software composition analysis (SCA) procedures, however, can slow down development. Hence, we use an automated pipeline that initiates a download-scan-upload procedure.
| About this blog post | |
| --- | --- |
| Time to read | ~10 min. |
| Time to complete | ~30 min. |
| Cost to complete | $0 |
| Learning level | Advanced (300) |
| AWS services | AWS Cloud Development Kit (AWS CDK), AWS CodeBuild, AWS CodeCommit, AWS CodeArtifact, AWS Lambda, AWS Systems Manager Parameter Store |
Overview
The following steps automate a vulnerability scan of a new public package:
- Capture events when new packages are added to the application's `requirements.txt` file (a CDK sketch of this event wiring follows the list).
- Process the changes, and verify that the package does not already exist in the current AWS CodeCommit repository.
- AWS CodeBuild downloads the package, scans it, and uploads it to the internal repository.
- If the package (or package dependencies) contains any vulnerabilities, the build fails and generates a detailed report.
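Step 1 depends on a commit event reaching a Lambda function. The following is a minimal AWS CDK sketch of one way to wire that event; the repository name, branch, and handler path are illustrative assumptions, and the post's sample code may use a CodeCommit trigger rather than the EventBridge rule shown here.

```python
from aws_cdk import aws_codecommit as codecommit
from aws_cdk import aws_events_targets as targets
from aws_cdk import aws_lambda as _lambda
from constructs import Construct


def wire_commit_trigger(scope: Construct) -> None:
    # Hypothetical repository and function names, for illustration only.
    repo = codecommit.Repository(scope, "AppRepo", repository_name="sca-demo-repo")

    obtain_changes = _lambda.Function(
        scope,
        "ObtainChangesFn",
        runtime=_lambda.Runtime.PYTHON_3_9,
        handler="obtain_changes.handler",
        code=_lambda.Code.from_asset("lambdas/obtain_changes"),
    )

    # Invoke the function whenever a commit is pushed to the main branch.
    repo.on_commit(
        "OnPush",
        branches=["main"],
        target=targets.LambdaFunction(obtain_changes),
    )
```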
Figure 1 shows the architecture and flow of automation.

Figure 1. Architecture diagram for software-composition analysis
Architecture flow
- A new or updated package is added to the requirements file (for example, in Python, `requirements.txt`).
- After you commit and push your changes to the AWS CodeCommit repository, a trigger invokes the obtain-changes Lambda function. The function contains Python code that extracts only the changes, which avoids the overhead of working through the full dependency list (a sketch of this extraction follows the flow list).
- The obtain-changes Lambda function triggers the compare-changes Lambda function, which compares your changes to the existing repository.
- If the package already exists, no further action is taken.
- Otherwise, the compare-changes Lambda function invokes an AWS CodeBuild project.
- AWS CodeBuild takes the added or changed packages and downloads them from the public repository (in this case, pypi.org for Python) into a sandbox container.
- Snyk scans the packages for known vulnerabilities.
- If the scan succeeds, the package is uploaded to AWS CodeArtifact and is ready for use.
- If the scan fails, a detailed report is generated for further investigation.
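As a reference for the obtain-changes step, the following is a hedged sketch of how such a function could pull only the `requirements.txt` changes out of a push by using the CodeCommit `GetDifferences` and `GetBlob` APIs. The event field names assume an EventBridge repository state change event, and pagination is omitted; the function in the sample repository may be implemented differently.

```python
import boto3

codecommit = boto3.client("codecommit")


def handler(event, context):
    # Assumed EventBridge "CodeCommit Repository State Change" event shape.
    detail = event["detail"]
    repo = detail["repositoryName"]
    before = detail.get("oldCommitId")
    after = detail["commitId"]

    # List the files that changed in this push (pagination omitted).
    kwargs = {"repositoryName": repo, "afterCommitSpecifier": after}
    if before:
        kwargs["beforeCommitSpecifier"] = before
    differences = codecommit.get_differences(**kwargs)["differences"]

    added = set()
    for diff in differences:
        after_blob = diff.get("afterBlob")
        if not after_blob or after_blob.get("path") != "requirements.txt":
            continue
        new_lines = _read_requirements(repo, after_blob["blobId"])
        old_lines = set()
        if diff.get("beforeBlob"):
            old_lines = _read_requirements(repo, diff["beforeBlob"]["blobId"])
        # Keep only requirement lines that are new or changed in this push.
        added |= new_lines - old_lines

    return {"new_requirements": sorted(added)}


def _read_requirements(repo, blob_id):
    content = codecommit.get_blob(repositoryName=repo, blobId=blob_id)["content"]
    return {line.strip() for line in content.decode().splitlines() if line.strip()}
```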
Prerequisites
Before getting started, ensure that you have the following:
- An AWS account. Use a Region that supports AWS CodeCommit, AWS CodeBuild, and AWS CodePipeline. For more information, see AWS Regional Services.
- A basic understanding of the following AWS Services:
- AWS CodeBuild
- AWS CodeCommit
- AWS CodeArtifact
- AWS Lambda
- AWS Systems Manager Parameter Store
- A Snyk account (note that Snyk is third-party software).
- A basic understanding of Python.
- A basic understanding of Git.
- A basic understanding of CDK environments.
Walkthrough
Code overview
The Python code in this post was written using the AWS CDK. To view the code, see the associated GitHub repository. If you’re unfamiliar with AWS CDK, see Getting started with AWS CDK.
For the deployment, use the AWS CDK construct in the code base at `iac/iac/constructs/ci.py` and the AWS CDK stack at `iac/iac/stacks/stack.py`. The construct requires the following parameters:
- Parameter Store name of the Snyk organization ID.
- Parameter Store name for the Snyk authentication token.
- (Optional) Parameter for the AWS CodeCommit repository name.
For more information about authenticating Snyk, see Authenticate the CLI with your account. Set up your organization ID and authentication token before deploying the stack. Because these values are confidential and sensitive, deploy them as a separate stack or through a manual process. In this solution, the parameters are stored as the `SecureString` parameter type and encrypted using AWS Key Management Service (AWS KMS). For more information, see AWS KMS keys concepts.
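The following is a minimal sketch of the manual option, assuming hypothetical parameter names (`/sca/snyk/org-id` and `/sca/snyk/auth-token`) and the AWS managed `alias/aws/ssm` key; substitute the values you obtain from the Snyk console in the next step.

```python
import boto3

ssm = boto3.client("ssm")

# Hypothetical parameter names; use whatever names your construct expects.
secrets = {
    "/sca/snyk/org-id": "<your-snyk-organization-id>",
    "/sca/snyk/auth-token": "<your-snyk-auth-token>",
}

for name, value in secrets.items():
    ssm.put_parameter(
        Name=name,
        Value=value,
        Type="SecureString",
        KeyId="alias/aws/ssm",  # AWS managed key; swap in a customer managed KMS key if required
        Overwrite=True,
    )
```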
In the Snyk console, obtain the organization ID and authentication token. To find these values, navigate to the Settings page, and choose General, as shown in figure 2.

Figure 2. Snyk settings
Navigate to the AWS Systems Manager Parameter Store, choose the Overview tab, and retrieve the parameter names, as shown in figure 3.

Figure 3. Parameter names for SnykAuth
SCA scanning
AWS CodeBuild uses Snyk to run an SCA scan (we use the open-source version of Snyk). Because this solution is modular, you can integrate your own SCA tool instead. If vulnerabilities are found, the build fails and uploads the scan output to Amazon Simple Storage Service (Amazon S3). If no vulnerabilities are found, the build proceeds to the next stage.
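To make this stage concrete, the following is a hedged AWS CDK sketch of a build project that resolves the Snyk SecureString parameters at run time, fails when the scan finds a vulnerability, and publishes to CodeArtifact on success. It is not the post's actual project definition: the parameter names, the CodeArtifact domain and repository, and the `PACKAGE` variable are assumptions, and the S3 report upload is omitted.

```python
from aws_cdk import aws_codebuild as codebuild
from constructs import Construct


def make_scan_project(scope: Construct) -> codebuild.Project:
    return codebuild.Project(
        scope,
        "ScaScanProject",
        environment=codebuild.BuildEnvironment(
            build_image=codebuild.LinuxBuildImage.STANDARD_6_0,
        ),
        environment_variables={
            # CodeBuild resolves these SecureString parameters at run time.
            "SNYK_TOKEN": codebuild.BuildEnvironmentVariable(
                value="/sca/snyk/auth-token",  # assumed parameter name
                type=codebuild.BuildEnvironmentVariableType.PARAMETER_STORE,
            ),
            "SNYK_ORG": codebuild.BuildEnvironmentVariable(
                value="/sca/snyk/org-id",  # assumed parameter name
                type=codebuild.BuildEnvironmentVariableType.PARAMETER_STORE,
            ),
        },
        build_spec=codebuild.BuildSpec.from_object({
            "version": "0.2",
            "phases": {
                "install": {"commands": ["npm install -g snyk", "pip install twine"]},
                "build": {
                    "commands": [
                        # PACKAGE (for example, "requests==2.31.0") is supplied as an
                        # environment variable override when the build is started.
                        'echo "$PACKAGE" > requirements.txt',
                        "pip install -r requirements.txt",
                        "pip download -r requirements.txt -d packages",
                        # The Snyk CLI exits non-zero when vulnerabilities are
                        # found, which stops the phase and fails the build.
                        "snyk test --file=requirements.txt --package-manager=pip --org=$SNYK_ORG",
                        # Reached only if the scan passed: publish to the internal
                        # CodeArtifact repository (domain/repository names are assumptions).
                        "aws codeartifact login --tool twine --domain my-domain --repository my-internal-repo",
                        "twine upload --repository codeartifact packages/*",
                    ]
                },
            },
        }),
    )
```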
Deploying
Clone the CDK code from the GitHub repository: