Unit testing IAM policies across multiple accounts

When migrating applications from a development account to a testing or production account, customers often find that the AWS IAM policies or service control policies (SCPs) for their applications need significant modification before the application can deploy and function correctly. Discovering and remediating these problems can be time-consuming, and getting an application live in production may require a number of security exceptions to production IAM policies or SCPs. This blog post demonstrates how to use PyUnit to validate permissions across different accounts, allowing customers to find and remediate privilege problems in a consistent manner.

A large financial company, Example Corp., has a migration to AWS underway. It plans to centrally manage all IAM policies and roles across hundreds of accounts. Some of their accounts are project-based, some are environment-based, and others host shared services. Their security team will be the gatekeepers of the thousands of IAM policies and roles that are pre-established and managed separately from the application stacks and teams. Many of these policies are not application specific and are reused across different accounts and environments. IAM policies for their development accounts are less restrictive than those used for their production stacks.

Example Corp. regularly encounters issues deploying their AWS CloudFormation stacks in production because the IAM permissions of the production account are more restrictive. The result is manual, retroactive IAM troubleshooting to get the production environment deployed while maintaining a least-privilege approach.

The problem and opportunity

Continuous delivery is a software development practice in which code changes are automatically prepared for release to production. A pillar of modern application development, continuous delivery expands upon continuous integration by deploying all code changes to a testing environment and/or a production environment after the build stage. When properly implemented, developers will always have a deployment-ready build artifact that has passed through a standardized test process.

However, when not properly implemented—a situation usually linked to a poor test process—many deployments end up still being manual. Typically, the IAM policies used by the application being deployed are not tested in the continuous integration and continuous delivery (CI/CD) pipeline. When the developers throw their code over to the production account, denials of requests due to IAM permissions might surface.

Think of an API request as shooting an arrow to meet a target. Hitting the target means the request is allowed, and various policies are barriers that the arrow must pass.

The first potential barrier is an explicit denial in any of the policies that apply to the request, such as a Deny on s3:ListBucket. Every request also starts out implicitly denied; it’s only allowed if there is an Allow somewhere in the permission policies.

Explicit Deny statements from any of the policies involved in an API call (which could include SCPs, permissions boundaries, permission policies, resource-based policies, and more) are evaluated first. They are at the “explicit deny” level in the following diagram:

Policy Evaluation Diagram
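
The evaluation order described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the real IAM evaluation engine: it pools the statements of all policies together and matches on action names only, whereas real evaluation also considers resources and conditions and requires each applicable policy type (for example, an SCP or a permissions boundary) to allow the request. The function names evaluate and statement_matches are hypothetical.

```python
from fnmatch import fnmatchcase


def statement_matches(statement, action):
    """Return True if one of the statement's Action patterns covers the action."""
    patterns = statement.get("Action", [])
    if isinstance(patterns, str):
        patterns = [patterns]
    # IAM action names are case-insensitive, so compare in lowercase.
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def evaluate(action, policies):
    """Simplified decision: explicit Deny wins, then Allow, else implicit deny."""
    statements = [s for policy in policies for s in policy["Statement"]]
    if any(s["Effect"] == "Deny" and statement_matches(s, action) for s in statements):
        return "explicitDeny"
    if any(s["Effect"] == "Allow" and statement_matches(s, action) for s in statements):
        return "allowed"
    return "implicitDeny"


# Illustrative policies: a permissive developer policy and a restrictive guardrail.
allow_all = {"Statement": [{"Effect": "Allow", "Action": "*"}]}
deny_list_bucket = {"Statement": [{"Effect": "Deny", "Action": ["s3:ListBucket"]}]}

print(evaluate("s3:ListBucket", [allow_all]))                    # allowed
print(evaluate("s3:ListBucket", [allow_all, deny_list_bucket]))  # explicitDeny
print(evaluate("s3:ListBucket", []))                             # implicitDeny
```

The second call shows why a deployment can succeed in a permissive developer account and still fail in production: an Allow does not help once any applicable policy contains a matching Deny.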

PyUnit, Python's built-in unit testing framework (the unittest module), provides an easy way to write unit tests in Python. A simple PyUnit script can unit test IAM policies across environments. The PyUnit test takes the policy for an application and validates it against the target environment to confirm that the application can be granted the permissions it needs.

The problem of unit testing IAM policies across multiple accounts can be further broken up into two components:

Is there a way to automatically determine the exact least-privilege IAM permissions required for a given AWS CloudFormation template or desired action?
Are there any best practices around how to automate testing of the resulting stack?

This blog post focuses on the second question.

The approach

At the minimum, a simple deployment toolbox consists of:

Somewhere to store code, such as Subversion, CodeCommit, or GitHub
A code editor, such as Notepad, Vim, or an IDE
A deployment tool, such as AWS CloudFormation, the AWS CLI, the AWS SAM CLI, or the AWS Cloud Development Kit
An AWS development account in which the testing suite can be executed

Example Corp. developers code locally and frequently push changes to AWS CodeCommit. They use the AWS SAM CLI and execute the test suite in their developer accounts. It all works very well for fast deployments and continuous testing, with extremely fast time to value. However, Example Corp.’s production environment is in a different account, which has SCPs set in place for compliance reasons. Multiple times in the past, deployments succeeded in the developer account test suite and broke in production for no apparent reason.

In developer accounts, something like the following code snippet is often used to unblock deployment:

# PLEASE DO NOT REPLICATE THIS POLICY IN YOUR ACCOUNT
{
  "Version": "2012-10-17",
  "Statement": [
    {
        "Effect": "Allow",
        "Action": "*",
        "Resource": "*"
    }
  ]  
}

Over-permissive policies are often used in non-production accounts for test purposes to minimize the effort of securing least-privilege access. However, production environments require least privilege. Example Corp. wants to be able to test that the role permissions for an application being deployed work as expected in the production account. It needs to be part of their CI/CD pipeline. The pipeline can assert that the required permissions are valid. If not, the developer can be alerted.

The application, built with AWS Serverless Application Model (AWS SAM) in Python, lists Amazon S3 buckets and receives a file containing the IAM permissions the application needs. For this application, the policy is the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": "*"
        }
    ]
}
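
One way to keep the list of actions under test in sync with a policy like this is to derive the list from the policy document itself. The following is a minimal sketch under that assumption; the helper name actions_in_policy is hypothetical and is not part of the sample repository. It reads only the Action element of Allow statements.

```python
import json


def actions_in_policy(policy_json):
    """Collect the action names granted by Allow statements in a policy document."""
    statements = json.loads(policy_json)["Statement"]
    # A policy may hold a single statement object instead of a list.
    if isinstance(statements, dict):
        statements = [statements]
    actions = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        names = statement.get("Action", [])
        # "Action" may be a single string or a list of strings.
        if isinstance(names, str):
            names = [names]
        for name in names:
            if name not in actions:
                actions.append(name)
    return actions


# The application policy shown above.
APP_POLICY = """
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "*"
        }
    ]
}
"""

print(actions_in_policy(APP_POLICY))  # ['s3:ListBucket']
```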

The account structure

A CI/CD pipeline for customers such as Example Corp. spans multiple accounts. The account structure follows AWS Landing Zone and AWS Control Tower best practices. This example focuses on the four accounts listed here and shown in the following diagram:

Developer account
Shared services account
Test account
Production account

Cross Account Pipeline

Developers check the code into a CodeCommit repository in the developer account, which stores all the repositories as a single source of truth for application code. Developers have full control over this account, which is usually used as a sandbox and has permissive policies to allow for fast development.

The shared services account acts as a central location for all the tools related to the organization, including continuous delivery and deployment services such as AWS CodePipeline and AWS CodeBuild. Applications using the CI/CD orchestration for test purposes are deployed to the test account, where CodeBuild runs PyUnit tests over the IAM policies. Those applications are then finally deployed to the production account, in which the automated test suite again runs tests against the IAM policies.

The customer experience

In this solution, the sample code for an AWS Lambda function is checked in to the CodeCommit repository in the developer account. This triggers the pipeline (created in CodePipeline in the shared services account), which runs the build using CodeBuild in the shared services account. The pipeline then deploys the Lambda function to the test and production accounts and, in each of them, uses CodeBuild to test the permissions the function requires to execute.

The solution consists of:

An example AWS SAM application that lets you list Amazon S3 buckets
A reference architecture for a cross-account pipeline in AWS CodePipeline that runs the IAM policy tester
A test suite that checks the application permissions

The application consists of an Amazon API Gateway endpoint that receives the request and invokes a Lambda function, which lists the Amazon S3 buckets available in the production account.

The test_module.py file is a PyUnit test module. It contains a unit test class that defines the test method. If a permission is not granted, the assertion fails. The test module is invoked by CodeBuild in each of the accounts where the application is deployed. At its core is the following call to the IAM API:

response = client.simulate_principal_policy(
    PolicySourceArn=source, ActionNames=actions
)

The test uses the IAM Policy Simulator to validate IAM actions against IAM policies. The simulator evaluates the policies that you choose and determines the effective permissions for each of the actions that you specify. The simulator uses the same policy evaluation engine that is used during real requests to AWS services.
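
The response from the simulator contains an EvaluationResults list in which each entry carries the action name (EvalActionName) and the decision (EvalDecision), which is one of allowed, explicitDeny, or implicitDeny. The following sketch shows one way to pick out the actions that were not allowed. The helper name denied_actions is hypothetical, and the sample response is hand-written for illustration (including the s3:GetObject entry); it is not output captured from a real account.

```python
def denied_actions(response):
    """Map each action that was not allowed to the simulator's decision."""
    return {
        result["EvalActionName"]: result["EvalDecision"]
        for result in response["EvaluationResults"]
        if result["EvalDecision"] != "allowed"
    }


# Hand-written sample in the shape of a simulate_principal_policy response.
sample_response = {
    "EvaluationResults": [
        {"EvalActionName": "s3:ListBucket", "EvalDecision": "explicitDeny"},
        {"EvalActionName": "s3:GetObject", "EvalDecision": "allowed"},
    ],
    "IsTruncated": False,
}

print(denied_actions(sample_response))  # {'s3:ListBucket': 'explicitDeny'}
```

Distinguishing explicitDeny from implicitDeny is useful when reporting a failure: the former points to a Deny statement somewhere (for example, in an SCP), the latter to a missing Allow.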

The policy source is the Amazon Resource Name (ARN) of a user, group, or role whose policies you want to include in the simulation. If you specify a user, group, or role, the simulation includes all policies that are associated with that entity. If you specify a user, the simulation also includes all policies that are attached to any groups the user belongs to. The principal is specified in the source.txt file and could be a test IAM user created by the security team.

The action names are in the actions.json file, which contains a list of names of API operations to evaluate in the simulation. Each operation is evaluated for each resource. Each operation must include the service identifier, such as s3:ListBucket.

Running the solution

To run the solution, execute the following steps:

1. Clone the GitHub repository

Clone the AWS Policy Tester Pipeline repository. From your terminal application, execute the following command:

git clone https://github.com/aws-samples/iam-policy-tester-pipeline

This creates a directory named iam-policy-tester-pipeline in your current directory.

2. Create AWS CodeCommit repository in the development account

Create a CodeCommit repository named sample-lambda in the development account, setting up the AWS CLI first if necessary. You can use the console or, from your terminal application, execute the following command:

aws codecommit create-repository --repository-name sample-lambda --repository-description "Sample Lambda Function"

Note the cloneUrlHttp URL in the response from the CLI.
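
The CLI prints a JSON document in which cloneUrlHttp sits under the repositoryMetadata key. If you script this step, the URL can be extracted as in the following sketch. The helper name clone_url_http is hypothetical, and the sample output is trimmed and hand-written; the Region in the URL is an assumption for illustration.

```python
import json


def clone_url_http(cli_output):
    """Extract the HTTPS clone URL from the create-repository JSON output."""
    return json.loads(cli_output)["repositoryMetadata"]["cloneUrlHttp"]


# Trimmed, hand-written sample of the CLI output (Region is illustrative).
sample_output = """
{
    "repositoryMetadata": {
        "repositoryName": "sample-lambda",
        "repositoryDescription": "Sample Lambda Function",
        "cloneUrlHttp": "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/sample-lambda"
    }
}
"""

print(clone_url_http(sample_output))
```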

3. Add a new remote

From your terminal application, within the sample-lambda directory, execute the following command:

git init && git remote add AWSCodeCommit HTTP_CLONE_URL_FROM_STEP_2

This creates the local Git setup required to push code to the CodeCommit repository.

4. Replace the Policy Source ARN

You need to specify the user, group, or role whose policies you want to include in the simulation. To do this, within the sample-lambda directory, modify the value in the scripts/source.txt file.
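
A malformed ARN in this file only surfaces later as a failed API call in the pipeline, so it can be worth checking the value before pushing. The following sketch shows one possible check; the function name and the regular expression are assumptions for illustration, and the account ID and user name in the usage lines are placeholders.

```python
import re

# Matches IAM user, group, and role ARNs in any AWS partition (aws, aws-cn, ...).
PRINCIPAL_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:(user|group|role)/.+$")


def is_valid_policy_source(arn):
    """Return True if the value looks like an IAM user, group, or role ARN."""
    return PRINCIPAL_ARN.match(arn.strip()) is not None


print(is_valid_policy_source("arn:aws:iam::123456789012:user/policy-tester"))  # True
print(is_valid_policy_source("arn:aws:s3:::my-bucket"))                        # False
```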

5. Push the code to CodeCommit

From your terminal application, execute the following commands:

git add .

git commit -am "Initialise the SampleLambda repository"

git push AWSCodeCommit master

6. Run the script to generate the cross-account pipeline

From your terminal application, back in the iam-policy-tester-pipeline directory, execute the following command:

chmod +x single-click-cross-account-pipeline.sh && ./single-click-cross-account-pipeline.sh

This last step deploys the entire pipeline. It expects to receive the account numbers to which it will deploy the reference architecture. It creates Amazon S3 buckets for the build artifacts and encryption keys for secure cross-account communication, and sets up CodePipeline, CodeBuild, and CodeDeploy in the account structure described above. After this step, each update in the sample-lambda repository triggers an execution of the pipeline.

Cleanup

To clean up the solution, just delete the AWS CloudFormation stacks in each of the accounts specified in the script.

Conclusion

Example Corp. successfully runs their CI/CD pipeline, deploying their application to production. Recently, they made some updates to the code base and the application failed the test stage in the production account. The Example Corp. developer in charge of the deployment was notified that the s3:ListBucket action failed in the test and was able to remediate the deployment by asking the security team to investigate the problem. It turned out that a change in SCPs had put a Deny on the ListBucket action in place. Because the pipeline tests the role permissions for the application in the target account, Example Corp.’s developers are now informed of permission problems early and can take remedial action.

It’s simple to start integrating IAM policy testing into existing CI/CD pipelines. Developers can leverage a testing framework for their language of choice, and add the respective stages in the pipeline to automate the tests in each account. Following the templates in this blog post, developers can avoid hours of manual, retroactive IAM troubleshooting work to deploy to their production environment with a least-privilege approach.

About the authors

Eduardo Janicas is a Solutions Architect helping SMB customers in the UK use the AWS platform, specializing in Developer Tools and Containers. He enjoys distributed systems, travels and music festivals.

Connor Kirkpatrick is part of the AWS Solution Builders team in the UK. Connor works with the AWS Solution Architecture community to create standardized tools, code samples, demonstrations and quick starts.

Samuel Waymouth is a Solutions Architect helping SMB customers in the UK use the AWS platform, specializing in Security, Governance and Compliance. He enjoys security research with network intrusion detection systems and machine learning, cookery, motorcycles and martial arts.

AWS DevOps Blog