AWS Compute Blog

Operating serverless at scale: Keeping control of resources – Part 3

This post is written by Jerome Van Der Linden, Solutions Architect.

In the previous part of this series, I provided application archetypes that help developers follow company best practices and include the libraries needed for compliance. But using these archetypes is optional, and teams can still deploy resources without them. Even when teams do use them, the templates can be modified: developers can remove a layer, over-permission functions, or allow access to APIs without appropriate authorization.

To avoid this, you must define guardrails. Templates are good for providing guidance and best practices, and for improving productivity, but they do not prevent actions the way guardrails do. There are two kinds of guardrails:

  • Proactive: you define rules and permissions that prevent specific actions from happening.
  • Reactive: you define controls that detect when something happens and trigger a notification or a remediation action.

This third part on serverless governance describes different guardrails and ways to implement them.

Implementing proactive guardrails

Proactive guardrails are often the most efficient but also the most restrictive. Apply them with caution, as they could reduce developers’ agility and productivity. For example, test them in a sandbox account before applying them more broadly.

In this category, you typically find IAM policies and service control policies. This section explores some examples applied to serverless applications.

Controlling access through policies

Part 2 discusses Lambda layers as a way to include standard components and ensure compliance of Lambda functions. You can enforce the use of a Lambda layer when creating or updating a function using the following policy. The condition checks that a layer with the appropriate ARN is configured:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ConfigureFunctions",
            "Effect": "Allow",
            "Action": [
                "lambda:CreateFunction",
                "lambda:UpdateFunctionConfiguration"
            ],
            "Resource": "*",
            "Condition": {
                "ForAllValues:StringLike": {
                    "lambda:Layer": [
                        "arn:aws:lambda:*:123456789012:layer:my-company-layer:*"
                    ]
                }
            }
        }
    ]
}

When deploying Lambda functions, some companies also want to control the source code integrity and verify it has not been altered. Using code signing for AWS Lambda, you can sign the package and verify its signature at deployment time. If the signature is not valid, you can be warned or even block the deployment.

An administrator must first create a signing profile (you can see it as a trusted publisher) using AWS Signer. Then, a developer can reference this profile in their AWS SAM template to sign the Lambda function code:

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambda_handler
      Runtime: python3.9
      CodeSigningConfigArn: !Ref MySignedFunctionCodeSigningConfig

  MySignedFunctionCodeSigningConfig:
    Type: AWS::Lambda::CodeSigningConfig
    Properties:
      AllowedPublishers:
        SigningProfileVersionArns:
          - arn:aws:signer:eu-central-1:123456789012:/signing-profiles/MySigningProfile
      CodeSigningPolicies:
        UntrustedArtifactOnDeployment: "Enforce"

Using the AWS SAM CLI and the --signing-profiles option, you can package and deploy the Lambda function with the appropriate signing configuration. Read the documentation for more details.
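As a sketch of the command line side, assuming a signing profile named MySigningProfile (as above) and the function logical ID MyFunction from the template, the workflow could look like this:

# Administrator: create the signing profile (the trusted publisher) with AWS Signer
aws signer put-signing-profile \
    --profile-name MySigningProfile \
    --platform-id AWSLambda-SHA384-ECDSA

# Developer: build and deploy, signing the code of MyFunction with that profile
sam build
sam deploy --guided --signing-profiles MyFunction=MySigningProfile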

You can also enforce the use of code signing by using a policy so that every function must be signed before deployment. Use the following policy and a condition requiring a CodeSigningConfigArn:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ConfigureFunctions",
            "Effect": "Allow",
            "Action": [
                "lambda:CreateFunction"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "lambda:CodeSigningConfigArn": "arn:aws:lambda:eu-central-1:123456789012:code-signing-config:csc-0c44689353457652"
                }
            }
        }
    ]
}

When using Amazon API Gateway, you may want to enforce a standard authorization mechanism, for example a Lambda authorizer that validates a JSON Web Token (JWT) issued by your company identity provider. You can do that using a policy like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyWithoutJWTLambdaAuthorizer",
      "Effect": "Deny",
      "Action": [
        "apigateway:PUT",
        "apigateway:POST",
        "apigateway:PATCH"
      ],
      "Resource": [
        "arn:aws:apigateway:eu-central-1::/apis",
        "arn:aws:apigateway:eu-central-1::/apis/??????????",
        "arn:aws:apigateway:eu-central-1::/apis/*/authorizers",
        "arn:aws:apigateway:eu-central-1::/apis/*/authorizers/*"
      ],
      "Condition": {
        "ForAllValues:StringNotEquals": {
          "apigateway:Request/AuthorizerUri": 
            "arn:aws:apigateway:eu-central-1:lambda:path/2015-03-31/functions/arn:aws:lambda:eu-central-1:123456789012:function:MyCompanyJWTAuthorizer/invocations"
        }
      }
    }
  ]
}
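For reference, here is a sketch of how a team could attach such an authorizer to an HTTP API in AWS SAM, assuming the MyCompanyJWTAuthorizer function already exists (resource and authorizer names are examples):

Resources:
  MyHttpApi:
    Type: AWS::Serverless::HttpApi
    Properties:
      Auth:
        DefaultAuthorizer: MyCompanyAuthorizer
        Authorizers:
          MyCompanyAuthorizer:
            # Lambda authorizer validating the company-issued JWT
            FunctionArn: arn:aws:lambda:eu-central-1:123456789012:function:MyCompanyJWTAuthorizer
            AuthorizerPayloadFormatVersion: "2.0"
            EnableSimpleResponses: true
            Identity:
              Headers:
                - Authorization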

To enforce the use of mutual authentication (mTLS) and TLS version 1.2 for APIs, use the following policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceTLS12",
      "Effect": "Allow",
      "Action": [
        "apigateway:POST"
      ],
      "Resource": [
        "arn:aws:apigateway:eu-central-1::/domainnames",
        "arn:aws:apigateway:eu-central-1::/domainnames/*"
      ],
      "Condition": {
        "ForAllValues:StringEquals": {
            "apigateway:Request/SecurityPolicy": "TLS_1_2"
        }
      }
    }
  ]
}
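As an illustration of a compliant configuration, a custom domain enforcing TLS 1.2 and mutual TLS could look like the following sketch (domain, certificate, and truststore values are examples):

Resources:
  MyApiDomain:
    Type: AWS::ApiGatewayV2::DomainName
    Properties:
      DomainName: api.example.com
      DomainNameConfigurations:
        - CertificateArn: arn:aws:acm:eu-central-1:123456789012:certificate/11111111-2222-3333-4444-555555555555
          EndpointType: REGIONAL
          SecurityPolicy: TLS_1_2
      MutualTlsAuthentication:
        # Truststore containing the client certificates allowed to call the API
        TruststoreUri: s3://my-truststore-bucket/truststore.pem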

You can apply other guardrails for Lambda, API Gateway, or another service. Read the available policies and conditions for your services in the documentation.

Securing self-service with permissions boundaries

When creating a Lambda function, developers must create a role that the function assumes when running. But giving developers the ability to create roles also gives them a way to elevate their permission level. In the following diagram, an admin gives developers this ability to create roles:


Developer 1 creates a role for a function that only allows Amazon DynamoDB read/write access and includes the basic Lambda execution permissions (for Amazon CloudWatch Logs). But developer 2 creates a role with administrator permissions. Developer 2 cannot assume this role directly but can pass it to the Lambda function. That function could then create resources on Amazon EC2, or delete an Amazon RDS database or an Amazon S3 bucket, for example.

To prevent users from elevating their permissions, define permissions boundaries. With these, you can limit the scope of a Lambda function’s permissions. In this example, an admin still gives developers the same ability to create roles, but this time with a permissions boundary attached. Now the function cannot perform actions that exceed this boundary:

Effect of permissions boundaries

The admin must first define the permissions boundaries within an IAM policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "LambdaDeveloperBoundary",
            "Effect": "Allow",
            "Action": [
                "s3:List*",
                "s3:Get*",
                "logs:*",
                "dynamodb:*",
                "lambda:*"
            ],
            "Resource": "*"
        }
    ]
}

Note that this boundary is still too permissive; you should reduce it and adopt a least-privilege approach. For example, you may not want to grant the dynamodb:DeleteTable permission, or you may want to restrict access to specific tables, as in the sketch below.
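For example, a tighter boundary statement could restrict DynamoDB actions to tables with a known prefix and omit destructive operations (a sketch; the table prefix is an example):

{
    "Sid": "ScopedDynamoDBAccess",
    "Effect": "Allow",
    "Action": [
        "dynamodb:GetItem",
        "dynamodb:Query",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem"
    ],
    "Resource": "arn:aws:dynamodb:eu-central-1:123456789012:table/lambda-dev-*"
}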

The admin can then provide the CreateRole permission with this boundary using a condition:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CreateRole",
            "Effect": "Allow",
            "Action": [
                "iam:CreateRole"
            ],
            "Resource": "arn:aws:iam::123456789012:role/lambdaDev*",
            "Condition": {
                "StringEquals": {
                    "iam:PermissionsBoundary": "arn:aws:iam::123456789012:policy/lambda-dev-boundary"
                }
            }
        }
    ]
}

With this permission, developers can create roles whose names start with lambdaDev for their Lambda functions, but only if the permissions boundary is attached. These functions cannot have more permissions than defined in the boundary.
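Developers who let AWS SAM create the execution role can also attach the boundary directly in the template. Here is a minimal sketch, assuming the boundary policy above and an example table name (DynamoDBReadPolicy is a SAM policy template):

Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      CodeUri: src/
      Handler: app.lambda_handler
      Runtime: python3.9
      # Permissions boundary applied to the execution role that SAM creates
      PermissionsBoundary: arn:aws:iam::123456789012:policy/lambda-dev-boundary
      Policies:
        - DynamoDBReadPolicy:
            TableName: my-table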

Deploying reactive guardrails

The principle of least privilege is not always easy to achieve. To keep control without this permission management burden, you can use reactive guardrails: actions are allowed, but they are detected and trigger a notification or a remediation action.

To accomplish this on AWS, use AWS Config. It continuously monitors your resources and their configurations, assesses them against compliance rules that you define, and can notify you or automatically remediate non-compliant resources.

AWS Config has more than 190 built-in rules and some are related to serverless services. For example, you can verify that an API Gateway REST API is configured with SSL or protected by a web application firewall (AWS WAF). You can ensure that a DynamoDB table has backups configured in AWS Backup or that its data is encrypted.

Lambda also has a set of rules. For example, you can ensure that functions have a concurrency limit configured, which is a recommended practice. Most of these rules are part of the “Operational Best Practices for Serverless” conformance pack, which eases their deployment as a single entity. Otherwise, you can set up rules and remediation in the AWS Management Console or with the AWS CLI.
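For example, with the AWS CLI you could deploy the managed rule that checks for a function-level concurrency limit (a sketch; the rule name is an example, LAMBDA_CONCURRENCY_CHECK is the managed rule identifier):

aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "lambda-concurrency-check",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "LAMBDA_CONCURRENCY_CHECK"
  },
  "Scope": {
    "ComplianceResourceTypes": ["AWS::Lambda::Function"]
  }
}'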

If you cannot find a rule for your use case among the AWS managed rules, you can find additional ones on GitHub or write your own using the Rule Development Kit (RDK). Take the example of enforcing the use of a Lambda layer for functions. This is possible with a service control policy, but an SCP denies the creation or modification of the function if the layer is not provided. You may want that behavior in production, but in sandbox accounts or test environments you may only want to notify the developers.

By using the RDK CLI, you can bootstrap a new rule:

rdk create LAMBDA_LAYER_CHECK --runtime python3.9 \
--resource-types AWS::Lambda::Function \
--input-parameters '{"LayerArn":"arn:aws:lambda:region:account:layer:layer_name", "MinLayerVersion":"1"}'

It generates a Lambda function, some tests, and a parameters.json file that contains the configuration for the rule. You can then edit the Lambda function code, in particular the evaluate_compliance method. To check for a layer:

import re

# Matches a Lambda layer ARN without its version; the version is captured separately below
LAYER_REGEXP = r'arn:aws:lambda:[a-z]{2}((-gov)|(-iso(b?)))?-[a-z]+-\d{1}:\d{12}:layer:[a-zA-Z0-9-_]+'

def evaluate_compliance(event, configuration_item, valid_rule_parameters):
    pkg = configuration_item['configuration']['packageType']
    if not pkg or pkg != "Zip":
        return build_evaluation_from_config_item(configuration_item, 'NOT_APPLICABLE',
                                                 annotation='Layers can only be used with functions using Zip package type')

    layers = configuration_item['configuration']['layers']
    if not layers:
        return build_evaluation_from_config_item(configuration_item, 'NON_COMPLIANT',
                                                 annotation='No layer is configured for this Lambda function')

    # The version is captured as group 5 (groups 1-4 come from LAYER_REGEXP)
    regex = re.compile(LAYER_REGEXP + ':(.*)')
    annotation = 'Layer ' + valid_rule_parameters['LayerArn'] + ' not used for this Lambda function'
    for layer in layers:
        arn = layer['arn']
        # Separate the unversioned layer ARN from its version number
        version = regex.search(arn).group(5)
        arn = re.sub(':' + version + '$', '', arn)
        if arn == valid_rule_parameters['LayerArn']:
            if version >= valid_rule_parameters['MinLayerVersion']:
                return build_evaluation_from_config_item(configuration_item, 'COMPLIANT')
            else:
                annotation = 'Wrong layer version (was ' + version + ', expected ' + valid_rule_parameters['MinLayerVersion'] + '+)'

    return build_evaluation_from_config_item(configuration_item, 'NON_COMPLIANT',
                                             annotation=annotation)

You can find the complete source of this AWS Config rule and its tests on GitHub.

Once the rule is ready, use the command rdk deploy to deploy it in your account. To deploy it across multiple accounts, see the documentation. You can then define remediation actions. For example, automatically add the missing layer to the function or send a notification to the developers using Amazon Simple Notification Service (SNS).
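As a sketch, a notification-only remediation could rely on the AWS-PublishSNSNotification SSM document (the role, topic, and message below are examples; the role must be allowed to publish to the topic):

aws configservice put-remediation-configurations --remediation-configurations '[{
  "ConfigRuleName": "LAMBDA_LAYER_CHECK",
  "TargetType": "SSM_DOCUMENT",
  "TargetId": "AWS-PublishSNSNotification",
  "Automatic": true,
  "MaximumAutomaticAttempts": 3,
  "RetryAttemptSeconds": 60,
  "Parameters": {
    "AutomationAssumeRole": {"StaticValue": {"Values": ["arn:aws:iam::123456789012:role/ConfigRemediationRole"]}},
    "TopicArn": {"StaticValue": {"Values": ["arn:aws:sns:eu-central-1:123456789012:serverless-governance-alerts"]}},
    "Message": {"StaticValue": {"Values": ["A Lambda function is missing the mandatory company layer"]}}
  }
}]'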

Conclusion

This post describes guardrails that you can set up in your accounts or across the organization to keep control over deployed resources. These guardrails can be more or less restrictive according to your requirements.

Use proactive guardrails with service control policies to define coarse-grained permissions and block everything that must not be used. Define reactive guardrails for everything else to preserve agility and productivity, while still being informed of activity and able to remediate.

This concludes this series on serverless governance:

  • Standardization is an important aspect of governance that speeds up teams and ensures that deployed applications are operable and compliant with your internal rules. Use templates, layers, and other mechanisms to create shareable archetypes that apply these standards and rules at the enterprise level.
  • It’s important to keep visibility and control over your resources, to understand how your environment evolves, and to be able to operate and act if needed. Tags and guardrails help achieve this, and they should evolve as your cloud maturity grows.

Find more SCP examples and all the AWS Config managed rules in the documentation.

For more serverless learning resources, visit Serverless Land.