FSI Services Spotlight: Featuring AWS Lambda

In this edition of the Financial Services Industry (FSI) Services Spotlight monthly blog series, we highlight five key considerations for customers running workloads on AWS Lambda: achieving compliance, data protection, isolation of compute environments, automating audits with APIs, and operational access and security. Across each area, we will examine specific guidance, suggested reference architectures, and technical code to help streamline service approval of AWS Lambda.

AWS Lambda is a serverless compute service that removes the heavy lifting of provisioning and managing underlying infrastructure so that teams can build and deploy new applications and functionality quickly. AWS Lambda natively supports seven programming languages: Java, Go, PowerShell, Node.js, C#, Python, and Ruby. It further supports other languages through custom runtimes.

To use AWS Lambda, the code needs to be uploaded either as a container image or as a .zip file archive. Further configurations include the amount of memory allocated to the function (128 MB – 10240 MB), the maximum execution time of the function’s handler method (the timeout can be at most 15 minutes), the type of processor (x86 or ARM), and the function’s provisioned concurrency, which keeps functions initialised and hyper-ready to respond in double-digit milliseconds. For a full list of configurations, see the Lambda documentation.
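As an illustration of the limits quoted above, the following minimal sketch validates a set of function settings before deployment. The `FunctionSettings` class and `validate_settings` helper are hypothetical names introduced here, not part of any AWS SDK; the architecture identifiers `x86_64` and `arm64` are the values Lambda uses for x86 and ARM.

```python
from dataclasses import dataclass

# Limits quoted in the text above.
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10240
MAX_TIMEOUT_SECONDS = 15 * 60
ARCHITECTURES = {"x86_64", "arm64"}  # Lambda's identifiers for x86 and ARM


@dataclass(frozen=True)
class FunctionSettings:
    """Hypothetical container for the settings discussed above."""
    memory_mb: int
    timeout_seconds: int
    architecture: str


def validate_settings(settings):
    """Return a list of problems; an empty list means the settings are valid."""
    problems = []
    if not MIN_MEMORY_MB <= settings.memory_mb <= MAX_MEMORY_MB:
        problems.append(f"memory must be {MIN_MEMORY_MB}-{MAX_MEMORY_MB} MB")
    if not 1 <= settings.timeout_seconds <= MAX_TIMEOUT_SECONDS:
        problems.append(f"timeout must be 1-{MAX_TIMEOUT_SECONDS} seconds")
    if settings.architecture not in ARCHITECTURES:
        problems.append("architecture must be x86_64 or arm64")
    return problems
```

A check like this could run in a deployment pipeline so that out-of-range values are rejected before they reach the Lambda API.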

How AWS Lambda works

AWS Lambda works in two ways: either an event drives the invocation, or Lambda polls a queue or data stream and invokes the function in response to the associated activity.

The event-driven invocation can be synchronous or asynchronous. For synchronous invocation, the service that generates the event waits for the response from your function. A common use-case is the synchronous invocation from an Amazon API Gateway to a Lambda function, where the caller waits for the response. For asynchronous invocation, Lambda queues the event before passing it to your function. An example is the asynchronous invocation of a Lambda function via an Amazon S3 event such as a new object being uploaded: once Lambda has picked up the event, S3 is not informed about its further status.

The second option applies to services that produce a queue or data stream, such as Amazon Simple Queue Service (SQS) or Amazon Kinesis. These services do not invoke the Lambda function directly. Instead, you set up an event source mapping in Lambda so that Lambda polls the queue or data stream and invokes the function itself.
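The difference between the two event-driven invocation types can be sketched with the Lambda `Invoke` API. This is a minimal sketch that assumes a boto3 Lambda client (for example `boto3.client("lambda")`) is passed in as `lambda_client`; the helper names and the function name used by the caller are hypothetical.

```python
import json


def invoke_sync(lambda_client, function_name, event):
    """Synchronous call: the caller blocks until the function returns."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event).encode("utf-8"),
    )
    # The function's result comes back as a readable stream.
    return json.loads(response["Payload"].read())


def invoke_async(lambda_client, function_name, event):
    """Asynchronous call: Lambda queues the event and returns immediately."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="Event",
        Payload=json.dumps(event).encode("utf-8"),
    )
    # A 202 status code means the event was accepted for later processing;
    # the caller learns nothing about the eventual outcome.
    return response["StatusCode"] == 202
```

Passing the client in as a parameter keeps the helpers easy to test with a stub and leaves credential and region handling to the caller.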

Lambda integrates with 29 AWS services natively, such as Amazon S3 to invoke a function in response to resource lifecycle events, Amazon API Gateway to respond to incoming HTTP requests, Amazon SQS to consume events from a queue or Amazon EventBridge to run a function based on a schedule.

With AWS Lambda, you are billed only for the time your Lambda function is running, which is a strong advantage over compute instances that run permanently even when they are not processing requests.

Figure 1: AWS Lambda Overview

AWS Lambda use cases

AWS Lambda has many use-cases including data processing, real-time file and streaming processing, backend computing and remediation tasks.

In financial institutions, AWS Lambda is often used for remediation actions. For example, it can listen to events coming from AWS Config, a service that assesses and reports on the configurations of your AWS resources. If AWS Config detects, for example, an Amazon Elastic Compute Cloud (Amazon EC2) instance that has been launched without a required tag, the event issued by AWS Config invokes an AWS Lambda function to remediate the finding. The remediation can either add the required tag and let the instance continue to run, or stop the instance. In both cases, AWS Lambda can send a message via Amazon Simple Notification Service (Amazon SNS) to inform the operators about the action taken.
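The tagging variant of this remediation flow could look roughly like the sketch below. It assumes the function receives an AWS Config compliance-change event through Amazon EventBridge with the fields shown in `detail`, and that boto3 EC2 and SNS clients are passed in; the tag key and value, the topic ARN, and the function name are hypothetical.

```python
def remediate_missing_tag(event, ec2_client, sns_client, topic_arn,
                          tag_key="CostCenter", tag_value="unassigned"):
    """Add a required tag to a non-compliant EC2 instance and notify operators.

    Assumes an event shaped like a Config compliance-change notification:
    event["detail"] carries resourceType, resourceId and newEvaluationResult.
    """
    detail = event["detail"]
    compliance = detail["newEvaluationResult"]["complianceType"]
    if detail["resourceType"] != "AWS::EC2::Instance" or compliance != "NON_COMPLIANT":
        return "ignored"

    instance_id = detail["resourceId"]
    # Remediate by adding the required tag (hypothetical key and value).
    ec2_client.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": tag_key, "Value": tag_value}],
    )
    # Tell the operators which action was taken.
    sns_client.publish(
        TopicArn=topic_arn,
        Subject="Automated remediation: required tag added",
        Message=f"Added tag {tag_key}={tag_value} to instance {instance_id}.",
    )
    return "tagged"
```

Stopping the instance instead would follow the same structure, replacing the `create_tags` call with a call to stop the instance.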

This solution can further be extended to the usage of AWS Security Hub, which is a service that gives you aggregated visibility into your security and compliance status across multiple AWS accounts. In addition to consuming findings from Amazon services and integrated partners, Security Hub gives you the option to create custom actions, which allow a customer to manually invoke a specific response or remediation action on a specific finding. This solution has been described in a blog post from Jonathan Rau. The advantage of using AWS Security Hub with Lambda is that by creating custom actions mapped to specific finding types and by developing a corresponding Lambda function for that custom action, you can achieve targeted, automated remediation for these findings.

An example of an automated remediation solution with Security Hub can be found at ERGO Group, a leading insurer in Germany. ERGO was looking for a solution that could enable the management of security events at scale. At the same time, they needed to centralize event response and remediation in near-real time using AWS Step Functions and AWS Lambda. The goal was to improve their CIS compliance metric and overall security posture. To learn more, read the blog post by Sid Singh and Adam Sikora.

Another use-case for AWS Lambda is the loading and processing of real-time streaming data, as done by Thomson Reuters in their Product Insight application, a solution to capture, analyse, and visualise analytics data. In their application, AWS Lambda collects data from an Amazon Kinesis pipeline for streaming data and loads it into the primary dataset in Amazon S3. Lambda is also triggered by Amazon S3’s data notifications whenever new data is stored, and performs additional transformations on the primary dataset. The advantage of Lambda is that it runs code only when triggered by data via integrations with Kinesis or Amazon S3, and it charges for compute processing only when the code is running.

Financial Engines, the largest independent investment advisor in the United States in terms of assets under management, is using AWS Lambda to run their integer programming optimizer (IPO) engine without provisioning or managing any servers. They have decided to use AWS Lambda over Amazon Elastic Compute Cloud (Amazon EC2) because Lambda scales precisely with the workload. Using Lambda as the computational backend for the product applications is a very common use case in the financial services industry. As a result of using AWS Lambda, Financial Engines achieved cost savings of 90% in their infrastructure costs, scalability without provisioning or managing servers, and improved performance.

Lastly, FINRA, the Financial Industry Regulatory Authority, which oversees up to 75 billion market events daily, adopted AWS Lambda serverless computing to make data validation more efficient. In FINRA’s Order Audit Trail System (OATS), AWS Lambda is used to ensure that data is complete and correctly formatted according to a set of more than 200 rules. They chose AWS Lambda over Amazon EC2 and Amazon EMR based on scalability, data partitioning, monitoring, performance, cost, and maintenance requirements. Security was also crucial: data had to be encrypted in motion, which prevents FINRA from using plain HTTP connections to transfer information, as well as at rest using server-side managed key encryption. By adopting AWS Lambda, FINRA was able to increase cost efficiency by a factor of two compared to their previous system.

Achieving compliance with AWS Lambda

Security is a shared responsibility between AWS and our customers. AWS is responsible for protecting the infrastructure that runs AWS services in the AWS Cloud and also provides customers with services that they can use securely. The customer responsibility is determined by the AWS service that they use. On the customer’s side of the shared responsibility model, customers should first determine their requirements for network connectivity, encryption, and access to other AWS resources. We will dive deeper into those topics in the upcoming sections.

AWS Lambda falls under the scope of the following compliance programs with regard to AWS’s side of the shared responsibility model. In the following sections, we will cover topics on the customer side of the shared responsibility model.

SOC 1, 2, 3
PCI
ISMAP
FedRAMP Moderate and FedRAMP High
DoD CC SRG IL2 through IL6
HIPAA BAA
IRAP
MTCS (check regions)
C5
K-ISMS
ENS High
OSPAR
HITRUST CSF
FINMA
GMS

Isolation of compute environments with AWS Lambda

Lambda functions are executed on behalf of the customer in accounts dedicated to the Lambda service. When a Lambda function is invoked, and where an execution environment has not been allocated to that function already, Lambda creates a new execution environment on a Lambda Worker node. Lambda Worker nodes are bare metal EC2 instances with an operating system that is patched and maintained by the Lambda service team per the guidance described in the Best Practices for Security, Identity, and Compliance.

An execution environment is a collection of resources running in a dedicated micro virtual machine (MVM) on the Lambda Worker node. MVMs are created by Firecracker, an open source virtual machine monitor (VMM) that uses Linux’s Kernel-based Virtual Machine (KVM) to create and manage MVMs, and so maintains separation between execution environments with the same isolation technology as used in containers.

The resources in an execution environment are those required to support the Lambda function code. This includes the code of the particular function version, AWS Lambda Layers selected for the function version, the chosen function runtime (for example, Java, Node.js, or Python 3) or the function’s custom runtime, a writeable temporary filesystem directory, and a minimal Linux user space based on Amazon Linux 2.

Although the Lambda service guarantees isolation of execution environments, it does not isolate successive invocations within a function version’s execution environment, so one call to invoke a Lambda function may leave state that is visible to the next invocation. This feature allows for performance optimization of frequently running or concurrent Lambda functions, but it should be taken into account wherever one invocation could impact another or where security-sensitive operations are performed. This is discussed in more detail in the Data Protection section of this blog.

Figure 2: Isolation model for AWS Lambda Workers

Lambda Worker nodes have a maximum lifecycle of 14 hours and so execution environments are regularly and gracefully reaped before being allocated to new Lambda Worker nodes.

For a more detailed description of the compute environment isolation of Lambda, please read this whitepaper.

From a network perspective, Lambda functions always operate inside a VPC owned by the Lambda service. This gives the Lambda function access to AWS services as well as the public internet. Inbound access from the public internet is blocked.

Some AWS services create resources that are only accessible within your customer VPC. To access these resources, the Lambda function must be configured for access to a VPC. With a VPC configuration, the Lambda function no longer has access to the public internet by default; internet connectivity, where required, is provided by means of connectivity in that VPC. In this way, financial services customers can continue to apply their existing perimeter network controls to Lambda functions. In addition, with a VPC configuration, the Lambda function can use VPC Endpoints to enable private communications with supported AWS services.

Data protection with AWS Lambda

Lambda functions have permission to access other AWS resources by means of execution roles. Like any other role this is an AWS principal which grants permissions by virtue of identity policy statements assigned to the role. When developing the policies required by a Lambda function’s execution role it is important to consider the stage of the software development cycle that the Lambda function is operating in as the permissions allow for strict separation of duty across development and operational roles. This is particularly true of the sensitive data that the Lambda function may have to operate on such as the event payload, environment variables and sensitive data required at runtime.

Event payload

Lambda functions are invoked through the Invoke API method, so all parameters to a Lambda function, including the event, are marshaled via the AWS Lambda service. Although the Lambda service does not place these parameters in CloudTrail logs, they do have to pass through that service. To ensure that sensitive data is not made available outside of the context of the Lambda function, sensitive input parameters should be encrypted before they are submitted to the Invoke API method. Customers can use their own mechanisms to achieve this, but AWS KMS provides a simple and effective way of doing so. The caller requires permission to encrypt using a particular KMS key, and the Lambda function requires permission to decrypt using the same KMS key, granted via the Lambda execution role. Using a symmetric KMS key allows parameters of up to 4096 bytes to be encrypted in this way.
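The encrypt-before-invoke pattern described above can be sketched as two small helpers, one for the caller and one for the function. This sketch assumes a boto3 KMS client (for example `boto3.client("kms")`) is passed in as `kms_client`; the helper names and the Base64 transport encoding are choices made here for illustration.

```python
import base64

# Maximum plaintext size for direct encryption with a symmetric KMS key.
MAX_KMS_PLAINTEXT_BYTES = 4096


def encrypt_parameter(kms_client, key_id, value):
    """Caller side: encrypt a sensitive parameter before calling Invoke."""
    plaintext = value.encode("utf-8")
    if len(plaintext) > MAX_KMS_PLAINTEXT_BYTES:
        raise ValueError("parameter exceeds the 4096-byte KMS plaintext limit")
    response = kms_client.encrypt(KeyId=key_id, Plaintext=plaintext)
    # Base64 makes the ciphertext safe to embed in a JSON event payload.
    return base64.b64encode(response["CiphertextBlob"]).decode("ascii")


def decrypt_parameter(kms_client, key_id, encoded):
    """Function side: decrypt using permissions from the execution role."""
    response = kms_client.decrypt(
        KeyId=key_id,
        CiphertextBlob=base64.b64decode(encoded),
    )
    return response["Plaintext"].decode("utf-8")
```

The caller would place the returned string in the event, and the handler would call `decrypt_parameter` on that field before using it.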

Sensitive environment variables

Environment variables are always encrypted at rest, by default with an AWS managed KMS key in the same account. For more control over environment variable encryption, a customer managed key can be used instead. This gives the customer greater control over the key material, as well as different key rotation and key deletion characteristics. It also allows for finer-grained access control to environment variables by limiting what various actors in the software development lifecycle are able to do. For example, in production environments, a deployment pipeline may be able to encrypt environment variables, but only the Lambda function itself, by way of its execution role, would be able to decrypt them.

Other runtime data

As discussed earlier, Lambda functions are strongly isolated from each other and between accounts, but data can be persisted across invocations of a given function version. This persistence can occur in one of two ways: via files in the temporary filesystem directory or via globally scoped variables. Developers of Lambda functions must consider this when handling sensitive data. Ensure that sensitive information is constrained to individual invocations of the function by processing it in function handlers or using local variables; do not re-use files in the temporary directory to process un-encrypted sensitive data.
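The effect of globally scoped variables can be demonstrated with two hypothetical handlers. In this minimal sketch, the first handler stores a value at module scope, where it survives for as long as the execution environment is reused; the second keeps the value in a local variable that is discarded when the handler returns. The handler and field names are illustrative only.

```python
# Module scope: this dictionary lives as long as the execution environment
# is reused, so it is shared by successive invocations.
_previous = {}


def leaky_handler(event, context=None):
    """Anti-pattern: sensitive data stored globally leaks into later calls."""
    earlier = _previous.get("account")
    _previous["account"] = event["account"]
    return {"current": event["account"], "seen_from_earlier_call": earlier}


def isolated_handler(event, context=None):
    """Preferred: sensitive data stays in a local variable."""
    account = event["account"]  # discarded when the handler returns
    return {"current": account, "seen_from_earlier_call": None}
```

Calling `leaky_handler` twice in the same environment returns the first caller's account to the second caller, whereas `isolated_handler` never exposes data from an earlier call.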

AWS strongly recommends that you never put confidential or sensitive information into tags or other free form fields such as Name fields.

Automating audits with APIs with AWS Lambda

AWS Config

There are several AWS Config rules that can be implemented to ensure compliance with specific configurations. AWS Config monitors the configuration of resources and provides some out-of-the-box rules to alert when resources fall into a non-compliant state. AWS Config is used to track configuration changes to the Lambda functions (including deleted functions), runtime environments, tags, handler name, code size, memory allocation, timeout settings, and concurrency settings, along with Lambda IAM execution role, subnet, and security group associations. This gives you a holistic view of the Lambda function’s lifecycle and enables you to surface that data for potential audit and compliance requirements.

AWS Config has five managed rules for Lambda out of the box, helping you to quickly set up your compliance rules. The first is lambda-concurrency-check, which checks whether the AWS Lambda function is configured with a function-level concurrent execution limit. The second is lambda-dlq-check, which validates whether an AWS Lambda function is configured with a dead-letter queue. The third is lambda-function-public-access-prohibited, which checks whether the Lambda function policy prohibits public access. The fourth is lambda-function-settings-check, which validates that the AWS Lambda function settings for runtime, role, timeout, and memory size match the expected values. The last one is lambda-inside-vpc, which checks whether an AWS Lambda function is in an Amazon Virtual Private Cloud.
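The kind of comparison performed by a settings check such as lambda-function-settings-check can be illustrated in a few lines. This is a simplified sketch of the idea, not the managed rule's actual implementation; the `evaluate_function_settings` helper is hypothetical, and the dictionary keys follow the field names of a Lambda function configuration (`Runtime`, `Role`, `Timeout`, `MemorySize`).

```python
def evaluate_function_settings(configuration, expected):
    """Compare a function configuration against expected values.

    Returns a (compliance, mismatched_keys) tuple, where compliance is
    "COMPLIANT" or "NON_COMPLIANT".
    """
    mismatches = sorted(
        key for key, wanted in expected.items()
        if configuration.get(key) != wanted
    )
    compliance = "COMPLIANT" if not mismatches else "NON_COMPLIANT"
    return compliance, mismatches
```

For example, a function configured with a 900-second timeout evaluated against an expected timeout of 30 seconds would be reported as non-compliant with `Timeout` listed as the mismatched setting.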

If a resource violates the conditions of a rule, AWS Config flags the resource and the rule as noncompliant. When the compliance status of a resource changes, AWS Config sends a notification to your Amazon SNS topic.

AWS CloudTrail

With AWS CloudTrail, you can implement governance, compliance, operational auditing, and risk auditing of your entire AWS account, including Lambda. CloudTrail enables you to log, continuously monitor, and retain account activity related to actions across your AWS infrastructure, providing a complete event history of actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services.

When it comes to Lambda, there are a few key APIs that should be monitored to ensure only approved functions have been created. For Lambda functions:

CreateFunction API can be used to create a new function.
DeleteFunction API can be used to delete an existing function.
CreateEventSourceMapping API can be used to create a mapping between an event source, such as Amazon Kinesis or Amazon SQS, and a Lambda function.

Once the functions have been created, there are a few key APIs to ensure that they don’t deviate from defined standards:

The AddPermission API grants an AWS service or another account permission to use the Lambda function. Monitoring this API is crucial to ensure that only the intended accounts and services have access.
The UpdateEventSourceMapping API updates an existing event source mapping. Any change to an existing mapping can cause the function to stop processing the original data sources, which in turn leads to a failure of the data processing.
The UpdateFunctionConfiguration API modifies the configuration of an existing function. Monitoring it to validate that no unintended modifications are made is key to ensuring compliance with set standards. Changes in the configuration can affect, for example, the KMS key ARN or the timeout of the function.
The UpdateFunctionCode API can be used to update the code of an existing function. Any change in the source code can lead to a non-compliant state of the function and must be closely monitored.

These APIs should all be monitored to ensure that only appropriate actions are being made against your Lambda functions and leveraging CloudTrail can help achieve this goal.

For a complete list of Lambda APIs, including those not related to functions, review the AWS Lambda API reference.

Here is an example of what a CloudTrail log looks like for the GetFunction and DeleteFunction actions:

{
  "Records": [
    {
      "eventVersion": "1.03",
      "userIdentity": {
        "type": "IAMUser",
        "principalId": "A1B2C3D4E5F6G7EXAMPLE",
        "arn": "arn:aws:iam::999999999999:user/myUserName",
        "accountId": "999999999999",
        "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
        "userName": "myUserName"
      },
      "eventTime": "2015-03-18T19:03:36Z",
      "eventSource": "lambda.amazonaws.com",
      "eventName": "GetFunction",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "127.0.0.1",
      "userAgent": "Python-httplib2/0.8 (gzip)",
      "errorCode": "AccessDenied",
      "errorMessage": "User: arn:aws:iam::999999999999:user/myUserName is not authorized to perform: lambda:GetFunction on resource: arn:aws:lambda:us-west-2:999999999999:function:other-acct-function",
      "requestParameters": null,
      "responseElements": null,
      "requestID": "7aebcd0f-cda1-11e4-aaa2-e356da31e4ff",
      "eventID": "e92a3e85-8ecd-4d23-8074-843aabfe89bf",
      "eventType": "AwsApiCall",
      "recipientAccountId": "999999999999"
    },
    {
      "eventVersion": "1.03",
      "userIdentity": {
        "type": "IAMUser",
        "principalId": "A1B2C3D4E5F6G7EXAMPLE",
        "arn": "arn:aws:iam::999999999999:user/myUserName",
        "accountId": "999999999999",
        "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
        "userName": "myUserName"
      },
      "eventTime": "2015-03-18T19:04:42Z",
      "eventSource": "lambda.amazonaws.com",
      "eventName": "DeleteFunction",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "127.0.0.1",
      "userAgent": "Python-httplib2/0.8 (gzip)",
      "requestParameters": {
        "functionName": "basic-node-task"
      },
      "responseElements": null,
      "requestID": "a2198ecc-cda1-11e4-aaa2-e356da31e4ff",
      "eventID": "20b84ce5-730f-482e-b2b2-e8fcc87ceb22",
      "eventType": "AwsApiCall",
      "recipientAccountId": "999999999999"
    }
  ]
}
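A log in this format can be scanned for the APIs listed above with a short script. The following is a minimal sketch: the `find_monitored_events` helper is hypothetical, and the pattern assumes that CloudTrail may record Lambda event names either as the plain API name or with a date-like version suffix (for example an eight-digit date, optionally followed by `v2`).

```python
import json
import re

# The APIs this post recommends monitoring.
MONITORED_APIS = {
    "CreateFunction", "DeleteFunction", "CreateEventSourceMapping",
    "AddPermission", "UpdateEventSourceMapping",
    "UpdateFunctionConfiguration", "UpdateFunctionCode",
}

# Assumption: event names may carry a suffix such as 20150331 or 20150331v2.
_EVENT_NAME = re.compile(
    r"(?P<api>" + "|".join(sorted(MONITORED_APIS)) + r")(?:\d{8}(?:v\d+)?)?"
)


def find_monitored_events(log_text):
    """Return a summary of each monitored Lambda API call in a CloudTrail log."""
    findings = []
    for record in json.loads(log_text).get("Records", []):
        if record.get("eventSource") != "lambda.amazonaws.com":
            continue
        match = _EVENT_NAME.fullmatch(record.get("eventName", ""))
        if match is None:
            continue
        findings.append({
            "api": match.group("api"),
            "user": record.get("userIdentity", {}).get("arn"),
            "time": record.get("eventTime"),
            "denied": "errorCode" in record,  # the call was rejected
        })
    return findings
```

Applied to the sample log above, the script would skip the GetFunction record and report the DeleteFunction call together with the calling user and timestamp.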

Operational access and security with AWS Lambda

When applying access controls to their Lambda functions, customers need to consider the following areas to ensure least privilege:

Execution role – permissions granted to the Lambda function
User policies – permissions associated with the identity of the caller
Resource policies – resource-based permissions associated with Lambda function and layers
Permissions boundaries – a mechanism for safely delegating permission management that places a limit on the maximum permissions a policy can grant

The Lambda service assumes an execution role when your function is invoked. This execution role is provided when you create a function; when a Lambda function is created in the Lambda console, an execution role is created with minimal permissions. Execution roles have been discussed as a means to apply data protection controls, but as the Lambda function is developed you might grant permissions to the execution role beyond what is required. Before publishing the Lambda function to production environments, it is important to ensure that the execution role has been granted only least-privilege permissions. Use IAM Access Analyzer to help identify the required permissions for the IAM execution role policy.

When creating identity-based policies for principals accessing the Lambda function, it is important to remember that Lambda functions and Lambda layers can have a version element in their ARNs; for functions, this element can also be an alias. In this way, principals can have different access depending on the function or layer version.
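A policy scoped to one qualified function ARN can be generated as in the sketch below. The `invoke_policy_for_qualifier` helper, the statement ID, and all example values are hypothetical; the ARN layout with a trailing version or alias qualifier follows the standard Lambda function ARN format.

```python
import json


def invoke_policy_for_qualifier(region, account_id, function_name, qualifier):
    """Build an identity-based policy that allows invoking one version or alias."""
    qualified_arn = (
        f"arn:aws:lambda:{region}:{account_id}:function:{function_name}:{qualifier}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeSingleQualifierOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": "lambda:InvokeFunction",
                "Resource": qualified_arn,
            }
        ],
    }


if __name__ == "__main__":
    # Example: restrict a principal to the PROD alias of a function.
    policy = invoke_policy_for_qualifier("us-east-1", "111122223333", "example-fn", "PROD")
    print(json.dumps(policy, indent=2))
```

Because the resource names the alias explicitly, a principal holding only this policy could invoke the PROD alias but not other versions or aliases of the same function.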

Permissions boundaries are important when creating AWS Lambda applications. The permissions boundary specified in the application template limits the scope of the execution role that the template creates for each of its functions. In this way developers with write access to the application’s Git repository are prevented from escalating the application’s permissions beyond the scope of its own resources or delegating permission management to developers.

Code signing

Customers can get greater assurance that the deployed code is what was intended by using code that has been signed by the AWS Signer service. When code signing is enabled for a particular function, only functions and layers that have been signed by the AWS Signer profile configured for that function can be deployed. AWS Signer would form part of the code development process for code that has passed all quality gates and been approved for production release.
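Enabling code signing for a function could be scripted roughly as follows. This is a minimal sketch that assumes a boto3 Lambda client is passed in as `lambda_client` and that an AWS Signer signing profile already exists; the helper name, the description text, and the choice of the `Enforce` policy are assumptions made for illustration.

```python
def enforce_code_signing(lambda_client, function_name, signing_profile_version_arn):
    """Create a code signing configuration and attach it to a function."""
    config = lambda_client.create_code_signing_config(
        Description="Only deploy code signed by the approved profile",
        AllowedPublishers={
            "SigningProfileVersionArns": [signing_profile_version_arn],
        },
        # "Enforce" blocks deployment when signature checks fail;
        # "Warn" would allow the deployment and only log a warning.
        CodeSigningPolicies={"UntrustedArtifactOnDeployment": "Enforce"},
    )
    config_arn = config["CodeSigningConfig"]["CodeSigningConfigArn"]
    lambda_client.put_function_code_signing_config(
        CodeSigningConfigArn=config_arn,
        FunctionName=function_name,
    )
    return config_arn
```

Once the configuration is attached, deployments to that function would need to use packages signed through the listed signing profile.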

Conclusion

In this post, we reviewed AWS Lambda and highlighted key information that can help FSI customers accelerate the approval of the service within these five categories: achieving compliance, data protection, isolation of compute environments, automating audits with APIs, and operational access and security. While not a one-size-fits-all approach, the guidance can be adapted to meet your organization’s security and compliance requirements and provides a consolidated list of key areas for AWS Lambda.

In the meantime, be sure to visit our AWS Industries blog channel and stay tuned for more financial services news and best practices.