AWS for Industries

FSI Services Spotlight: Featuring Amazon EKS

In this edition of the Financial Services Industry (FSI) Services Spotlight monthly blog series, we highlight five key considerations for Amazon Elastic Kubernetes Service (Amazon EKS): achieving compliance, data protection, isolation of compute environments, automating audits with APIs, and operational access and security. Each of the five areas includes specific guidance, suggested reference architectures, and code examples that can help streamline service approval of Amazon EKS, all of which may need to be adapted to your specific use case and environment.

Amazon EKS gives you the flexibility to start, run, and scale Kubernetes applications in the AWS cloud or on premises. EKS helps you provide highly available and secure clusters and automates key tasks such as patching, node provisioning, and updates. AWS customers include JPMorgan Chase, who uses EKS to run their risk calculations; Fidelity, who uses EKS for their Platform-as-a-Service; and Snap, Intuit, GoDaddy, and Autodesk, who trust EKS to run their sensitive and mission-critical applications.

EKS runs upstream Kubernetes and is certified Kubernetes conformant for a predictable experience. You can easily migrate any standard Kubernetes application to EKS without needing to refactor your code.

The security and compliance of Amazon EKS is assessed as part of multiple AWS compliance programs, including SOC, PCI DSS, ISO, and HIPAA eligibility; refer to AWS Services in Scope by Compliance Program for the current list.

Security is a shared responsibility between AWS and the customer under the AWS shared responsibility model, and customers need to ensure that workloads running in the AWS Cloud use the appropriate security controls to meet their compliance needs and security posture. AWS customers are responsible for their security in the cloud: they control and manage the security of their content, applications, systems, and networks. AWS manages security of the cloud: providing and maintaining proper operation of services and features, protecting AWS infrastructure and services, maintaining operational excellence, and meeting relevant legal and regulatory requirements.

Amazon EKS has several different deployment methodologies, and the level of shared responsibility in the Kubernetes stack shifts to AWS as the Kubernetes cluster becomes more managed. AWS Fargate with EKS, for example, shifts more management responsibility, such as worker node OS management, to AWS. This blog focuses on EKS with self-managed worker nodes, which provides customers with the most flexibility and the ability to manage lower-level components in their Kubernetes clusters. The shared responsibility model, when running EKS with self-managed worker nodes, is outlined in the following diagram:

Figure 1: Shared responsibility model when running EKS with self-managed worker nodes

The EKS control plane is located in an AWS-managed Virtual Private Cloud (VPC), shown on the right side of the diagram. Amazon EKS runs a dedicated (single-tenant) Kubernetes control plane for each cluster. The control plane infrastructure is not shared across clusters or AWS accounts. The control plane consists of at least two API server instances and three etcd instances that run across three Availability Zones within a Region. The customer is responsible for managing the Kubernetes role-based access control (RBAC) policies on their EKS clusters.

Data Protection with EKS

Encryption is a commonly used mechanism to protect data in transit and at rest. Customers should consider encryption a best practice for all workloads, whether or not they are highly sensitive.

Data in Transit

The Kubernetes management API of Amazon EKS clusters is available only over Transport Layer Security (TLS) version 1.2 and higher. When managing EKS clusters through the AWS API endpoints, encryption of data in transit is provided by TLS 1.0 or higher. We recommend enforcing a minimum of TLS 1.2, which is configurable within your TLS clients.
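As a quick check, you can verify the negotiated TLS version against a cluster endpoint with a standard client such as openssl; the endpoint hostname below is hypothetical:

# Attempt a TLS 1.2 handshake against a hypothetical EKS cluster endpoint;
# a successful handshake prints the negotiated protocol and cipher.
openssl s_client -connect EXAMPLE1234.gr7.us-west-2.eks.amazonaws.com:443 -tls1_2 </dev/null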

EKS clusters integrate with Network Load Balancers (NLBs) to provide encryption of data in transit for communication with applications deployed on EKS from outside the cluster. Regulated workloads demand end-to-end encryption of data in transit across the entire application stack; TLS can be used to encrypt data in transit between Kubernetes ingress controllers and the pods they target. Customers can use a service mesh such as AWS App Mesh to encrypt traffic between EKS services running on the same or different EKS clusters using mutual TLS (mTLS), without needing to add encryption to their application code or configuration.

Data at Rest

AWS Key Management Service (AWS KMS) makes it easy for you to protect data at rest by creating and managing cryptographic keys and controlling their use across a wide range of AWS services and in your applications. We recommend customers use AWS KMS in EKS environments to provide a defense-in-depth approach to security. AWS KMS supports CloudTrail for auditability and can automate yearly key rotation with a customer managed AWS KMS key (formerly known as CMK) to meet compliance requirements.

Amazon EKS allows customers to use AWS KMS to encrypt Kubernetes secrets stored within their Kubernetes clusters, protecting them from unauthorized access through envelope encryption with KMS keys. AWS KMS can also be used to encrypt the Amazon Elastic Block Store (Amazon EBS) volumes, Amazon Elastic File System (Amazon EFS) file systems, and Amazon FSx for Lustre file systems used by EKS clusters. Container images can be encrypted and stored in Amazon Elastic Container Registry (Amazon ECR).
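For example, envelope encryption of Kubernetes secrets can be enabled on an existing cluster through the AWS CLI. The following is a minimal sketch assuming a placeholder cluster name and KMS key ARN:

# Enable envelope encryption of Kubernetes secrets with a customer managed KMS key
aws eks associate-encryption-config \
  --cluster-name <cluster-name> \
  --encryption-config '[{"resources":["secrets"],"provider":{"keyArn":"arn:aws:kms:us-west-2:111122223333:key/EXAMPLE-KEY-ID"}}]'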

Isolation of compute environments with EKS

Amazon EKS service API

The EKS service API is accessible over the internet at Region-specific URLs and is used for actions such as creating and deleting EKS clusters and scheduling cluster version upgrades. Access to the EKS service API is managed with AWS IAM.

The aws eks commands from the AWS CLI and the eksctl tool call the Amazon EKS service API, while kubectl commands call the Kubernetes cluster management API.
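To illustrate the distinction, the following commands (with placeholder cluster and Region names) call the EKS service API first and the Kubernetes cluster management API second:

# Calls the Amazon EKS service API, authenticated and authorized with AWS IAM
aws eks describe-cluster --name <cluster-name> --region <region-code>

# Writes a kubeconfig entry so kubectl can reach the cluster management API
aws eks update-kubeconfig --name <cluster-name> --region <region-code>

# Calls the Kubernetes cluster management API, authorized with Kubernetes RBAC
kubectl get nodes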

Amazon EKS has two compute components: control plane nodes and worker nodes.

Control plane nodes

Kubernetes control plane nodes are managed by AWS and are hosted in an Amazon Virtual Private Cloud (Amazon VPC) that is protected by the AWS global network security procedures described in the Introduction to AWS Security whitepaper.

Customers can choose to expose the cluster management API endpoint of the EKS cluster over the internet, with an allow list of CIDR blocks that permits connections only from approved network locations, or with a private IP address inside a customer-managed Amazon VPC. When configured to use a VPC, customers can use security groups (SGs) and network access control lists (NACLs) to control connectivity to the EKS cluster endpoint. EKS clusters can be deployed with both an internet-facing and a private VPC API endpoint.

Private EKS cluster API endpoints can be accessed from on premises via AWS Direct Connect or AWS Site-to-Site VPN, or from within an Amazon VPC via an Amazon EC2 bastion host or an AWS Cloud9 IDE.

Worker nodes require access to the EKS cluster endpoint to communicate with the Kubernetes control plane. We recommend deploying the EKS cluster API with an in-VPC private endpoint, as that simplifies connectivity. See this blog for details on the various ways to configure the Amazon VPC to run EC2 worker nodes for your Kubernetes cluster managed by Amazon EKS.
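Endpoint access can be changed on an existing cluster with the AWS CLI. The following sketch (with placeholder values) disables the public endpoint and enables the private, in-VPC endpoint:

# Restrict the cluster management API to the private, in-VPC endpoint
aws eks update-cluster-config \
  --region <region-code> \
  --name <cluster-name> \
  --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true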

Worker nodes

EKS worker nodes run in a customer-managed Amazon VPC, where the customer has full control over the network. Customers control what their workloads running on EKS worker nodes can access over the network by using Amazon VPC network security controls such as SGs, NACLs, and route tables.

Customers have full control over traffic between pods running on their EKS nodes via Kubernetes configuration. In Kubernetes, pod-to-pod traffic is open by default. The following example network policy restricts all communication between pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

The Amazon VPC Container Network Interface (CNI) plugin for Kubernetes assigns each pod an IP address from your Amazon VPC. The CNI plugin allows for better separation of worker node and pod traffic by placing pods on a second elastic network interface (ENI) in a different VPC CIDR. Kubernetes network policies operate at layers 3 and 4 of the OSI model. Network policies use pod selectors and labels to identify source and destination pods, but can also include IP addresses, port numbers, protocol numbers, or a combination of these. Policies should be designed to restrict network traffic to only what the workloads and pods require. Customers can use Calico, an open-source policy engine from Tigera that works well with EKS and can provide layer 3-7 network security when operated with Istio. A network policy such as "deny from other namespaces", shown below, provides a layer of isolation between workloads running on the same cluster.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  namespace: default
  name: deny-from-other-namespaces
spec:
  podSelector:
    matchLabels:
  ingress:
  - from:
    - podSelector: {}

Customers can also leverage security groups for pods on Nitro-based EC2 instances to control pod-level network access, and can grant pods access to other AWS resources through IAM Roles for Service Accounts (IRSA). IRSA uses AWS IAM by way of OpenID Connect (OIDC) to provide fine-grained roles at the pod level rather than the node level. By default, the Amazon EC2 instance metadata service (IMDS) provides the credentials assigned to the node IAM role to the instance and to any pod running on the instance. It is a best practice to remove pod access to these credentials, for example with network policies, and to grant pods least-privilege access through IRSA instead.
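As an illustration, once an IAM OIDC provider has been associated with the cluster, pods assume a fine-grained IAM role by running under an annotated service account. The service account and role names below are hypothetical:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: example-app
  namespace: default
  annotations:
    # Pods using this service account receive temporary credentials
    # for this IAM role through the cluster's OIDC provider
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/ExampleAppRole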

The following is an example of a restrictive policy at the pod level (note that seLinux, runAsUser, supplementalGroups, and fsGroup are required PodSecurityPolicy fields, set permissively here):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrictive
spec:
  privileged: false
  allowPrivilegeEscalation: false
  hostNetwork: false
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny

Note: Pod Security Policy (PSP) is deprecated as of Kubernetes 1.21 and will be removed in Kubernetes 1.25.

AWS recommends, as a best practice, that customers implement policy as code to place guardrails around network and security policies and improve their maintainability. In planning for the future, customers should evaluate replacements for Pod Security Policy (PSP), such as OPA Gatekeeper.
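As a minimal sketch of the policy-as-code approach, the following hypothetical Gatekeeper ConstraintTemplate (assuming Gatekeeper is installed in the cluster) rejects privileged containers, mirroring the intent of the PSP above:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sdenyprivileged
spec:
  crd:
    spec:
      names:
        kind: K8sDenyPrivileged
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdenyprivileged
        # Reject any pod whose spec contains a privileged container
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          container.securityContext.privileged
          msg := sprintf("privileged container is not allowed: %v", [container.name])
        }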

Automating audits with APIs with EKS

AWS Config monitors the configuration of resources and allows customers to check their AWS resources for compliance with their security requirements. AWS Config comes with built-in managed rules for EKS and also allows customers to write their own custom rules. We recommend customers enable the built-in rules that verify clusters are not publicly accessible and that Kubernetes secrets are encrypted using KMS keys.
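For example, the managed rule that checks secret encryption can be enabled with the AWS CLI; this sketch uses the managed rule identifier EKS_SECRETS_ENCRYPTED (a similar managed rule, EKS_ENDPOINT_NO_PUBLIC_ACCESS, checks endpoint exposure):

# Enable the AWS managed Config rule that checks whether EKS clusters
# encrypt Kubernetes secrets with a KMS key
aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "eks-secrets-encrypted",
  "Source": {"Owner": "AWS", "SourceIdentifier": "EKS_SECRETS_ENCRYPTED"}
}'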

Calls to the Amazon EKS API are logged in AWS CloudTrail. Create a trail to retain a record of this activity for a specified amount of time and meet compliance requirements. CloudTrail logs all management events for EKS that are performed through the AWS API endpoint, such as (but not limited to) CreateCluster, UpdateClusterConfig, and DescribeCluster. Every event or log entry contains information about who generated the request. The following is an example CloudTrail log entry for a CreateCluster call:

{
    "eventVersion": "1.08",
    "userIdentity": {
        "type": "AssumedRole",
        "principalId": "AROAUGEXAMPLERXKAXXLD:liwadman-Isengard",
        "arn": "arn:aws:sts::111122223333:assumed-role/ExampleRole/Example",
        "accountId": "111122223333",
        "accessKeyId": "ASIAUGWEXAMPLEALVHGE",
        "sessionContext": {
            "sessionIssuer": {
                "type": "Role",
                "principalId": "AROAUGEXAMPLERXKAXXLD",
                "arn": "arn:aws:iam::111122223333:role/ExampleRole",
                "accountId": "289260094723",
                "userName": "ExampleAdmin"
            },
            "webIdFederationData": {},
            "attributes": {
                "mfaAuthenticated": "false",
                "creationDate": "2021-06-11T21:29:19Z"
            }
        }
    },
    "eventTime": "2021-06-11T21:33:24Z",
    "eventSource": "eks.amazonaws.com",
    "eventName": "CreateCluster",
    "awsRegion": "us-west-2",
    "sourceIPAddress": "111.222.122.223",
    "userAgent": "aws-internal/3 aws-sdk-java/1.11.1030 Linux/5.4.109-57.182.amzn2int.x86_64 OpenJDK_64-Bit_Server_VM/25.292-b10 java/1.8.0_292 vendor/Oracle_Corporation cfg/retry-mode/legacy",
    "requestParameters": {
        "resourcesVpcConfig": {
            "subnetIds": [
                "subnet-cEXAMPLE",
                "subnet-bEXAMPLE",
                "subnet-eEXAMPLE",
                "subnet-aEXAMPLE"
            ],
            "securityGroupIds": [],
            "endpointPublicAccess": true,
            "endpointPrivateAccess": false
        },
        "clientRequestToken": "3Example-3a1b-4a26-881a-899f72a516db",
        "roleArn": "arn:aws:iam::111122223333:role/EKSClusterRole",
        "name": "ExampleCluster",
        "logging": {
            "clusterLogging": [
                {
                    "enabled": true,
                    "types": [
                        "api",
                        "audit",
                        "authenticator",
                        "controllerManager",
                        "scheduler"
                    ]
                },
                {
                    "enabled": false,
                    "types": []
                }
            ]
        },
        "version": "1.19",
        "tags": {}
    },
    "responseElements": {
        "cluster": {
            "name": "ExampleCluster",
            "arn": "arn:aws:eks:us-west-2:111122223333:cluster/ExampleCluster",
            "createdAt": 1623447204.24,
            "version": "1.19",
            "roleArn": "arn:aws:iam::111122223333:role/EKSClusterRole",
            "resourcesVpcConfig": {
                "subnetIds": [
                    "subnet-c90cab83",
                    "subnet-bb9712c3",
                    "subnet-ec6df9b1",
                    "subnet-6afaf641"
                ],
                "securityGroupIds": [],
                "vpcId": "vpc-4Example",
                "endpointPublicAccess": true,
                "endpointPrivateAccess": false,
                "publicAccessCidrs": [
                    "1.2.3.4/32"
                ]
            },
            "kubernetesNetworkConfig": {
                "serviceIpv4Cidr": "10.100.0.0/16"
            },
            "logging": {
                "clusterLogging": [
                    {
                        "types": [
                            "api",
                            "audit",
                            "authenticator",
                            "controllerManager",
                            "scheduler"
                        ],
                        "enabled": true
                    }
                ]
            },
            "status": "CREATING",
            "certificateAuthority": {},
            "platformVersion": "eks.5",
            "tags": {}
        }
    },
    "requestID": "aEXAMPLE-5dc3-4da8-aa73-c32537068e47",
    "eventID": "5EXAMPLE-a9ec-43d8-9cfd-e859d1b7b254",
    "readOnly": false,
    "eventType": "AwsApiCall",
    "managementEvent": true,
    "eventCategory": "Management",
    "recipientAccountId": "111122223333"
}

Customers can specify which logs from EKS clusters are sent to Amazon CloudWatch, where they can be viewed or downloaded. AWS recommends that customers enable control plane logging on all EKS clusters.

Customers can use the following CLI command to turn on all cluster logging:

aws eks update-cluster-config \
--region <region-code> \
--name <cluster-name> \
--logging '{"clusterLogging":[{"types":["api","audit","authenticator","controllerManager","scheduler"],"enabled":true}]}'

Logs sent to CloudWatch include Kubernetes API server logs, audit logs, authenticator logs, controller manager logs, and scheduler logs.
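Control plane logs are delivered to a log group named /aws/eks/<cluster-name>/cluster. As a quick sketch, the API server audit log streams can be located with the CloudWatch Logs CLI (the cluster name is a placeholder):

# List the API server audit log streams in the cluster's control plane log group
aws logs describe-log-streams \
  --log-group-name /aws/eks/<cluster-name>/cluster \
  --log-stream-name-prefix kube-apiserver-audit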

AWS Audit Manager allows customers to simplify auditing their AWS environments. Audit Manager automates the collection of evidence for audits and can capture the relevant CloudTrail logs and AWS Config history of EKS clusters for use in audits. For example, AWS Audit Manager could be used to capture all the CloudTrail logs for the CreateCluster and UpdateClusterConfig actions to show the configuration history of your EKS clusters. This may be useful when attempting to prove compliance over time with PCI DSS control 1.3, which requires that cardholder data environment systems be restricted from direct internet access.

The following screenshots illustrate the configuration of a custom control's data source for a custom compliance framework in AWS Audit Manager. This control records Amazon EKS CreateCluster and UpdateClusterConfig actions, which could prove that the EKS cluster control plane has never been configured to accept connections directly from the internet.

Operational access and security with EKS

AWS customers in the financial services industry require detailed logging and assurance regarding access to critical data. Customers can review third-party auditor reports, such as the AWS SOC 2 Type II report, ISO 27001 certification, and others, in AWS Artifact.

Control plane access

Amazon EKS uses IAM to provide authentication to your Kubernetes cluster, but it still relies on native Kubernetes role-based access control (RBAC) for authorization. This means that IAM is used only for authentication of valid IAM entities; all permissions for interacting with your Amazon EKS cluster's Kubernetes API are managed through the native Kubernetes RBAC system.

When an EKS cluster is created, the IAM role or user that created it is granted the system:masters entitlement, which provides full administrative access to the Kubernetes cluster. Customers have full control of their EKS cluster configurations and can specify which additional IAM roles or users have administrative access to their EKS clusters through Kubernetes configuration. The following example aws-auth ConfigMap grants the system:masters entitlement to the two IAM roles listed:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/exampleCICDRole
      username: CICD
      groups:
        - system:masters
    - rolearn: arn:aws:iam::111122223333:role/exampleops-user
      username: ops-user
      groups:
        - system:masters

AWS recommends creating the cluster with a dedicated IAM role and regularly auditing who can assume this role. This role should not be used to perform routine actions on the cluster, and it will not appear in the aws-auth ConfigMap; instead, additional users should be granted access through the aws-auth ConfigMap for routine operations. After the aws-auth ConfigMap is configured, the role can be deleted and recreated only in an emergency or break-glass scenario. This can be particularly useful for reducing administrative access to production clusters. Customers can also leverage an existing identity provider and identity management lifecycle through an OIDC configuration to manage user access to a cluster, reducing the need for AWS IAM principals.
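To audit which IAM principals currently have access to the cluster, the aws-auth ConfigMap can be inspected directly:

# Review the IAM role and user mappings that grant access to the cluster
kubectl get configmap aws-auth -n kube-system -o yaml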

Worker node access

Customers can also restrict their Amazon EKS worker nodes to instance types that are built on the AWS Nitro System. The Nitro System is a collection of AWS-built hardware and software components that enable high performance, high availability, and high security. The Nitro System's security model is locked down and prohibits interactive access, reducing the possibility of human error and tampering. The list of EC2 instances built on the Nitro System can be found here.

Conclusion

In this post, we reviewed Amazon EKS and highlighted key information that can help FSI customers accelerate the approval of the service within these five categories: achieving compliance, data protection, isolation of compute environments, automating audits with APIs, and operational access and security. While not a one-size-fits-all approach, the guidance provided can be adapted to meet your organization’s security and compliance requirements and provide a consolidated list of key areas to focus on for Amazon EKS.

In the meantime, be sure to visit our AWS Industries blog channel and stay tuned for more financial services news and best practices.

Tim Murphy

Tim Murphy is a Senior Solutions Architect for AWS, working with enterprise customers in various industries to build business-based solutions in the cloud. He has spent the last decade working with startups, nonprofits, commercial enterprises, and government agencies, deploying infrastructure at scale. In his spare time, when he isn't tinkering with technology, you'll most likely find him in far-flung areas of the earth hiking mountains, surfing waves, or biking through a new city.

Liam Wadman

Liam Wadman is a Security Solutions Architect based in Vancouver. He works with large financial institutions to create secure architectures in AWS. Liam is often found mountain biking when he is not reading IETF RFCs.