Implementing Runtime security in Amazon EKS using CNCF Falco
Many organizations are in the process of migrating their applications to containers. Containers provide application-level dependency management, speedy launches, and support immutability. This can help reduce costs, increase velocity, and improve efficiency. To manage the container lifecycle securely, container image hardening and end-to-end security checks are critical factors. Containers need to be secured by default before they are deployed into a container orchestrator, such as Amazon Elastic Kubernetes Service (Amazon EKS). The journey of hardening containers proceeds as follows:
- Lint your Dockerfile.
- Build the image with the linted Dockerfile or Docker Compose file.
- Perform static container image scanning.
- Verify the vulnerabilities.
- Have a manual approval process.
- Deploy to the orchestrator, Amazon ECS or Amazon EKS.
- Enable dynamic image scanning on containers and analyze the logs regularly.
Let’s first cover what static and dynamic scans are to better understand the pipeline flow:
- A static scan is a deep scan of the container layers before they are used or deployed. The container image is scanned against public bug or CVE databases.
- A dynamic scan is a deep scan of the container layers after they are deployed or while they are running. This methodology can scan and publish results as required, or analyze the logs continuously while the container is running. Multiple products on the market fall under dynamic scanning tools, such as CNCF Falco, Twistlock, and Aqua.
To learn more about this topic, check out this post on Container DevSecOps on Amazon ECS Fargate with AWS CodePipeline. In this post, we show you how to build, install, and use runtime security with CNCF Falco on Amazon EKS. The demo uses a static scan methodology, performs a deep container scan for any vulnerabilities or issues, and finally deploys the containers to Amazon EKS.
We will be using the following AWS services and open source tools for this post:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Amazon CloudWatch
- FireLens
- AWS CloudFormation
- AWS CLI
- CNCF Falco
- falcosecurity/falco
Set up your Amazon EKS cluster
Before we set up an Amazon EKS cluster, please set up the tools listed below on your system. Detailed setup instructions for each operating system are provided in each tool's documentation:
- eksctl
- kubectl
- AWS CLI
- Helm
- jq
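To confirm that each tool is installed and on your PATH, you can run its version command (output will vary by installed version):
eksctl version
kubectl version --client
aws --version
helm version
jq --version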
Create a sample Amazon EKS cluster configuration file called cluster-config.yaml (see below). This file will be used to deploy the Amazon EKS cluster. We can deploy the Amazon EKS cluster into an existing or a new VPC. I have used an existing VPC to set up the cluster with managed node groups in both public and private subnets.
You can go to the eksctl.io page for many ClusterConfig samples. Below is one of the configuration files you can use. In this demo, we show how to build the cluster with pre-existing resources. Visit the eksctl docs to understand the ClusterConfig schema elements.
cluster-config.yaml file:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: eks-managed-cluster
  region: ap-south-1
vpc:
  id: "vpc-xxxxxxxxxx"          # Provide the VPC ID
  cidr: "xxxxxxxxxxxx"          # Provide the VPC CIDR range
  subnets:
    public:
      ap-south-1a:
        id: "subnet-xxxxxxxx"   # Provide the subnet ID
        cidr: "xxxxxxxxxxxx"    # Provide the subnet CIDR range
      ap-south-1b:
        id: "subnet-xxxxxxxx"   # Provide the subnet ID
        cidr: "xxxxxxxxxxxx"    # Provide the subnet CIDR range
# Provide the service role for the EKS cluster
#iam:
#  serviceRoleARN: "arn:aws:iam::11111:role/eks-base-service-role"
# The schema elements below build non-EKS-managed node groups
#nodeGroups:
#  - name: ng-1
#    instanceType: m5.large
#    desiredCapacity: 3
#    iam:
#      instanceProfileARN: "arn:aws:iam::11111:instance-profile/eks-nodes-base-role"
#      instanceRoleARN: "arn:aws:iam::1111:role/eks-nodes-base-role"
#    privateNetworking: true
#    securityGroups:
#      withShared: true
#      withLocal: true
#      attachIDs: ['sg-xxxxxx', 'sg-xxxxxx']
#    ssh:
#      publicKeyName: 'my-instance-public-key'
#    tags:
#      'environment:basedomain': 'example.org'
# The schema elements below build EKS managed node groups
managedNodeGroups:
  - name: eks-managed-ng-1      # Provide the name of the node group
    minSize: 1                  # Auto Scaling group configuration
    maxSize: 2                  # Auto Scaling group configuration
    instanceType: t2.small      # Size and type of the worker nodes
    desiredCapacity: 1          # Auto Scaling group configuration
    volumeSize: 20              # Worker node volume size
    ssh:
      allow: true               # You can use the provided public key to log on to the worker nodes.
      publicKeyPath: ~/.ssh/id_rsa.pub
      # sourceSecurityGroupIds: ["sg-xxxxxxxxxxx"] # OPTIONAL
    labels: {role: worker}
    tags:
      nodegroup-role: worker
    iam:
      withAddonPolicies:
        externalDNS: true
        certManager: true
    # Provide the role ARN to be attached to instances
    # iam:
    #   instanceRoleARN: "arn:aws:iam::1111:role/eks-nodes-base-role"
Run the below command to create the Amazon EKS cluster.
eksctl create cluster -f cluster-config.yaml
You can find the cluster resources created by AWS CloudFormation as shown below:
You can go to the Amazon EKS page in the AWS Management Console and check the status of the cluster creation as shown below:
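Alternatively, you can check the status from the command line, for example (assuming the cluster name and region from cluster-config.yaml):
aws eks describe-cluster --name eks-managed-cluster --region ap-south-1 --query "cluster.status"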
Set up a sample deployment on the Amazon EKS cluster
Create a new configuration file called deployment.yaml for your sample application. We will deploy sample NGINX website pods on the public subnets that we provided in the cluster configuration file cluster-config.yaml.
Check the below sample deployment.yaml file.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx2
  labels:
    app: nginx2
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx2
  template:
    metadata:
      labels:
        app: nginx2
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: beta.kubernetes.io/arch
                operator: In
                values:
                - amd64
                - arm64
      containers:
      - name: nginx
        image: nginx:1.19.2
        ports:
        - containerPort: 80
Now deploy NGINX as shown below:
kubectl apply -f deployment.yaml
You can verify the deployment status with the kubectl command.
kubectl get deployments --all-namespaces
You should see the following output.
NAMESPACE     NAME      READY   UP-TO-DATE   AVAILABLE   AGE
default       nginx2    3/3     3            3           46d
kube-system   coredns   2/2     2            2           46d
Set up Falco runtime security
We will install the well-known runtime security tool CNCF Falco for deep analysis of container security events and alerting. Falco works in conjunction with other AWS services, such as FireLens and Amazon CloudWatch. FireLens is a log aggregator that can collect and send container logs to many services inside the Amazon ecosystem, such as Amazon CloudWatch, for further analysis and alerting. FireLens uses Fluent Bit or Fluentd behind the scenes and supports all features and configurations of both products. We can also send the FireLens log output to external logging and analytics services.
Amazon CloudWatch is a monitoring, alerting, and analytics service that provides deep insights into the services from which it receives logs. We can create custom dashboards, metrics, alerts, and insights on the logs. Please check the Amazon CloudWatch documentation here. Falco uses FireLens and Amazon CloudWatch as follows:
1. Falco continuously scans the containers running in the pods and emits security, debug, or audit events in JSON format to STDOUT.
2. FireLens collects the JSON logs and processes them as per the Fluent Bit configuration files (a trimmed sketch follows this list).
3. After log transformation by the Fluent Bit containers, the logs are sent to Amazon CloudWatch as the final destination.
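For orientation, here is a trimmed sketch of what such a Fluent Bit configuration may look like. The real ConfigMap ships in the repository cloned in the next section; the path, parser, tag, and log group name here are illustrative assumptions:
[INPUT]
    Name              tail
    Tag               falco.*
    Path              /var/log/containers/falco*.log    # tail the Falco container logs (illustrative path)
    Parser            docker

[OUTPUT]
    Name              cloudwatch                        # Fluent Bit CloudWatch output plugin
    Match             falco.**
    region            ap-south-1
    log_group_name    falco                             # illustrative log group name
    log_stream_prefix alert-
    auto_create_group true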
This blog post explains in depth how to install Falco and how it works with other AWS services.
Clone the Falco repository
Git clone this repository: https://github.com/sysdiglabs/falco-aws-firelens-integration
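For example:
git clone https://github.com/sysdiglabs/falco-aws-firelens-integration.git
cd falco-aws-firelens-integration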
Now go to the eks/fluent-bit directory, where you will find two directories called aws and kubernetes.
aws – This directory has the IAM policy called iam_role_policy.json, which we will attach to the worker node VM’s role (the role automatically attached to the worker nodes when we create or deploy an EKS cluster). This policy allows Falco running on the worker nodes to send/stream logs to Amazon CloudWatch.
kubernetes – This directory has three files: configmap.yaml, daemonset.yaml, and service-account.yaml. These files create a ConfigMap for the Fluent Bit configuration, a Fluent Bit DaemonSet that runs on all worker nodes, and a service account with an RBAC cluster role for authorization. All of these files can be applied at once.
This Falco blog post explains the same steps for installing standard Falco. We will attach the IAM policy to the node instances to give them permission to stream logs to Amazon CloudWatch, as shown below.
Set up Falco with IAM permissions
aws iam create-policy --policy-name EKS-CloudWatchLogs --policy-document file://./fluent-bit/aws/iam_role_policy.json
This creates a policy called EKS-CloudWatchLogs with privileges to send logs to Amazon CloudWatch.
aws iam attach-role-policy --role-name <EKS-NODE-ROLE-NAME> --policy-arn `aws iam list-policies | jq -r '.[][] | select(.PolicyName == "EKS-CloudWatchLogs") | .Arn'`
NOTE: “EKS-NODE-ROLE-NAME” is the role that is attached to the worker nodes. For example, in my case, after setting up the EKS cluster, I see that eksctl-eks-managed-cluster-nodegr-NodeInstanceRole-1T0251NJ7YV04 is the role attached to the nodes.
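If you are unsure of the generated role name, one way to look it up is to filter IAM roles for the node instance role pattern (a convenience query, assuming eksctl’s default naming):
aws iam list-roles --query "Roles[?contains(RoleName, 'NodeInstanceRole')].RoleName" --output text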
aws iam attach-role-policy --role-name eksctl-eks-managed-cluster-nodegr-NodeInstanceRole-1T0251NJ7YV04 --policy-arn `aws iam list-policies | jq -r '.[][] | select(.PolicyName == "EKS-CloudWatchLogs") | .Arn'`
Finally, apply the whole kubernetes directory so that all of the listed configuration files (configmap.yaml, daemonset.yaml, and service-account.yaml) are applied at once.
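For example, from the eks/fluent-bit directory (path assumed from the repository layout described above):
kubectl apply -f kubernetes/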
Set up the Falco Helm repository
Clone the falcosecurity/falco Helm chart repository as shown below and add the Helm repository.
git clone https://github.com/falcosecurity/charts.git; helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
Go to the falco/rules directory and check the default rules configuration files, which ship with the chart and can be readily applied. Please go through the default rule-set YAML files, which have detailed explanations of the rules specified in each of them. We can add our own custom rules as well.
1. application_rules.yaml
2. falco_rules.local.yaml
3. falco_rules.yaml
4. k8s_audit_rules.yaml
Falco behavior can be controlled by configuration parameters, which can be supplied as runtime parameters while installing the chart or by creating a special-purpose file, for example values.yaml (you can give it any name). Check out this page to understand all the configuration parameters that control the runtime behavior of Falco: audit level, log level, file outputs, and so on.
A sample values.yaml is linked below for your reference.
NOTE: The jsonOutput property is false in values.yaml by default. Set it to true for JSON-formatted output via Fluent Bit.
https://github.com/falcosecurity/charts/blob/master/falco/values.yaml
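A minimal sketch of the relevant section, assuming the chart’s values layout at the time of writing (confirm the key names against the linked values.yaml, as they can differ between chart versions):
falco:
  jsonOutput: true                  # emit events as JSON so Fluent Bit can parse them
  jsonIncludeOutputProperty: true   # include the human-readable output string in the JSON
  logLevel: info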
Finally install the Helm chart.
helm install falco -f values.yaml falcosecurity/falco
You should see the following output:
Once this deployment is complete, Falco will scan our Kubernetes cluster pods for suspicious security events and send the log events to FireLens, which transforms the JSON logs as per the configuration settings specified and finally sends them to CloudWatch.
At the end, we will have the following pods and deployments.
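You can list them with a single kubectl call, for example:
kubectl get pods,deployments --all-namespaces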
You should see the following output:
In the AWS Management Console, go to CloudWatch to find the log group streams.
Simulating and Testing
Falco will catch any suspicious activity in the pods. We can test this by going inside one of the test deployment pods (in our case, the NGINX application pods) and running certain commands to trigger the default rules implemented by Falco.
Example 1: Simulating rules in falco_rules.yaml. Go inside any of the NGINX pods and execute the following statements, which will trigger the rules called “Write below etc” and “Read sensitive file untrusted.”
You can use kubectl get pods to list all the pods running on your EKS cluster, then open a shell inside one of the NGINX pods and simulate the suspicious activities below. For example, we can exec into the NGINX pod nginx2-7844999d9c-wdpz8 and simulate some actions that Falco will catch based on the rule sets in place. Falco is enabled to catch suspicious activities in all the pods across the whole cluster.
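A quick way to get a shell in the pod (the pod name is from this demo; substitute your own):
kubectl get pods
kubectl exec -it nginx2-7844999d9c-wdpz8 -- /bin/bash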
Then generate activity like the following:
touch /etc/2
cat /etc/shadow > /dev/null 2>&1
Falco will generate the alerts in CloudWatch for the test simulation, as shown below.
Example 2: Go inside any of the NGINX pods and execute the statements below, which will trigger the rule called “Mkdir binary dirs.”
Generate activity like the following:
cd /bin
mkdir hello
Falco will generate the alerts in CloudWatch for the test simulation, as shown below.
Creating Custom Rules
You can create a YAML file with sample custom rules or append rules to the existing default rule set. In this demo, I’m going to create a new custom rule file called custom_alerts.yaml and add the desired rule conditions. In this example, I’m creating an alert for simple commands like whoami, who, and find. When these commands are executed inside the NGINX container, Falco will alert us.
Sample custom_alerts.yaml file:
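A minimal sketch of what such a file could look like, using standard Falco rule syntax and assuming the chart’s customRules mechanism for loading it (the rule name, output string, and command list are illustrative):
customRules:
  custom_alerts.yaml: |-
    - rule: Detect identity and search commands
      desc: Alert when simple reconnaissance commands run inside a container
      condition: spawned_process and container and proc.name in (whoami, who, find)
      output: "Suspicious command executed in container (user=%user.name command=%proc.cmdline container=%container.name)"
      priority: WARNING
      tags: [container, process]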
Finally, upgrade the Helm chart so that the new configuration file is added to the security alerting.
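For example, assuming custom_alerts.yaml is structured as chart values as sketched above:
helm upgrade falco -f values.yaml -f custom_alerts.yaml falcosecurity/falco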
You should see the following output:
Generate activity like the following:
whoami
find
Falco will generate the following alerts in CloudWatch for the test simulation.
Likewise, you can create as many custom rules as needed based on application or system requirements. Please check the detailed explanation of creating custom rules here.
Creating Custom Amazon CloudWatch Insights
Go to Amazon CloudWatch Insights in the AWS Console and create custom insights and dashboards as needed. In this demo, I’m going to create a custom dashboard named falco and add two insights based on the rules we have simulated so far.
Create two insights for the rules “Mkdir binary dirs” and “Read sensitive file untrusted.” Each insight queries the logs from the last three hours, matches messages containing the rule name (for example, “Mkdir binary dirs”), and is added to the dashboard.
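A CloudWatch Logs Insights query along these lines can back each widget (the rule name in the filter is from this demo):
fields @timestamp, @message
| filter @message like /Mkdir binary dirs/
| sort @timestamp desc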
Finally, the dashboard Falco should look like the following screenshot as a sample.
Creating custom Amazon CloudWatch alarms
Create an Amazon SNS topic and an email subscription with the appropriate permissions to send email alerts.
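For example, from the CLI (the topic name, Region, account ID, and email address are placeholders):
aws sns create-topic --name falco-alerts
aws sns subscribe --topic-arn arn:aws:sns:ap-south-1:111122223333:falco-alerts --protocol email --notification-endpoint you@example.com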
Go to Amazon CloudWatch in the AWS Console and create an alarm as follows.
Select the Amazon CloudWatch log group name for LogGroupName and choose an appropriate value for Statistic. For this demo, I have chosen Sum as the Statistic with an alert threshold of one, which means the Amazon CloudWatch alarm will email you whenever Falco sends at least one alert to Amazon CloudWatch.
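The same alarm can be sketched from the CLI against the log group’s incoming-events metric (the log group name, period, and topic ARN are illustrative assumptions):
aws cloudwatch put-metric-alarm \
  --alarm-name falco-alerts \
  --namespace AWS/Logs \
  --metric-name IncomingLogEvents \
  --dimensions Name=LogGroupName,Value=falco \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:ap-south-1:111122223333:falco-alerts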
Overall the alarm should look like the following screenshot:
The CloudWatch alarm will send email alerts to the email address you specified, as shown below:
Conclusion
In this post, I have demonstrated how you can set up an Amazon EKS cluster with a sample NGINX website and configure runtime container security analysis and alerting with CNCF Falco and Amazon CloudWatch, using custom dashboards and alarms. CNCF Falco can be configured to stream custom logs as well as standard alerts. Please check the Amazon EKS security documentation for the latest updates, and the Amazon EKS Best Practices Guide GitHub page if you would like to suggest new features or check the team’s latest roadmaps.