如何 Amazon CloudWatch Logs 中擷取 Amazon EKS 控制平面日誌?

上次更新日期:2021 年 11 月 12 日

我正在排查 Amazon Elastic Kubernetes Service (Amazon EKS) 問題,並且需要從 EKS 控制平面上執行的元件收集日誌。

簡短描述

若要檢視 Amazon CloudWatch Logs 中的日誌,您必須啟用 Amazon EKS 控制平面日誌記錄。您可以在 /aws/eks/cluster-name/cluster 日誌群組中找到 EKS 控制平面日誌。如需詳細資訊,請參閱檢視叢集控制平面日誌

注意:叢集名稱取代為您的叢集名稱。

您可以使用 CloudWatch Logs Insights 來搜尋 EKS 控制平面日誌資料。如需詳細資訊,請參閱 使用 CloudWatch Insights 分析日誌資料

重要事項:只有在叢集中啟用控制平面日誌記錄之後,才能在 CloudWatch Logs 中檢視日誌事件。在 CloudWatch Logs Insights 中選取要執行查詢的時間範圍之前,請確認您已啟用控制平面日誌記錄。

解決方案

搜尋 CloudWatch Insights

1.    開啟 CloudWatch Logs Insights 主控台。

2.    在日誌群組選單上,選取要查詢的叢集日誌群組

3.    選擇執行查詢以檢視結果。

注意:若要將結果匯出為 .csv 檔案,或將結果複製到剪貼簿,請選擇匯出結果。您可以變更範例查詢以取得特定使用案例的資料。請參閱下列常見 EKS 使用案例的查詢範例。

常見 EKS 使用案例的查詢範例

若要尋找叢集建立者,請搜尋對應至 kubernetes-admin 使用者的 IAM 實體。

查詢:

fields @logStream, @timestamp, @message
|  @timestamp desc
|  @logStream like /authenticator/
|  @message like "username=kubernetes-admin"
|  limit 50

範例輸出:

@logStream, @timestamp @message
authenticator-71976 ca11bea5d3083393f7d32dab75b,2021-08-11-10:09:49.020,"time=""2021-08-11T10:09:43Z"" level=info msg=""access granted"" arn=""arn:aws:iam::12345678910:user/awscli"" client=""127.0.0.1:51326"" groups=""[system:masters]"" method=POST path=/authenticate sts=sts.eu-west-1.amazonaws.com uid=""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"" username=kubernetes-admin"

在前面的輸出中,IAM 使用者 arn:aws:iam::12345678910:user/awscli 對應至使用者 kubernetes-admin

若要尋找特定使用者所執行的要求,請搜尋 kubernetes-admin 使用者執行的作業。

查詢範例:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter strcontains(user.username,"kubernetes-admin")
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-71976ca11bea5d3083393f7d32dab75b,2021-08-11 09:29:13.095,"{...""requestURI"":""/api/v1/namespaces/kube-system/endpoints?limit=500";","string""verb"":""list"",""user"":{""username"":""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sessionName"":[""""]}},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237"",""objectRef"":{""resource"":""endpoints"",""namespace"":""kube-system"",""apiVersion"":""v1""}...}"

若要尋找特定 userAgent 所發出的 API 呼叫,您可以使用下列範例查詢:

fields @logStream, @timestamp, userAgent, verb, requestURI, @message
| @logStream like /kube-apiserver-audit/
| userAgent like /kubectl\/v1.22.0/
| sort @timestamp desc
| filter verb like /(get)/

縮短的範例輸出:

@logStream,@timestamp,userAgent,verb,requestURI,@message
kube-apiserver-audit-71976ca11bea5d3083393f7d32dab75b,2021-08-11 14:06:47.068,kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237,get,/apis/metrics.k8s.io/v1beta1?timeout=32s,"{""kind"":""Event"",""apiVersion"":""audit.k8s.io/v1"",""level"":""Metadata"",""auditID"":""863d9353-61a2-4255-a243-afaeb9183524"",""stage"":""ResponseComplete"",""requestURI"":""/apis/metrics.k8s.io/v1beta1?timeout=32s"",""verb"":""get"",""user"":{""username"":""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:AIDAUQGC5HFOHXON7M22F"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237""...}"

要查找對 aws-auth ConfigMap 所執行的變更,您可以使用以下範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter requestURI like /\/api\/v1\/namespaces\/kube-system\/configmaps/
| filter objectRef.name = "aws-auth"
| filter verb like /(create|delete|patch)/
| sort @timestamp desc
| limit 50

縮短的範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-f01c77ed8078a670a2eb63af6f127163,2021-10-27 05:43:01.850,{""kind"":""Event"",""apiVersion"":""audit.k8s.io/v1"",""level"":""RequestResponse"",""auditID"":""8f9a5a16-f115-4bb8-912f-ee2b1d737ff1"",""stage"":""ResponseComplete"",""requestURI"":""/api/v1/namespaces/kube-system/configmaps/aws-auth?timeout=19s"",""verb"":""patch"",""responseStatus"": {""metadata"": {},""code"": 200 },""requestObject"": {""data"": { contents of aws-auth ConfigMap } },""requestReceivedTimestamp"":""2021-10-27T05:43:01.033516Z"",""stageTimestamp"":""2021-10-27T05:43:01.042364Z"" }

若要尋找拒絕的要求,您可以使用下列範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^authenticator/
| filter @message like "denied"
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,@message
authenticator-8c0c570ea5676c62c44d98da6189a02b,2021-08-08 20:04:46.282,"time=""2021-08-08T20:04:44Z"" level=warning msg=""access denied"" client=""127.0.0.1:52856"" error=""sts getCallerIdentity failed: error from AWS (expected 200, got 403)"" method=POST path=/authenticate"

若要尋找對 Pod 排程的節點,請查詢 kube-scheduler 日誌。

查詢範例:

fields @logStream, @timestamp, @message
| sort @timestamp desc
| filter @logStream like /kube-scheduler/
| filter @message like "aws-6799fc88d8-jqc2r"
| limit 50

範例輸出:

@logStream,@timestamp,@message
kube-scheduler-bb3ea89d63fd2b9735ba06b144377db6,2021-08-15 12:19:43.000,"I0915 12:19:43.933124       1 scheduler.go:604] ""Successfully bound pod to node"" pod=""kube-system/aws-6799fc88d8-jqc2r"" node=""ip-192-168-66-187.eu-west-1.compute.internal"" evaluatedNodes=3 feasibleNodes=2"

在前面的範例輸出中,已在節點 ip-192-168-66-187.eu-west-1.compute.internal 上對 Pod aws-6799fc88d8-jqc2r 排程。

若要尋找 Kubernetes API 伺服器要求的 HTTP 5xx 伺服器錯誤,您可以使用下列範例查詢:

fields @logStream, @timestamp, responseStatus.code, @message
| filter @logStream like /^kube-apiserver-audit/
| filter responseStatus.code = 500
| limit 50

縮短的範例輸出:

@logStream,@timestamp,responseStatus.code,@message
kube-apiserver-audit-4d5145b53c40d10c276ad08fa36d1f11,2021-08-04 07:22:06.518,503,"...""requestURI"":""/apis/metrics.k8s.io/v1beta1?timeout=32s"",""verb"":""get"",""user"":{""username"":""system:serviceaccount:kube-system:resourcequota-controller"",""uid"":""36d9c3dd-f1fd-4cae-9266-900d64d6a754"",""groups"":[""system:serviceaccounts"",""system:serviceaccounts:kube-system"",""system:authenticated""]},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kube-controller-manager/v1.21.2 (linux/amd64) kubernetes/d2965f0/system:serviceaccount:kube-system:resourcequota-controller"",""responseStatus"":{""metadata"":{},""code"":503},..."}}"

若要排查 CronJob 啟用問題,請搜尋 cronjob-controller 發出的 API 呼叫。

查詢範例:

fields @logStream, @timestamp, @message
| filter @logStream like /kube-apiserver-audit/
| filter user.username like "system:serviceaccount:kube-system:cronjob-controller"
| display @logStream, @timestamp, @message, objectRef.namespace, objectRef.name
| sort @timestamp desc
| limit 50

縮短的範例輸出:

{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "objectRef": { "resource": "cronjobs", "namespace": "default", "name": "hello", "apiGroup": "batch", "apiVersion": "v1" }, "responseObject": { "kind": "CronJob", "apiVersion": "batch/v1", "spec": { "schedule": "*/1 * * * *" }, "status": { "lastScheduleTime": "2021-08-09T07:19:00Z" } } }

在上述範例輸出中,預設命名空間中的 hello 任務會每分鐘執行一次,最後一次排定在 2021-08-09T07:19:00Z

若要尋找 replicaset-controller 發出的 API 呼叫,您可以使用下列範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /kube-apiserver-audit/
| filter user.username like "system:serviceaccount:kube-system:replicaset-controller"
| display @logStream, @timestamp, requestURI, verb, user.username
| sort @timestamp desc
| limit 50

範例輸出:

@logStream,@timestamp,requestURI,verb,user.username
kube-apiserver-audit-8c0c570ea5676c62c44d98da6189a02b,2021-08-10 17:13:53.281,/api/v1/namespaces/kube-system/pods,create,system:serviceaccount:kube-system:replicaset-controller
kube-apiserver-audit-4d5145b53c40d10c276ad08fa36d1f11,2021-08-04 0718:44.561,/apis/apps/v1/namespaces/kube-system/replicasets/coredns-6496b6c8b9/status,update,system:serviceaccount:kube-system:replicaset-controller

若要尋找對 Kubernetes 資源所執行的作業,您可以使用下列範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| filter verb == "delete" and requestURI like "/api/v1/namespaces/default/pods/my-app"
| sort @timestamp desc
| limit 10

上述範例會在 Pod my-app預設命名空間上查詢 delete API 呼叫的篩選條件。

縮短的範例輸出:

@logStream,@timestamp,@message
kube-apiserver-audit-e7b3cb08c0296daf439493a6fc9aff8c,2021-08-11 14:09:47.813,"...""requestURI"":""/api/v1/namespaces/default/pods/my-app"",""verb"":""delete"",""user"":{""username""""kubernetes-admin"",""uid"":""heptio-authenticator-aws:12345678910:ABCDEFGHIJKLMNOP"",""groups"":[""system:masters"",""system:authenticated""],""extra"":{""accessKeyId"":[""ABCDEFGHIJKLMNOP""],""arn"":[""arn:aws:iam::12345678910:user/awscli""],""canonicalArn"":[""arn:aws:iam::12345678910:user/awscli""],""sessionName"":[""""]}},""sourceIPs"":[""12.34.56.78""],""userAgent"":""kubectl/v1.22.0 (darwin/amd64) kubernetes/c2b5237"",""objectRef"":{""resource"":""pods"",""namespace"":""default"",""name"":""my-app"",""apiVersion"":""v1""},""responseStatus"":{""metadata"":{},""code"":200},""requestObject"":{""kind"":""DeleteOptions"",""apiVersion"":""v1"",""propagationPolicy"":""Background""},
..."

若要擷取對 Kubernetes API 伺服器呼叫的 HTTP 回應碼計數,您可以使用下列範例查詢:

fields @logStream, @timestamp, @message
| filter @logStream like /^kube-apiserver-audit/
| stats count(*) as count by responseStatus.code
| sort count desc

範例輸出:

responseStatus.code,count
200,35066
201,525
403,125
404,116
101,2

若要尋找 kube-system 命名空間中對 DaemonSets/Addons 所執行的變更,您可以使用下列範例查詢:

filter @logStream like /^kube-apiserver-audit/
| fields @logStream, @timestamp, @message
| filter verb like /(create|update|delete)/ and strcontains(requestURI,"/apis/apps/v1/namespaces/kube-system/daemonsets")
| sort @timestamp desc
| limit 50

範例輸出:

{ "kind": "Event", "apiVersion": "audit.k8s.io/v1", "level": "RequestResponse", "auditID": "93e24148-0aa6-4166-8086-a689b0031612", "stage": "ResponseComplete", "requestURI": "/apis/apps/v1/namespaces/kube-system/daemonsets/aws-node?fieldManager=kubectl-set", "verb": "patch", "user": { "username": "kubernetes-admin", "groups": [ "system:masters", "system:authenticated" ] }, "userAgent": "kubectl/v1.22.2 (darwin/amd64) kubernetes/8b5a191", "objectRef": { "resource": "daemonsets", "namespace": "kube-system", "name": "aws-node", "apiGroup": "apps", "apiVersion": "v1" }, "requestObject": { "REDACTED": "REDACTED" }, "requestReceivedTimestamp": "2021-08-09T08:07:21.868376Z", "stageTimestamp": "2021-08-09T08:07:21.883489Z", "annotations": { "authorization.k8s.io/decision": "allow", "authorization.k8s.io/reason": "" } }

在上述範例輸出中,kubernetes-admin 使用者使用 kubectl v1.22.2 來修補 aws-node DaemonSet。


此文章是否有幫助?


您是否需要帳單或技術支援?