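Why isn't the unified CloudWatch agent pushing my metrics or log events to CloudWatch?
I configured the unified CloudWatch agent on an Amazon Elastic Compute Cloud (Amazon EC2) instance to publish metrics and logs to Amazon CloudWatch, but I can't see my metrics or logs in the CloudWatch console. Why isn't the unified CloudWatch agent pushing my metrics and logs to CloudWatch?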
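Short description
There are several possible reasons why the unified CloudWatch agent isn't pushing your metrics or logs to CloudWatch. For example, a permissions or connectivity error might prevent the agent from publishing your metrics. When you review the unified CloudWatch agent logs, you might see one of the following errors:
- Agent log error: no connectivity to endpoint
- Agent log error: insufficient permissions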
Resolution
**Note:** If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, make sure that you're using the most recent version of the AWS CLI.
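Review the unified CloudWatch agent logs
Use the agent log file to troubleshoot problems with the unified CloudWatch agent package. You might be running into one of the following common issues:
- The host can't connect to the required AWS service endpoints or VPC endpoints.
- The host doesn't have the permissions required to make the supporting API calls to CloudWatch.
- The log file doesn't exist on the local file system.
You might see one of the following errors in the log:
Agent log error: no connectivity to endpoint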
    2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout
    2021-08-30T04:07:46Z W! 210 retries, going to sleep 1m0s before retrying.
    2021-08-30T04:07:46Z E! cloudwatch: code: RequestError, message: send request failed, original error: Post "https://monitoring.us-east-1.amazonaws.com/": dial tcp 172.31.11.121:443: i/o timeout
    2021-08-30T04:07:46Z W! 211 retries, going to sleep 1m0s before retrying.
Agent log error: insufficient permissions
    2021-08-30T02:15:45Z E! cloudwatch: code: AccessDenied, message: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData, original error:
    2021-08-30T02:15:45Z W! 1 retries, going to sleep 400ms before retrying.
    2021-08-30T02:15:46Z E! WriteToCloudWatch failure, err: AccessDenied: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: cloudwatch:PutMetricData
        status code: 403, request id: f1171fd0-05b6-4f7d-bac2-629c8594c46e
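To see quickly which of the two error classes dominates, you can count matching lines in the agent log. The following is a minimal sketch, not part of the agent: `summarize_agent_errors` is a hypothetical helper name, and the default path assumes the standard Linux log location, `/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log`. Pass a different path if your log lives elsewhere.

```shell
# Assumed default log location on Linux; override with AGENT_LOG or an argument.
AGENT_LOG="${AGENT_LOG:-/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log}"

# Hypothetical helper: count log lines for the two error classes shown above.
summarize_agent_errors() {
  log_file="${1:-$AGENT_LOG}"
  if [ ! -r "$log_file" ]; then
    echo "cannot read log file: $log_file" >&2
    return 1
  fi
  # grep -c prints the number of matching lines; "|| true" keeps a zero
  # count from being treated as a failure.
  conn=$(grep -c 'code: RequestError' "$log_file" || true)
  perm=$(grep -c 'code: AccessDenied' "$log_file" || true)
  echo "connectivity_errors=$conn"
  echo "permission_errors=$perm"
}

# Usage (reading the log may require sudo):
#   summarize_agent_errors
#   summarize_agent_errors /path/to/amazon-cloudwatch-agent.log
```

A nonzero `connectivity_errors` count points to the endpoint checks, and a nonzero `permission_errors` count points to the IAM checks, both described later in this article.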
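Confirm connectivity to the CloudWatch endpoints
When traffic to CloudWatch must not travel over the public internet, you can use VPC endpoints instead. If you use VPC endpoints, check the following:
- If you use private name servers, confirm that DNS resolution returns accurate responses.
- Confirm that the CloudWatch endpoints resolve to private IP addresses.
- Confirm that the security group associated with the VPC endpoint allows inbound traffic from the host.
1. Check connectivity to the metrics endpoint: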
    $ telnet monitoring.us-east-1.amazonaws.com 443
    Trying 52.46.138.115...
    Connected to monitoring.amazonaws.com.
    Escape character is '^]'.
    ^]
    telnet> quit
    Connection closed.
2. Check connectivity to the logs endpoint:
    $ telnet logs.us-east-1.amazonaws.com 443
    Trying 3.236.94.218...
    Connected to logs.us-east-1.amazonaws.com.
    Escape character is '^]'.
    ^]
    telnet> quit
    Connection closed
3. Check that the VPC endpoint resolves to private IP addresses:
    $ dig monitoring.us-east-1.amazonaws.com +short
    172.31.11.121
    172.31.0.13
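The private-address check in step 3 can be automated. The sketch below labels each address that `dig` returns as private or public, where "private" is assumed to mean the RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). Both function names are hypothetical helpers introduced for illustration.

```shell
# Succeed (exit 0) when the IPv4 address falls in an RFC 1918 private range.
is_private_ipv4() {
  case "$1" in
    10.*) return 0 ;;
    192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Resolve a hostname with dig and label every returned IPv4 address.
check_endpoint_resolution() {
  dig "$1" +short | while read -r ip; do
    # Skip empty lines and anything that is not a dotted-decimal address
    # (for example, CNAME targets that dig may print).
    case "$ip" in
      ''|*[!0-9.]*) continue ;;
    esac
    if is_private_ipv4 "$ip"; then
      echo "$ip private"
    else
      echo "$ip public"
    fi
  done
}

# Usage:
#   check_endpoint_resolution monitoring.us-east-1.amazonaws.com
#   check_endpoint_resolution logs.us-east-1.amazonaws.com
```

If you expect traffic to use a VPC endpoint and an address is labeled public, recheck DNS resolution for the endpoint.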
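Review the unified CloudWatch agent configuration
The agent configuration file lists the metrics and logs that are published to CloudWatch. Review the agent configuration file to confirm that it includes the logs and metrics that you want to publish.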
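One of the common issues listed earlier is a configured log file that doesn't exist on the local file system. As a rough sketch, the helpers below pull every `file_path` value out of an agent JSON configuration file and report whether each path exists. The function names are hypothetical, the text matching is deliberately simple (it is not a JSON parser), and wildcard patterns in `file_path` are not expanded, so they are reported as missing.

```shell
# Print every "file_path" value found in the given configuration file.
list_log_paths() {
  grep -o '"file_path"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" |
    sed -e 's/^"file_path"[[:space:]]*:[[:space:]]*"//' -e 's/"$//'
}

# Report whether each configured log file exists on the local file system.
check_log_paths() {
  list_log_paths "$1" | while read -r path; do
    if [ -e "$path" ]; then
      echo "OK      $path"
    else
      echo "MISSING $path"
    fi
  done
}

# Usage: pass the configuration file that you supplied to the agent, e.g.
#   check_log_paths /path/to/config.json
```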
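Confirm that the host has permissions to publish metrics and logs
The AWS managed policies CloudWatchAgentServerPolicy and CloudWatchAgentAdminPolicy can help you deploy the unified CloudWatch agent and check that you have the correct permissions. Use these policies as a reference to make sure that your host has the correct permissions.
The command output in the following examples indicates missing or insufficient permissions.
The output of this agent start command shows that no IAM role is attached to the EC2 instance: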
    $ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s
    ****** processing amazon-cloudwatch-agent ******
    /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
    Region: us-east-1
    credsConfig: map[]
    Error in retrieving parameter store content: NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
    Fail to fetch/remove json config: NoCredentialProviders: no valid providers in chain. Deprecated.
        For verbose messaging see aws.Config.CredentialsChainVerboseErrors
    Fail to fetch the config!
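The output of this agent start command shows that an incorrect IAM role is attached to the EC2 instance (the role isn't authorized to perform ssm:GetParameter):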
    $ /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:CWT-Web-Server -s
    ****** processing amazon-cloudwatch-agent ******
    /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source ssm:CWT-Web-Server --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
    Region: us-east-1
    credsConfig: map[]
    Error in retrieving parameter store content: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server
        status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d
    Fail to fetch/remove json config: AccessDeniedException: User: arn:aws:sts::123456789012:assumed-role/cwagent/i-0744de7c842d2c2ba is not authorized to perform: ssm:GetParameter on resource: arn:aws:ssm:us-east-1:123456789012:parameter/CWT-Web-Server
        status code: 400, request id: b85b0a7a-0fb1-47b4-924f-be8cf43a3b4d
    Fail to fetch the config!
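In some cases, IAM user credentials might be configured on the command line. The `aws sts get-caller-identity` command returns the IAM user or role whose credentials are in use on the instance: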
    $ aws sts get-caller-identity
    {
        "UserId": "AROA123456789012ABCDE:i-0744de7c842d2c2ba",
        "Account": "123456789012",
        "Arn": "arn:aws:sts::123456789012:assumed-role/CloudWatchAgentServerRole/i-0744de7c842d2c2ba"
    }
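After you know which identity is in use, a natural next step is to list the policies attached to that role and compare them with CloudWatchAgentServerPolicy. The sketch below extracts the role name from an assumed-role ARN like the one in the output above. `role_name_from_arn` is a hypothetical helper. The commented usage relies on the AWS CLI commands `aws sts get-caller-identity` and `aws iam list-attached-role-policies`, and it assumes that the caller is allowed to make the IAM call, which an instance role often isn't.

```shell
# Extract the role name from an assumed-role ARN such as
# arn:aws:sts::123456789012:assumed-role/CloudWatchAgentServerRole/i-0744de7c842d2c2ba
role_name_from_arn() {
  case "$1" in
    arn:aws*:sts::*:assumed-role/*/*)
      rest="${1#*:assumed-role/}"   # drop everything through "assumed-role/"
      echo "${rest%%/*}"            # keep the text before the next "/"
      ;;
    *)
      echo "not an assumed-role ARN: $1" >&2
      return 1
      ;;
  esac
}

# Usage (run where the AWS CLI is configured with IAM read permissions):
#   arn=$(aws sts get-caller-identity --query Arn --output text)
#   role=$(role_name_from_arn "$arn") &&
#     aws iam list-attached-role-policies --role-name "$role"
```

If the ARN belongs to an IAM user instead of a role, the helper returns an error. That result suggests that user credentials are configured on the host and are being used in place of the instance role.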
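Confirm that the agent was started correctly
The agent is designed to be started from the command line with the configuration file passed as a parameter. Use one of the following valid start commands.
Linux commands: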
- `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:configuration-file-path`
- `$ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
Windows commands:
- `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c file:"C:\Program Files\Amazon\AmazonCloudWatchAgent\config.json"`
- `& "C:\Program Files\Amazon\AmazonCloudWatchAgent\amazon-cloudwatch-agent-ctl.ps1" -a fetch-config -m ec2 -s -c ssm:configuration-parameter-store-name`
Important: Don't start the agent from the Windows Control Panel.
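Confirm that the agent is running
To publish metrics and logs, the agent must be in the running state. Run the following command to confirm that the agent is active: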
    $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
    {
      "status": "running",
      "starttime": "2021-08-30T02:13:44+00:00",
      "configstatus": "configured",
      "cwoc_status": "stopped",
      "cwoc_starttime": "",
      "cwoc_configstatus": "not configured",
      "version": "1.247349.0b251399"
    }
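If you want to use this check in a script or health probe, you can test the `status` field of the JSON output directly. The following is a minimal sketch that assumes the output format shown above. `agent_is_running` is a hypothetical helper that reads the status JSON on standard input.

```shell
# Read the JSON printed by "amazon-cloudwatch-agent-ctl -a status" on stdin and
# succeed only when the agent's own "status" field is "running".
# The leading quote in the pattern keeps "cwoc_status" from matching.
agent_is_running() {
  grep -q '"status"[[:space:]]*:[[:space:]]*"running"'
}

# Usage:
#   if sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status | agent_is_running; then
#     echo "agent is running"
#   else
#     echo "agent is NOT running"
#   fi
```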
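Restart the agent after you update the agent configuration
The agent doesn't automatically pick up changes to the configuration file. If you updated the agent configuration to include new or different metrics and logs, then restart the agent with the following commands: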
    $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
    ****** processing cwagent-otel-collector ******
    cwagent-otel-collector has already been stopped
    ****** processing amazon-cloudwatch-agent ******
    Redirecting to /bin/systemctl stop amazon-cloudwatch-agent.service

    $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:config.json
    ****** processing amazon-cloudwatch-agent ******
    /opt/aws/amazon-cloudwatch-agent/bin/config-downloader --output-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --download-source file:config.json --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
    Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp
    Start configuration validation...
    /opt/aws/amazon-cloudwatch-agent/bin/config-translator --input /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json --input-dir /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d --output /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml --mode ec2 --config /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml --multi-config default
    2021/08/31 02:45:37 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp ...
    Valid Json input schema.
    I! Detecting run_as_user...
    Configuration validation first phase succeeded
    /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
    Configuration validation second phase succeeded
    Configuration validation succeeded
    amazon-cloudwatch-agent has already been stopped
    Redirecting to /bin/systemctl restart amazon-cloudwatch-agent.service

    $ sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
    {
      "status": "running",
      "starttime": "2021-08-31T02:45:37+0000",
      "configstatus": "configured",
      "cwoc_status": "stopped",
      "cwoc_starttime": "",
      "cwoc_configstatus": "not configured",
      "version": "1.247349.0b251399"
    }
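The stop, fetch-config, and status sequence can be wrapped in a small function so that it runs the same way after every configuration change. This is only a sketch under a few assumptions: `reload_agent_config` is a hypothetical name, the `CTL` variable defaults to the Linux control-script path used throughout this article, and the function must run as root (for example, from a script invoked with `sudo`).

```shell
# Path to the agent control script; override CTL if yours is elsewhere.
CTL="${CTL:-/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl}"

# Stop the agent, load the given local configuration file, start the agent,
# and print its status. Returns non-zero as soon as one step fails.
reload_agent_config() {
  config_file="$1"
  if [ ! -f "$config_file" ]; then
    echo "configuration file not found: $config_file" >&2
    return 1
  fi
  "$CTL" -m ec2 -a stop || return 1
  "$CTL" -a fetch-config -m ec2 -s -c "file:$config_file" || return 1
  "$CTL" -m ec2 -a status
}

# Usage (as root):
#   reload_agent_config /path/to/config.json
```

Checking that the file exists before stopping the agent avoids leaving the agent stopped because of a mistyped path.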
Related information
How do I install and configure the unified CloudWatch agent to push metrics and logs from my EC2 instance to CloudWatch?