How do I add Amazon EC2 metadata when I use the Kinesis agent to push logs to Kinesis?

Last updated: May 12, 2020

I'm trying to use the Amazon Kinesis agent to send logs from Amazon Elastic Compute Cloud (Amazon EC2) to Amazon Kinesis. How do I append Amazon EC2 metadata to each log line?

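Short description

To append EC2 metadata to each log line, do the following:

1.    Install and set up the Kinesis agent on a Linux or Windows platform.

2.    Update your configuration settings so that EC2 metadata is included.

3.    Verify that you have the required AWS Identity and Access Management (IAM) permissions.

Note: You need IAM permissions to retrieve EC2 metadata and to publish data to Amazon Kinesis Data Firehose.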

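Resolution

On a Linux platform

To send logs to Kinesis Data Firehose using the Kinesis agent, perform the following steps:

1.    Download and install the agent.

2.    Create an IAM role with the following permissions: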

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "firehose:PutRecord",
        "firehose:PutRecordBatch"
      ],
      "Resource": [
        "<KFH ARN>"
      ]
    },
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceAttribute",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstanceStatus"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": "cloudwatch:PutMetricData",
      "Resource": "*"
    }
  ]
}

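3.    Attach the newly created IAM role to the EC2 instance where you installed the Kinesis agent. For more information about assigning an existing IAM role, see How do I assign an existing IAM role to an EC2 instance?

4.    Edit the /etc/aws-kinesis/agent.json file: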

{
  "cloudwatch.emitMetrics": true,
  "kinesis.endpoint": "",
  "firehose.endpoint": "firehose.us-east-1.amazonaws.com",
  
  "flows": [
    {
      "filePattern": "/tmp/app.log*",
      "deliveryStream": "yourdeliverystream",
      "partitionKeyOption": "RANDOM",
      "dataProcessingOptions": [
        {
          "optionName": "LOGTOJSON",
          "logFormat": "COMMONAPACHELOG"
        },
        {
          "optionName": "ADDEC2METADATA",
          "logFormat": "COMMONAPACHELOG"
        }
      ]
    }
  ]
}

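In this example, the /etc/aws-kinesis/agent.json file handles the COMMONAPACHELOG log file format. If your log files use a different format, you must first update the dataProcessingOptions setting to match that format. For more information about the agent and its processing options, see Use the agent to preprocess data.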

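Important: Add ADDEC2METADATA as an optionName entry to make sure that EC2 metadata is appended to each log line. By default, the Kinesis agent appends the following EC2 metadata parameters: privateIp, availabilityZone, instanceId, instanceType, accountId, amiId, region, and metadataTimestamp.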
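As a rough illustration of adapting the flow to another log format, the following Python sketch generates an agent.json structure for a given logFormat. The helper name build_flow is hypothetical and not part of the Kinesis agent. Following the sample above, it assumes that the same logFormat value is passed to both LOGTOJSON and ADDEC2METADATA. SYSLOG is used here only as an example of a different format.

```python
import json


def build_flow(file_pattern, delivery_stream, log_format):
    """Hypothetical helper: build one agent.json flow.

    The flow first parses each log line to JSON (LOGTOJSON) and then
    appends EC2 metadata (ADDEC2METADATA), as in the sample config.
    """
    return {
        "filePattern": file_pattern,
        "deliveryStream": delivery_stream,
        "dataProcessingOptions": [
            {"optionName": "LOGTOJSON", "logFormat": log_format},
            # Assumption: same logFormat as above, as in the sample config.
            {"optionName": "ADDEC2METADATA", "logFormat": log_format},
        ],
    }


config = {
    "cloudwatch.emitMetrics": True,
    "firehose.endpoint": "firehose.us-east-1.amazonaws.com",
    "flows": [build_flow("/tmp/app.log*", "yourdeliverystream", "SYSLOG")],
}

# Print the JSON that would go into /etc/aws-kinesis/agent.json.
print(json.dumps(config, indent=2))
```

Review the printed JSON before you copy it into /etc/aws-kinesis/agent.json, and replace the file pattern, delivery stream name, and endpoint with your own values.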

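5.    Configure and start the agent. The agent now runs as a system service in the background. It continuously monitors the specified files and sends data to the specified delivery stream. Agent activity is logged in the /var/log/aws-kinesis-agent/aws-kinesis-agent.log file, similar to this example output: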

{
    "host": "157.92.12.106",
    "ident": null,
    "authuser": null,
    "datetime": "31/Aug/1995:20:50:31 -0400",
    "request": "GET /history/astp/astp-spacecraft.txt HTTP/1.0",
    "response": "200",
    "bytes": "440",
    "privateIp": "X.X.X.X",
    "availabilityZone": "us-east-1c",
    "instanceId": "i-01bxxxxxxxxxx43a0",
    "instanceType": "t2.xlarge",
    "accountId": "585xxxxxx740",
    "amiId": "ami-0fc61db8544a617ed",
    "region": "us-east-1",
    "metadataTimestamp": "2020-04-20T02:28:40+0000"
}
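One way to confirm that the metadata is present is to check a record for the eight default parameters. The Python sketch below is a minimal example, not an AWS tool: the function name missing_metadata and the shortened sample record are illustrative assumptions.

```python
import json

# The eight parameters that the Kinesis agent appends by default.
EXPECTED_KEYS = {
    "privateIp", "availabilityZone", "instanceId", "instanceType",
    "accountId", "amiId", "region", "metadataTimestamp",
}


def missing_metadata(record_json):
    """Return the sorted names of expected metadata keys absent from a record."""
    record = json.loads(record_json)
    return sorted(EXPECTED_KEYS - set(record))


# Shortened, illustrative record modeled on the example output.
sample = json.dumps({
    "host": "157.92.12.106",
    "response": "200",
    "privateIp": "X.X.X.X",
    "availabilityZone": "us-east-1c",
    "instanceId": "i-01bxxxxxxxxxx43a0",
    "instanceType": "t2.xlarge",
    "accountId": "585xxxxxx740",
    "amiId": "ami-0fc61db8544a617ed",
    "region": "us-east-1",
    "metadataTimestamp": "2020-04-20T02:28:40+0000",
})

# An empty list means every expected metadata key is present.
print(missing_metadata(sample))
```

If the list is not empty, a likely cause is that the ADDEC2METADATA option is missing from the flow or that the instance role lacks the EC2 permissions listed in step 2.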

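On a Windows platform

To send data to Kinesis Data Firehose using the Amazon Kinesis Tap agent, perform the following steps:

1.    Install the Kinesis agent for Windows.

2.    Create an IAM role with the following permissions: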

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "firehose:DeleteDeliveryStream",
        "firehose:PutRecord",
        "firehose:PutRecordBatch",
        "firehose:UpdateDestination"
      ],
      "Resource": [
        "<KFH ARN>"
      ]
    },
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeInstanceAttribute",
        "ec2:DescribeInstanceTypes",
        "ec2:DescribeInstanceStatus"
      ],
      "Resource": "*"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": "cloudwatch:PutMetricData",
      "Resource": "*"
    }
  ]
}

3.    Attach the newly created IAM role to the EC2 instance where you installed the Kinesis Tap agent.

4.    Open the C:\Program Files\Amazon\AWSKinesisTap\appsettings.json file and update it, as in this example:

{
    "Sources": [
        {
            "Id": "W3SVCLog1",
            "SourceType": "W3SVCLogSource",
            "Directory": "C:\\inetpub\\logs\\LogFiles\\W3SVC1",
            "FileNameFilter": "*.log",
            "TimeZoneKind": "UTC"
        }
    ],
    "Sinks": [
        {
            "Id": "W3SVCLogSink",
            "SinkType": "KinesisFirehose",
            "Region": "eu-central-1",
            "StreamName": "W3SVCLogStream",
            "Format": "json",
            "ObjectDecoration": "instance_id={instance_id};hostname={hostname};ec2:local-hostname={ec2:local-hostname};computername={computername};env:computername={env:computername};timestamp:yyyyMMdd={timestamp:yyyyMMdd}"
        }
    ],
    "Pipes": [
        {
            "Id": "W3SVCLog1ToKinesisStream",
            "SourceRef": "W3SVCLog1",
            "SinkRef": "W3SVCLogSink"
        }
    ]
}

Important: Add "ObjectDecoration": "instance_id={instance_id};hostname={hostname};ec2:local-hostname={ec2:local-hostname};computername={computername};env:computername={env:computername};timestamp:yyyyMMdd={timestamp:yyyyMMdd}" to your sink to make sure that EC2 metadata is appended to each log line. The Kinesis Tap agent appends the following parameters as EC2 metadata: instance_id, hostname, ec2:local-hostname, computername, env:computername, and timestamp:yyyyMMdd. If you don't want all of these parameters to appear, specify only the ones that you want to append.

For more information about configuration options, see Configuring Amazon Kinesis Agent for Microsoft Windows.
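To append only some of the parameters, you can assemble the ObjectDecoration value from the names that you want. The following Python sketch shows one way to do this. The helper name build_object_decoration is hypothetical, and like the sample configuration it uses each variable name as the attribute key.

```python
def build_object_decoration(variables):
    """Hypothetical helper: join name={name} pairs with semicolons."""
    return ";".join(f"{name}={{{name}}}" for name in variables)


# Keep only the instance ID and host name from the sample's six parameters.
selected = ["instance_id", "hostname"]
print(build_object_decoration(selected))
# prints: instance_id={instance_id};hostname={hostname}
```

Paste the resulting string into the "ObjectDecoration" field of the sink in appsettings.json.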

5.    Configure and start Kinesis Agent for Windows to launch the Kinesis Tap agent. Each log line should then resemble the following output:

{
    "EventId": 7036,
    "Description": "The WinHTTP Web Proxy Auto-Discovery Service service entered the stopped state.",
    "LevelDisplayName": "Informational",
    "LogName": "System",
    "MachineName": "EC2AMAZ-GLL60A7",
    "ProviderName": "Service Control Manager",
    "TimeCreated": "2020-04-20T06:02:51.5847181Z",
    "Index": 34427,
    "UserName": null,
    "Keywords": "Classic",
    "instance_id": "i-0183xxxxxxxxxx4b7",
    "hostname": "ip-x-x-x-x.ec2.internal",
    "ec2:local-hostname": "ip-x-x-x-x.ec2.internal",
    "computername": "EC2AMAZ-GLL60A7",
    "env:computername": "EC2AMAZ-GLL60A7",
    "timestamp:yyyyMMdd": "20200420"
}