Why do I experience a data delivery failure with Kinesis Data Firehose that has an OpenSearch Service domain as a destination?

I want to send data from Amazon Kinesis Data Firehose to my Amazon OpenSearch Service domain, but I experience data delivery failure.

Short description

The following reasons can cause a failed delivery between Kinesis Data Firehose and Amazon OpenSearch Service:

  • Delivery destination that's not valid
  • Lack of proper permissions
  • AWS Lambda function invocation issues
  • OpenSearch Service domain health issues
  • No incoming data

Resolution

Turn on logging for Kinesis Data Firehose, and deliver your error logs to Amazon CloudWatch to help narrow down the issue. Then, check the /aws/kinesisfirehose/delivery-stream-name log group in Amazon CloudWatch Logs.

To deliver logs to your CloudWatch log group, the Kinesis Data Firehose role must have the following permissions:

{
     "Effect": "Allow",
     "Action": [
          "logs:PutLogEvents"
     ],
     "Resource": [
          "arn:aws:logs:region:account-id:log-group:log-group-name:log-stream:log-stream-name"
     ]
}
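
As a minimal sketch, the log group name and the statement above can be generated from the delivery stream name. The helper names and the example Region, account ID, and log stream name are hypothetical placeholders chosen for illustration.

```python
import json


def firehose_log_group(delivery_stream_name: str) -> str:
    """Return the log group name pattern described above."""
    return f"/aws/kinesisfirehose/{delivery_stream_name}"


def put_log_events_statement(region: str, account_id: str,
                             log_group: str, log_stream: str) -> dict:
    """Build one IAM statement that allows logs:PutLogEvents on one log stream."""
    resource = (
        f"arn:aws:logs:{region}:{account_id}:"
        f"log-group:{log_group}:log-stream:{log_stream}"
    )
    return {
        "Effect": "Allow",
        "Action": ["logs:PutLogEvents"],
        "Resource": [resource],
    }


if __name__ == "__main__":
    # Every value below is a hypothetical placeholder.
    group = firehose_log_group("my-delivery-stream")
    statement = put_log_events_statement(
        "us-east-1", "111122223333", group, "my-log-stream"
    )
    print(json.dumps(statement, indent=2))
```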

Verify that you grant Kinesis Data Firehose access to a public OpenSearch Service destination. If you use the data transformation feature, then you must also grant access to Lambda. For more information, see Ingest streaming data into Amazon Elasticsearch Service within the privacy of your VPC with Amazon Kinesis Data Firehose.

Delivery destination that's not valid

Confirm that you specified a valid Kinesis Data Firehose delivery destination and that you use the correct ARN. View the DeliveryToElasticsearch.Success metric in CloudWatch to check if your delivery was successful. A metric value of zero is confirmation that the deliveries were unsuccessful. For more information about the DeliveryToElasticsearch.Success metric, see Delivery to OpenSearch Service in Data delivery CloudWatch metrics.
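
The metric check above can also be scripted. The following is a sketch, not an official procedure: it assumes the metric is published in the AWS/Firehose namespace with a DeliveryStreamName dimension, and the function names, the one-hour window, and the 5-minute period are illustrative choices. The boto3 import is deferred so that the pure helpers run without the AWS SDK installed.

```python
from datetime import datetime, timedelta, timezone


def success_metric_query(delivery_stream_name: str, hours: int = 1,
                         metric_name: str = "DeliveryToElasticsearch.Success") -> dict:
    """Build keyword arguments for the CloudWatch GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Firehose",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": delivery_stream_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # seconds per datapoint (assumed granularity)
        "Statistics": ["Average"],
    }


def failing_periods(datapoints: list) -> list:
    """Return the datapoints whose average success value is zero."""
    return [point for point in datapoints if point.get("Average", 0) == 0]


def fetch_datapoints(delivery_stream_name: str) -> list:
    """Query CloudWatch; requires boto3 and AWS credentials."""
    import boto3  # third-party, imported lazily

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **success_metric_query(delivery_stream_name)
    )
    return response["Datapoints"]
```

An empty list from fetch_datapoints means that no datapoints were published in the window, which is different from a datapoint with a value of zero.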

Lack of proper permissions

Based on the configuration of Kinesis Data Firehose, several permissions are required.

To deliver records to an Amazon Simple Storage Service (Amazon S3) bucket, the following permissions are required:

{      
     "Effect": "Allow",
     "Action": [
          "s3:AbortMultipartUpload",
          "s3:GetBucketLocation",
          "s3:GetObject",
          "s3:ListBucket",
          "s3:ListBucketMultipartUploads",
          "s3:PutObject"
     ],
     "Resource": [
          "arn:aws:s3:::bucket-name",
          "arn:aws:s3:::bucket-name/*"
     ]
}

Note: To use this policy, the Amazon S3 bucket resource must be present.

If your Kinesis Data Firehose delivery stream is encrypted at rest, then the following permissions are required:

{
     "Effect": "Allow",
     "Action": [
          "kms:Decrypt",
          "kms:GenerateDataKey"
     ],
     "Resource": [
          "arn:aws:kms:region:account-id:key/key-id"
     ],
     "Condition": {
          "StringEquals": {
               "kms:ViaService": "s3.region.amazonaws.com"
          },
          "StringLike": {
               "kms:EncryptionContext:aws:s3:arn": "arn:aws:s3:::bucket-name/prefix*"
          }
     }
}

To allow permissions for OpenSearch Service access, update your permissions:

{
     "Effect": "Allow",
     "Action": [
          "es:DescribeElasticsearchDomain",
          "es:DescribeElasticsearchDomains",
          "es:DescribeElasticsearchDomainConfig",
          "es:ESHttpPost",
          "es:ESHttpPut"
     ],
     "Resource": [
          "arn:aws:es:region:account-id:domain/domain-name",
          "arn:aws:es:region:account-id:domain/domain-name/*"
     ]
},
{
     "Effect": "Allow",
     "Action": [
          "es:ESHttpGet"
     ],
     "Resource": [
          "arn:aws:es:region:account-id:domain/domain-name/_all/_settings",
          "arn:aws:es:region:account-id:domain/domain-name/_cluster/stats",
          "arn:aws:es:region:account-id:domain/domain-name/index-name*/_mapping/type-name",
          "arn:aws:es:region:account-id:domain/domain-name/_nodes",
          "arn:aws:es:region:account-id:domain/domain-name/_nodes/stats",
          "arn:aws:es:region:account-id:domain/domain-name/_nodes/*/stats",
          "arn:aws:es:region:account-id:domain/domain-name/_stats",
          "arn:aws:es:region:account-id:domain/domain-name/index-name*/_stats"
     ]
}

If you use Kinesis Data Streams as a source, then update your permissions:

{
     "Effect": "Allow",
     "Action": [
          "kinesis:DescribeStream",
          "kinesis:GetShardIterator",
          "kinesis:GetRecords",
          "kinesis:ListShards"
     ],
     "Resource": "arn:aws:kinesis:region:account-id:stream/stream-name"
}

To configure Kinesis Data Firehose for data transformation, update your policy:

{
     "Effect": "Allow",
     "Action": [
          "lambda:InvokeFunction",
          "lambda:GetFunctionConfiguration"
     ],
     "Resource": [
          "arn:aws:lambda:region:account-id:function:function-name:function-version"
     ]
}
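
The statements in this section are fragments. An IAM policy document wraps them in a top-level object with "Version" and "Statement" keys. The following sketch shows one way to assemble and sanity-check such a document. The build_policy helper is hypothetical, and the ARNs keep the placeholders used above.

```python
import json


def build_policy(statements: list) -> str:
    """Wrap statement fragments in a policy document and return it as JSON."""
    required = {"Effect", "Action", "Resource"}
    for index, statement in enumerate(statements):
        missing = required - statement.keys()
        if missing:
            raise ValueError(f"statement {index} is missing {sorted(missing)}")
    return json.dumps({"Version": "2012-10-17", "Statement": statements}, indent=2)


if __name__ == "__main__":
    # Placeholder ARNs copied from the fragments above; replace before use.
    kinesis_source = {
        "Effect": "Allow",
        "Action": [
            "kinesis:DescribeStream",
            "kinesis:GetShardIterator",
            "kinesis:GetRecords",
            "kinesis:ListShards",
        ],
        "Resource": "arn:aws:kinesis:region:account-id:stream/stream-name",
    }
    transformation = {
        "Effect": "Allow",
        "Action": ["lambda:InvokeFunction", "lambda:GetFunctionConfiguration"],
        "Resource": [
            "arn:aws:lambda:region:account-id:function:function-name:function-version"
        ],
    }
    print(build_policy([kinesis_source, transformation]))
```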

If you turn on fine-grained access control (FGAC) on your cluster, then log in to OpenSearch Dashboards and add a role mapping. The role mapping allows the Kinesis Data Firehose role to send requests to OpenSearch Service.

To log in to OpenSearch Dashboards and add a role mapping, complete the following steps:

  1. Open Dashboards.
  2. Choose the Security tab.
  3. Choose Roles.
  4. Choose the all_access role.
  5. Choose the Mapped users tab.
  6. Choose Manage mapping.
  7. In the Backend roles section, enter the Kinesis Data Firehose role.
  8. Choose Map.
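
If you prefer to script the mapping, the following sketch prepares a request body that adds the Kinesis Data Firehose role to the existing backend roles without dropping the current entries. It rests on assumptions: the endpoint path is the role-mapping path of the OpenSearch security plugin REST API as commonly documented, so verify it for your domain version. The helper name and the example ARNs are hypothetical. The sketch only builds the body and doesn't send a request.

```python
import json

# Assumed security plugin endpoint; verify it for your domain version.
ROLE_MAPPING_PATH = "_plugins/_security/api/rolesmapping/all_access"


def merged_role_mapping(current: dict, firehose_role_arn: str) -> dict:
    """Return a role-mapping body that keeps existing entries and adds the role ARN."""
    backend_roles = list(current.get("backend_roles", []))
    if firehose_role_arn not in backend_roles:
        backend_roles.append(firehose_role_arn)
    return {
        "backend_roles": backend_roles,
        "hosts": list(current.get("hosts", [])),
        "users": list(current.get("users", [])),
    }


if __name__ == "__main__":
    # Hypothetical current mapping, as it might be read before the update.
    existing = {
        "backend_roles": ["arn:aws:iam::111122223333:role/admin"],
        "users": ["master-user"],
    }
    body = merged_role_mapping(
        existing, "arn:aws:iam::111122223333:role/firehose-delivery-role"
    )
    print("PUT", ROLE_MAPPING_PATH)
    print(json.dumps(body, indent=2))
```

Merging matters because a request that replaces the whole mapping with only the new role would remove existing administrators from the all_access role.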

AWS Lambda function invocation issues

Check the Kinesis Data Firehose ExecuteProcessing.Success metric and the Lambda Errors metric to confirm that Kinesis Data Firehose invokes your function. If Kinesis Data Firehose didn't invoke your Lambda function, then check the invocation time to see if it's beyond the timeout parameter. Your Lambda function might require a greater timeout value or need more memory to complete in time. For more information about invocation metrics, see Invocation metrics.

To identify the reasons that Kinesis Data Firehose doesn't invoke the Lambda function, check the /aws/lambda/lambda-function-name log group in CloudWatch Logs. If data transformation fails, then the failed records are delivered to the S3 bucket as a backup in the processing-failed folder. The records in your S3 bucket also contain the error message for the failed invocation. For more information about resolving Lambda invocation failures, see Data transformation failure handling.
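
After you download an object from the processing-failed folder, a short script can surface the error messages. This sketch assumes that each line of the object is one JSON document with errorCode, errorMessage, and base64-encoded rawData fields. Confirm the field names against your own backup objects, because they are an assumption here. The sample values are hypothetical.

```python
import base64
import json


def summarize_failed_records(lines):
    """Yield (error_code, error_message, raw_bytes) for each non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        raw = base64.b64decode(record.get("rawData", ""))
        yield record.get("errorCode"), record.get("errorMessage"), raw


if __name__ == "__main__":
    # Hypothetical failed record that mimics the assumed layout.
    sample = json.dumps({
        "attemptsMade": "4",
        "errorCode": "Lambda.FunctionError",
        "errorMessage": "The Lambda function was invoked but returned an error.",
        "rawData": base64.b64encode(b'{"id": 1}').decode("ascii"),
    })
    for code, message, raw in summarize_failed_records([sample]):
        print(code, "|", message, "|", raw.decode("utf-8"))
```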

OpenSearch Service domain health issues

Check the following metrics to make sure that OpenSearch Service is in good health:

  • CPU utilization: If this metric is consistently high, then the data node might not be able to respond to any requests or incoming data. You might need to scale your cluster.
  • JVM memory pressure: If the JVM memory pressure is consistently above 80%, then the cluster might be initiating memory circuit breaker exceptions. These exceptions can prevent the data from being indexed.
  • ClusterWriteBlockException: This indexing block occurs when your domain is under high JVM memory pressure or when more storage space is needed. If a data node doesn't have enough space, then new data can't be indexed.

For more information about troubleshooting OpenSearch Service issues, see Troubleshooting Amazon OpenSearch Service.
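
The checks above can be expressed as a small rule set over the latest metric values. This is a sketch under stated assumptions: it uses the CPUUtilization, JVMMemoryPressure, and ClusterIndexWritesBlocked metric names from the AWS/ES CloudWatch namespace, the 80% CPU threshold is an illustrative choice, and the function name is hypothetical. Only the 80% JVM memory pressure threshold comes from the text above.

```python
def domain_health_findings(metrics: dict,
                           cpu_threshold: float = 80.0,
                           jvm_threshold: float = 80.0) -> list:
    """Return findings for a mapping of metric name to most recent value."""
    findings = []
    if metrics.get("CPUUtilization", 0.0) >= cpu_threshold:
        findings.append("CPU utilization is high: consider scaling the cluster")
    if metrics.get("JVMMemoryPressure", 0.0) > jvm_threshold:
        findings.append("JVM memory pressure is above the threshold: indexing might fail")
    if metrics.get("ClusterIndexWritesBlocked", 0) >= 1:
        findings.append("Index writes are blocked: check JVM memory pressure and free storage")
    return findings


if __name__ == "__main__":
    # Hypothetical metric values for illustration.
    latest = {"CPUUtilization": 35.0, "JVMMemoryPressure": 92.0,
              "ClusterIndexWritesBlocked": 1}
    for finding in domain_health_findings(latest):
        print(finding)
```

A single high reading isn't conclusive. The text above refers to values that stay high consistently, so evaluate several datapoints before you scale the cluster.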

No incoming data

To confirm that there's incoming data for Kinesis Data Firehose, monitor the IncomingRecords and IncomingBytes metrics. A value of zero means that there are no records that reach Kinesis Data Firehose. For more information about the IncomingRecords and IncomingBytes metrics, see Data ingestion through direct PUT in Data ingestion metrics.

If the delivery stream uses Amazon Kinesis Data Streams as a source, then check the IncomingRecords and IncomingBytes metrics of the Kinesis data stream. These two metrics indicate incoming data. A value of zero confirms that there are no records that reach the streaming services.

Check the DataReadFromKinesisStream.Bytes and DataReadFromKinesisStream.Records metrics to verify if data is coming from Kinesis Data Streams to Kinesis Data Firehose. A value of zero can indicate a failure to deliver to OpenSearch Service rather than a failure between Kinesis Data Streams and Kinesis Data Firehose. For more information about the data metrics, see Data ingestion through Kinesis Data Streams in Data ingestion metrics.

You can also check to see if the PutRecord and PutRecordBatch API calls for Kinesis Data Firehose are called properly. If you don't see any incoming data flow metrics, then check the producer that performs the PUT operations. For more information about troubleshooting producer application issues, see Troubleshooting Amazon Kinesis Data Streams producers.
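
One producer-side check is whether PutRecordBatch calls partially fail. The call can succeed overall while individual records are rejected, which the response reports through FailedPutCount and a per-record ErrorCode. The following sketch picks out the records to retry. The function names are hypothetical, and the boto3 import is deferred so that the pure helper runs without the AWS SDK.

```python
def records_to_retry(records: list, response: dict) -> list:
    """Return the input records that the PutRecordBatch response marks as failed."""
    if response.get("FailedPutCount", 0) == 0:
        return []
    # RequestResponses has one entry per input record, in the same order.
    return [
        record
        for record, result in zip(records, response["RequestResponses"])
        if "ErrorCode" in result
    ]


def send_batch(delivery_stream_name: str, records: list) -> list:
    """Send one batch and return the failed records; requires boto3 and credentials."""
    import boto3  # third-party, imported lazily

    firehose = boto3.client("firehose")
    response = firehose.put_record_batch(
        DeliveryStreamName=delivery_stream_name, Records=records
    )
    return records_to_retry(records, response)


if __name__ == "__main__":
    # Hypothetical batch and response for illustration.
    batch = [{"Data": b"first\n"}, {"Data": b"second\n"}]
    fake_response = {
        "FailedPutCount": 1,
        "RequestResponses": [
            {"RecordId": "example-record-id"},
            {"ErrorCode": "ServiceUnavailableException",
             "ErrorMessage": "Slow down."},
        ],
    }
    print(records_to_retry(batch, fake_response))
```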

AWS OFFICIAL
Updated 5 months ago