How do I troubleshoot issues when passing environment variables to my Amazon ECS task?

I want to troubleshoot issues when passing environment variables to my Amazon Elastic Container Service (Amazon ECS) task.

Short description

You can pass environment variables to your Amazon ECS task in one of the following ways:

  • Pass the variables in an environment file (.env) that's stored in an Amazon Simple Storage Service (Amazon S3) bucket and referenced by the environmentFiles parameter.
  • Store the variable in Parameter Store, a capability of AWS Systems Manager.
  • Store the variable in your ECS task definition.
  • Store the variable in AWS Secrets Manager.

Note: It's a security best practice to use Parameter Store or Secrets Manager to store sensitive data that you pass as environment variables.

When you pass environment variables with one of the preceding methods, you might get the following errors:

Parameter Store

"Fetching secret data from SSM Parameter Store in region: AccessDeniedException: User: arn:aws:sts::123456789:assumed-role/ecsExecutionRole/f512996041234a63ac354214 is not authorized to perform: ssm:GetParameters on resource: arn:aws:ssm:ap-south-1:12345678:parameter/status code: 400, request id: e46b40ee-0a38-46da-aedd-05f23a41e861"

-or-

"ResourceInitializationError: unable to pull secrets or registry auth: execution resource retrieval failed: unable to retrieve secrets from ssm: service call has been retried 5 time(s): RequestCanceled"

Secrets Manager

"ResourceInitializationError error"

-or-

"AccessDenied error on Amazon Elastic Compute Cloud (Amazon EC2)"

To resolve these errors, see How do I troubleshoot issues related to AWS Secrets Manager secrets in Amazon ECS?

Amazon S3

"ResourceInitializationError: failed to download env files: file download command: non empty error stream"

You might face issues when you pass environment variables to your Amazon ECS tasks due to the following reasons:

  • Your Amazon ECS task execution role doesn't have the required AWS Identity and Access Management (IAM) permissions.
  • There are issues with your network configuration.
  • Your application is unable to read the environment variable.
  • The format of the variable in the container definition is incorrect.
  • The environment variable isn't automatically refreshed.

To troubleshoot the errors for Amazon ECS tasks that fail to start, use the AWSSupport-TroubleshootECSTaskFailedToStart runbook. Then, refer to the relevant troubleshooting steps for your issue.

Resolution

Important:

  • Use the AWSSupport-TroubleshootECSTaskFailedToStart runbook in the same AWS Region where your ECS cluster resources are located.
  • When using the runbook, you must use the most recently failed Task ID. If the failed task is part of an Amazon ECS service, then use the most recently failed task in the service. The failed task must be visible in ECS:DescribeTasks during the automation. By default, stopped ECS tasks are visible for 1 hour after entering the Stopped state. Using the most recently failed task ID prevents the task state cleanup from interrupting the analysis during the automation.

For instructions on how to initiate the runbook, see AWSSupport-TroubleshootECSTaskFailedToStart. Based on the output of the automation, use one of the following manual troubleshooting steps.

Your Amazon ECS task execution role doesn't have the required IAM permissions

If you're using environment variables that are stored in Parameter Store or Secrets Manager, then review AWS CloudTrail events for either of the following API calls:

GetParameters for Parameter Store

-or-

GetSecretValue for Secrets Manager

If you see an AccessDenied error for the task execution role in the CloudTrail events, then manually add the required permissions as an inline policy to your ECS task execution IAM role. You can also create a customer managed policy and attach the policy to your ECS task execution role.
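The CloudTrail review described above can be partly automated. The following is a minimal sketch, assuming that you saved the JSON output of the aws cloudtrail lookup-events command to a file and loaded it into a dictionary. The function name find_access_denied is hypothetical. The field names (Events, CloudTrailEvent, eventName, errorCode, errorMessage, userIdentity) follow the CloudTrail record format, but verify them against your own events.

```python
import json

# API calls that are made with the task execution role when the task starts.
WATCHED_EVENTS = {"GetParameters", "GetSecretValue"}


def find_access_denied(lookup_output: dict) -> list:
    """Return the denied GetParameters and GetSecretValue calls.

    lookup_output is the parsed JSON output of `aws cloudtrail lookup-events`
    (an assumption about how you exported the events).
    """
    denied = []
    for event in lookup_output.get("Events", []):
        # Each entry carries the full CloudTrail record as a JSON string.
        record = json.loads(event["CloudTrailEvent"])
        if record.get("eventName") not in WATCHED_EVENTS:
            continue
        # Matches both "AccessDenied" and "AccessDeniedException".
        if "AccessDenied" not in record.get("errorCode", ""):
            continue
        denied.append({
            "eventName": record["eventName"],
            "principal": record.get("userIdentity", {}).get("arn", "unknown"),
            "errorMessage": record.get("errorMessage", ""),
        })
    return denied
```

If the returned list names your task execution role as the principal, then the role is likely missing the permissions shown in the following policies.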

If you're using Secrets Manager, then add the following permissions to your task execution role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:secretsmanager:example-region:1111222233334444:secret:example-secret",
        "arn:aws:kms:example-region:1111222233334444:key/example-key-id"
      ]
    }
  ]
}

If you're using Parameter Store, then add the following permissions to your task execution role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssm:GetParameters",
        "secretsmanager:GetSecretValue",
        "kms:Decrypt"
      ],
      "Resource": [
        "arn:aws:ssm:example-region:1111222233334444:parameter/example-parameter",
        "arn:aws:secretsmanager:example-region:1111222233334444:secret:example-secret",
        "arn:aws:kms:example-region:1111222233334444:key/example-key-id"
      ]
    }
  ]
}

You can use an S3 bucket for storing the environment variable as a .env file. However, you must manually add the following permissions as an inline policy to the task execution role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket/example-folder/example-env-file"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::example-bucket"
      ]
    }
  ]
}
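If you manage several environment files, then it can help to generate the preceding S3 policy instead of editing it by hand. The following is a small sketch that builds the same two statements from a bucket name and an object key. The function name env_file_policy and the example bucket and key are placeholders, not values from your account.

```python
import json


def env_file_policy(bucket: str, key: str) -> dict:
    """Build an inline policy that allows reading one environment file."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read access to the environment file object itself.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{key}"],
            },
            {
                # Bucket-level permission, scoped to the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
        ],
    }


# Placeholder names; replace them with your own bucket and key.
print(json.dumps(env_file_policy("example-bucket", "example-folder/example-env-file"), indent=2))
```

Note that s3:GetObject applies to the object ARN and s3:GetBucketLocation applies to the bucket ARN, which is why the sketch keeps them in separate statements.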

There are issues with your network configuration

If your ECS task is in a private subnet, then verify the following points:

  • Be sure that the security group for the task or service allows egress traffic on port 443.
  • If you use a VPC endpoint, then be sure that the network access control list (ACL) allows egress traffic on port 443.
  • Verify connectivity to the Systems Manager, Secrets Manager, and Amazon S3 endpoints on port 443. To do this, use the telnet command.
  • If you use a NAT gateway, then be sure that your task has a default route to the NAT gateway.
  • Define the VPC endpoints for your tasks. Verify that you have the required VPC endpoints for Secrets Manager/Systems Manager Parameter Store and Amazon S3.
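If telnet isn't installed on the host that you test from, then a short script can perform the same TCP check on port 443. The following is a minimal sketch that uses only the Python standard library. The function names are hypothetical, and the hostnames assume the standard regional endpoint pattern for commercial AWS Regions. Other partitions use different domain suffixes.

```python
import socket


def endpoint_hosts(region: str) -> list:
    """Regional endpoints for Parameter Store, Secrets Manager, and Amazon S3.

    Assumes the <service>.<region>.amazonaws.com naming pattern.
    """
    return [
        f"ssm.{region}.amazonaws.com",
        f"secretsmanager.{region}.amazonaws.com",
        f"s3.{region}.amazonaws.com",
    ]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_region(region: str) -> dict:
    """Map each endpoint hostname to its connectivity result."""
    return {host: can_connect(host) for host in endpoint_hosts(region)}
```

For example, run check_region("ap-south-1") from a host in the same subnet as the task. A False result for a host points to a DNS, routing, security group, or network ACL issue for that endpoint. The check confirms only that a TCP connection opens. It doesn't validate IAM permissions.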

If you use a VPC endpoint, then verify the following points:

  • The security group for your VPC endpoint allows inbound traffic on port 443 from the task or service.
  • The VPC endpoint is associated with the correct VPC.
  • The VPC attributes enableDnsHostnames and enableDnsSupport are turned on.

If your ECS task is in a public subnet, then verify the following points:

  • A public IP address is activated for the task.
  • The security group for your task allows outbound access on port 443 to the internet.
  • The network ACL configuration allows all traffic to flow in and out of the subnets to the internet.

Your application is unable to read the environment variable

To check whether the correct environment variables are populated inside your task container, do the following:

  1. List out all the environment variables that are exposed inside the container.
  2. Verify that this list includes the environment variables that you defined in the task definition or the .env file in S3.

If you're using the Amazon EC2 or AWS Fargate launch types, then it's a best practice to use the ECS Exec feature. You can use this feature to run commands in or get a shell to a container running on an Amazon EC2 instance or Fargate. After enabling this feature, run the following command to interact with your container:

aws ecs execute-command --cluster example-cluster \
--task example-task-id \
--container example-container \
--interactive \
--command "/bin/sh"

If you use the Amazon EC2 launch type, then you can also use the docker exec command to interact with your container. Connect to the container instance where your task is running. Then, run the following Docker command to find the container ID of your task container:

docker container ps

To interact with the container, run the following command:

docker exec -it example-container-id bash

Note: Use the shell that's available in your container, such as sh or bash.

After you establish connection with the container, run the env command on your container to get the complete list of your environment variables. Review this list to make sure that the environment variables that you defined in the task definition or .env file are present.
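For a long list of variables, the comparison can be scripted. The following is a rough sketch, assuming that you copied the output of the env command into a string on your workstation. The function names and the sample variable names (foo, DB_HOST) are illustrative only. Values that contain newlines aren't handled.

```python
def parse_env_output(text: str) -> dict:
    """Parse the output of the env command into a name-to-value mapping."""
    variables = {}
    for line in text.splitlines():
        # Split on the first "=" only, because values can contain "=".
        name, sep, value = line.partition("=")
        if sep and name:
            variables[name] = value
    return variables


def missing_variables(expected: list, env_output: str) -> list:
    """Return the expected variable names that are absent from the output."""
    present = parse_env_output(env_output)
    return [name for name in expected if name not in present]


# Illustrative output; paste the real output from your container instead.
sample_output = "PATH=/usr/local/bin:/usr/bin\nfoo=bar\nHOSTNAME=example-host\n"
print(missing_variables(["foo", "DB_HOST"], sample_output))  # prints ['DB_HOST']
```

Any name that the script prints was defined in your task definition or .env file but didn't reach the container.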

The format of the variable in the container definition is incorrect

When you define environment variables inside the container definition, define the environment variables as KeyValuePair objects:

"environment": [
  {
    "name": "foo",
    "value": "bar"
  }
]

This format applies to the container definition only. In your .env files, define each variable on its own line in VARIABLE=VALUE format, for example foo=bar.
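To catch common mistakes in the environment list of a container definition before you register it, you can run a quick structural check. The following is a minimal sketch, not an official validator. The function name validate_environment is hypothetical, and it checks only that each entry is an object with string name and value fields.

```python
def validate_environment(entries) -> list:
    """Return a list of problems found in a container's environment list."""
    if not isinstance(entries, list):
        return ["environment must be a list of objects"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: expected an object")
            continue
        # Both fields must be present and must be strings.
        for field in ("name", "value"):
            if not isinstance(entry.get(field), str):
                problems.append(f"entry {index}: '{field}' must be a string")
        # A map such as {"foo": "bar"} is a frequent mistake.
        extra = sorted(set(entry) - {"name", "value"})
        if extra:
            problems.append(f"entry {index}: unexpected keys {extra}")
    return problems


print(validate_environment([{"name": "foo", "value": "bar"}]))  # prints []
```

An empty list means that the structure matches the KeyValuePair shape. It doesn't confirm that the values themselves are what your application expects.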

The environment variable isn't automatically refreshed

When you update an environment variable in your .env file, the variable isn't automatically refreshed in your running container.
To inject the updated values of the environment variables into your task, update the service by running the following command:

aws ecs update-service --cluster example-cluster --service example-service --force-new-deployment

If you use environment variables in your container definition, then you must create a new task definition to refresh the updated environment variables. With this new task definition, you can create a new task or update your ECS service:

aws ecs update-service --cluster example-cluster --service example-service --task-definition <family:revision>

Note: Keep the following points in mind when you pass environment variables to your task:

  • If you specify environment variables with the environment parameter in a container definition, then they take precedence over the variables contained within an environment file.
  • If you specify multiple environment files and they contain the same variable, then the files are processed in the order of entry. The first value of the variable is used, and subsequent values of duplicate variables are ignored. It's a best practice to use unique variable names.
  • If you specify an environment file as a container override, then the file is used. Any other environment files specified in a container definition are ignored.
  • The environment variables are available to the PID 1 process in a container from the file /proc/1/environ. If the container runs multiple processes or an init process, such as a wrapper script or supervisord, then the environment variables are unavailable to non-PID 1 processes.
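The precedence rules in the preceding list can be modeled in a few lines. The following sketch is an illustration of those rules, not the Amazon ECS implementation. It doesn't model container overrides. The function names and the sample variables are hypothetical. The second function parses the NUL-separated content of /proc/1/environ so that you can compare the result with what the PID 1 process received.

```python
def resolve_environment(container_env: dict, env_files: list) -> dict:
    """Merge variables by following the precedence rules described above.

    env_files is a list of name-to-value mappings in the order of entry.
    """
    resolved = {}
    for file_vars in env_files:
        for name, value in file_vars.items():
            # The first file that defines a variable wins.
            resolved.setdefault(name, value)
    # The environment parameter takes precedence over environment files.
    resolved.update(container_env)
    return resolved


def parse_proc_environ(raw: bytes) -> dict:
    """Parse the NUL-separated bytes read from /proc/1/environ."""
    variables = {}
    for entry in raw.split(b"\0"):
        name, sep, value = entry.decode("utf-8", errors="replace").partition("=")
        if sep and name:
            variables[name] = value
    return variables


files = [
    {"LOG_LEVEL": "info", "REGION": "ap-south-1"},
    {"LOG_LEVEL": "debug", "STAGE": "test"},
]
print(resolve_environment({"STAGE": "prod"}, files))
# prints {'LOG_LEVEL': 'info', 'REGION': 'ap-south-1', 'STAGE': 'prod'}
```

In this example, LOG_LEVEL keeps the value from the first file, and STAGE takes the value from the environment parameter. Inside a container, you can pass the bytes from open("/proc/1/environ", "rb").read() to parse_proc_environ, provided that Python is available in the image and you have permission to read the file.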

Related information

Passing environment variables to a container