How can I get my Linux instances to start when I use the StartInstance API in AWS OpsWorks?

Last updated: 2019-06-06

My Amazon Elastic Compute Cloud (Amazon EC2) Linux instances don't start when I use the StartInstance API call in AWS OpsWorks. How can I get my instances to start?

Short Description

When you issue the StartInstance API call, OpsWorks follows this process:

  1. Builds instance user data based on the operating system type.
  2. Initiates an Amazon EC2 RunInstances call, which can fail if your account has reached its Amazon EC2 limits.
  3. Runs user data during the instance boot and installs the OpsWorks agent, which is downloaded from Amazon Simple Storage Service (Amazon S3). This step fails if the instance can't connect to the internet.
    Note: For newer operating system versions (Ubuntu 16.04 LTS, CentOS 7), systemd manages the agent. Monit is used for monitoring and managing the agent on other supported operating system versions.
  4. Continuously polls for new commands and executes these commands after the instance is online and the agent is running.

A failure at any of these stages can cause an instance to be in the start_failed state.

Resolution

To get your Linux instances to start, complete the following troubleshooting steps:

1.    To find errors from the execution of the instance's user data script, review the user-data.log file in /var/log/aws/opsworks/.
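
As a rough sketch of step 1, a small helper can surface lines in the user data log that look like failures. The function name scan_log_for_errors and the keyword pattern are illustrative assumptions, not an OpsWorks convention. Only the log path comes from this article.

```shell
# Print lines (with line numbers) that look like errors in a log file.
# The keyword pattern is an assumption; adjust it to what your logs contain.
scan_log_for_errors() {
  log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 2
  fi
  # -i: case-insensitive, -n: show line numbers, -E: extended regex
  grep -inE 'error|fail|fatal' "$log_file"
}

# Example (run on the instance, typically with sudo):
#   scan_log_for_errors /var/log/aws/opsworks/user-data.log
```

The function returns a non-zero status when no matching lines are found. That doesn't necessarily mean the script succeeded, so read the end of the log as well.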

2.    To verify that the OpsWorks agent is installed, check installer.log for system-related issues that could have stopped the installation. If installer.log doesn't exist, check the boot logs to determine whether the agent was installed.

Note: The most common cause of failed agent installations is an incorrectly configured Amazon Virtual Private Cloud (Amazon VPC). Be sure that instances in your VPC can access the internet. Instances that can't access the internet fail when trying to download and install the agent.
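
The installer log check in step 2 could be scripted along these lines. This is a minimal sketch that assumes installer.log is in the same /var/log/aws/opsworks directory as user-data.log. The function name check_installer_log is hypothetical.

```shell
# Report whether the agent installer log exists and is non-empty.
# Assumption: installer.log sits in the same directory as user-data.log.
check_installer_log() {
  log_dir="${1:-/var/log/aws/opsworks}"
  if [ -s "$log_dir/installer.log" ]; then
    echo "found: $log_dir/installer.log"
  else
    echo "missing or empty: $log_dir/installer.log (check the boot logs)"
    return 1
  fi
}

# Example (run on the instance):
#   check_installer_log
```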

3.    To see how many agent processes are running, run the following command. The count typically includes the grep process itself:

$ ps -ef | grep opsworks-agent | wc -l
4
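
If you want a count that leaves out the grep process, one common variation is the bracket pattern sketched below. The function name count_agent_processes is an illustrative assumption.

```shell
# Count processes whose command line contains "opsworks-agent".
# The [o] bracket keeps grep's own command line from matching the pattern.
count_agent_processes() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  ps -eo args= | grep -c '[o]psworks-agent' || true
}

# Example (run on the instance):
#   count_agent_processes
```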

4.    To verify that the agent is running, run the command that's appropriate for your operating system.

If the agent is running on an operating system version that doesn't use systemd, run the following command to verify that Monit is running:

$ sudo service monit status
monit (pid 1769) is running...

OR

If the agent is running on a more recent operating system version that uses systemd, inspect the journalctl logs and check the status of the agent service:

$ systemctl status opsworks-agent | grep Active
Active: active (running) since Tue 2016-06-28 17:47:03 UTC; 37s ago
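
For agents that systemd manages, the same check can be wrapped in two small helpers, sketched here. The function names are hypothetical, the unit name opsworks-agent is taken from the systemctl example above, and the 50-line limit is arbitrary.

```shell
# Return 0 only if systemd reports the unit as active.
# The unit name defaults to opsworks-agent, as in the systemctl example.
agent_unit_is_active() {
  systemctl is-active --quiet "${1:-opsworks-agent}"
}

# Show the most recent journal entries for the unit (50 lines is arbitrary).
agent_recent_journal() {
  journalctl -u "${1:-opsworks-agent}" -n 50 --no-pager
}

# Example (run on the instance):
#   agent_unit_is_active || agent_recent_journal
```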

Note: To find out why the agent isn't running, check the logs at /var/log/aws/opsworks on Linux instances, or at C:\ProgramData\OpsWorksAgent for stacks that use Windows.

5.    If Monit isn't running, check /var/log/messages or /var/log/system.log to troubleshoot the cause.

6.    If the instance is still in the start_failed state after completing steps 1-5, check opsworks-agent.process_command.log. This log shows whether the agent received the setup command to process. Also, verify that internal firewalls and routing tables on the instance or the on-premises system allow the agent to reach the OpsWorks endpoint.
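
To test outbound connectivity from the instance, a TCP probe like the following sketch can help. It relies on bash's /dev/tcp redirection and the timeout utility. The function name can_connect is hypothetical, and the endpoint host in the usage comment is an assumption, so substitute the endpoint for your stack's Region.

```shell
# Return 0 if a TCP connection to host:port succeeds within the timeout.
# Arguments: host, port (default 443), timeout in seconds (default 5).
# Uses bash's /dev/tcp redirection, so bash must be installed.
can_connect() {
  host="$1"
  port="${2:-443}"
  # Inside bash -c, $0 is the host and $1 is the port.
  timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example; the endpoint below is an assumption:
#   can_connect opsworks.us-east-1.amazonaws.com 443 && echo reachable
```

A successful TCP connection shows only that routing and firewalls allow the traffic. It doesn't confirm that the agent can authenticate with the service.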

7.    If the instance is still in the start_failed state after completing step 6, look for common errors that can prevent your instances from starting by running the describe-service-errors command:

$ aws opsworks describe-service-errors --instance-id 63133710-806b-40e7-bbd1-8eb3ccd8c20b
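
To make the output easier to scan, the same call can be wrapped with a --query filter, as in this sketch. The function name show_service_errors is hypothetical. The selected fields (CreatedAt, Type, Message) are assumed to match the ServiceErrors structure that the command returns.

```shell
# Show the timestamp, type, and message of each service error for an instance.
show_service_errors() {
  if [ -z "${1:-}" ]; then
    echo "usage: show_service_errors <opsworks-instance-id>" >&2
    return 2
  fi
  aws opsworks describe-service-errors \
    --instance-id "$1" \
    --query 'ServiceErrors[].[CreatedAt,Type,Message]' \
    --output table
}

# Example:
#   show_service_errors 63133710-806b-40e7-bbd1-8eb3ccd8c20b
```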
