Why is my AWS Glue workflow not triggered?
Last updated: 2021-08-02
I have created an AWS Glue workflow, but it's not starting.
Some of the constituent jobs/crawlers in my AWS Glue workflow are not running.
If your AWS Glue workflow or its components aren't triggered, then confirm the following:
- If a scheduled trigger is used as the source trigger, then confirm that it's activated. Be sure that the schedule is specified in UTC, and that the cron expression includes all the required fields.
- If an external component is triggering the source trigger, then confirm that the external component isn't malfunctioning.
- Be sure that the predicate condition used to trigger a component isn't met by an agent external to the workflow.
- If a component is part of a dependency chain, then be sure that the upstream jobs/crawlers are started as part of the same workflow by a single source trigger.
Workflow not starting with a time-based trigger
If the workflow's source trigger is scheduled, then check the following:
- Be sure that the trigger is in the ACTIVATED state and not in the CREATED state. If the trigger isn't in the ACTIVATED state, then activate the trigger manually.
- Be sure that the cron expression used in the schedule for a scheduled trigger is in Coordinated Universal Time (UTC). Be sure that the fields in the cron expression reflect your local time converted to UTC. Also, check that the cron expression includes all the required fields in the correct format. For more information, see Time-based schedules for jobs and crawlers.
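The checks above can be sketched in Python. This is a minimal sketch, not an official validator: the function names are hypothetical, and the trigger dictionary is assumed to have the shape that the boto3 call glue.get_trigger(Name=...)["Trigger"] returns. The UTC helper takes a fixed offset, so it ignores daylight saving time and only handles daily schedules.

```python
import re
from datetime import datetime, timedelta, timezone

# AWS Glue schedule expressions use six fields:
# minutes, hours, day-of-month, month, day-of-week, year.
CRON_PATTERN = re.compile(r"^cron\((.*)\)$")


def cron_problems(schedule):
    """Return a list of problems found in a Glue schedule expression."""
    match = CRON_PATTERN.match(schedule.strip())
    if match is None:
        return ["expression must be wrapped in cron(...)"]
    fields = match.group(1).split()
    if len(fields) != 6:
        return [f"expected 6 fields, found {len(fields)}"]
    # Day-of-month and day-of-week can't both carry a value.
    if fields[2] != "?" and fields[4] != "?":
        return ["use ? in either day-of-month or day-of-week"]
    return []


def scheduled_trigger_problems(trigger):
    """Check the state and schedule of a trigger dictionary (assumed shape)."""
    problems = []
    if trigger.get("State") != "ACTIVATED":
        problems.append(f"state is {trigger.get('State')}, not ACTIVATED")
    return problems + cron_problems(trigger.get("Schedule", ""))


def daily_cron_in_utc(local_hour, local_minute, utc_offset_hours):
    """Build a daily cron expression in UTC from a local time and fixed offset."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime(2000, 1, 1, local_hour, local_minute, tzinfo=local_zone)
    utc = local.astimezone(timezone.utc)
    return f"cron({utc.minute} {utc.hour} * * ? *)"
```

For example, daily_cron_in_utc(9, 0, -5) converts 09:00 at UTC-5 to cron(0 14 * * ? *). If scheduled_trigger_problems reports that the trigger isn't in the ACTIVATED state, then you can activate it from the console or with the boto3 call glue.start_trigger(Name=...).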
Workflow not starting with an on-demand trigger
If the source trigger is on-demand, and there is an upstream entity triggering it using the StartWorkflowRun API call, then be sure that the calling entity functions correctly.
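One way to confirm that the calling entity works is to have it surface StartWorkflowRun failures instead of ignoring them. The following is a minimal sketch under assumptions: start_workflow is a hypothetical helper, and glue_client is assumed to be a boto3 Glue client (or any object with a compatible start_workflow_run method).

```python
def start_workflow(glue_client, workflow_name):
    """Start a workflow run and return its run ID.

    Re-raises any failure with the workflow name attached, so that a
    broken caller (missing permissions, wrong name) is visible in logs.
    """
    try:
        response = glue_client.start_workflow_run(Name=workflow_name)
    except Exception as error:
        raise RuntimeError(
            f"StartWorkflowRun failed for {workflow_name}: {error}"
        ) from error
    return response["RunId"]
```

With boto3 installed and credentials configured, the call might look like start_workflow(boto3.client("glue"), "my-workflow"), where my-workflow is a placeholder name.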
Workflow not starting with a conditional trigger
Be sure that the predicate conditions in the trigger aren't met by an agent external to the workflow. If the conditions are met by an external agent, then the trigger isn't fired. A conditional trigger fires only if the job or crawler that it watches was itself started by a trigger.
For example, suppose that the following conditions are true:
- You have a workflow with a job JOB_MAIN that's triggered by the trigger TEST_TR.
- The trigger TEST_TR is dependent on the completion of another job JOB_DEP that's not part of the current workflow.
In this case, even if JOB_DEP completes successfully and the predicate logic of TEST_TR is met, the job JOB_MAIN isn't fired. This is because the predicate condition is met by an agent that's not part of the same workflow.
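To spot this situation, you can compare what each conditional trigger watches with what the workflow's triggers start. The following is a rough sketch: externally_watched is a hypothetical helper, and triggers is assumed to be a list of trigger definitions for a single workflow in the shape that boto3 returns (Name, Actions, and Predicate with Conditions).

```python
def _target(entry):
    """Identify the job or crawler named in an action or a condition."""
    if "JobName" in entry:
        return ("JOB", entry["JobName"])
    return ("CRAWLER", entry.get("CrawlerName"))


def externally_watched(triggers):
    """Map each trigger to watched jobs/crawlers that no listed trigger starts."""
    started = set()
    for trigger in triggers:
        for action in trigger.get("Actions", []):
            started.add(_target(action))

    findings = {}
    for trigger in triggers:
        conditions = trigger.get("Predicate", {}).get("Conditions", [])
        missing = [
            _target(condition)[1]
            for condition in conditions
            if _target(condition) not in started
        ]
        if missing:
            findings[trigger["Name"]] = missing
    return findings
```

For the example above, the result is {"TEST_TR": ["JOB_DEP"]}, because no trigger in the workflow starts JOB_DEP.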
Workflow not starting for a component job/crawler that's part of a dependency chain
Check if the constituent job/crawler depends on the completion of an upstream job/crawler that is also started by a trigger. A dependent job/crawler is started only if the job/crawler that's completed was started by a trigger. Make sure that all jobs/crawlers in a dependency chain are descendants of a single scheduled or on-demand trigger.
For example, suppose that the following conditions are true:
- Your workflow starts with a trigger TEST_TR1 that starts the job JOB_1.
- Another trigger TEST_TR2 depends on the completion of JOB_1 to start the job JOB_2.
In this case, TEST_TR2 starts JOB_2 when the predicate conditions for TEST_TR2 are met.
However, if JOB_1 is run on-demand and not started by TEST_TR1, then TEST_TR2 doesn't start JOB_2 even if the predicate conditions for TEST_TR2 are met.
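To check how an upstream job was started, you can inspect its recent runs. The following sketch assumes that job_runs is the JobRuns list from the boto3 call glue.get_job_runs(JobName="JOB_1"), and that a run started outside a trigger has a missing or different TriggerName value. The helper name is hypothetical.

```python
def runs_not_started_by(job_runs, trigger_name):
    """Return the IDs of job runs that the given trigger didn't start."""
    return [
        run["Id"]
        for run in job_runs
        if run.get("TriggerName") != trigger_name
    ]
```

If a recent run of JOB_1 appears in the output of runs_not_started_by(job_runs, "TEST_TR1"), then that run can't satisfy TEST_TR2, and JOB_2 isn't expected to start from it.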