How do I troubleshoot my Amazon MWAA environment that's stuck in the "Creating" state?
Last updated: 2022-03-02
I tried to create an Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment, but it's stuck in the "Creating" state.
Run a troubleshooting script to verify that the prerequisites for the Amazon MWAA environment, such as the required AWS Identity and Access Management (IAM) role permissions and Amazon Virtual Private Cloud (Amazon VPC) setup, are met. For more information, see the Verify environment script in AWS Support Tools on GitHub.
If your Amazon MWAA environment is stuck in the "Creating" state for 30 minutes or less, then the issue might be due to missing IAM permissions for other AWS services, such as Amazon Simple Storage Service (Amazon S3), Amazon CloudWatch, Amazon Simple Queue Service (Amazon SQS), Amazon Elastic Container Registry (Amazon ECR), and AWS Key Management Service (AWS KMS). Be sure that your execution role and service-linked role have the required permissions. If you're using a customer managed key, be sure to update the customer managed key policy as well. For troubleshooting steps, see I tried to create an environment but it shows the status as "Create failed".
If your environment is stuck for more than 30 minutes in the "Creating" state, then the issue might be related to the networking configuration. The root cause of the issue and the appropriate resolution depend on your networking setup.
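The two cases above can be turned into a rough first-pass triage. The following is a minimal sketch, not an official tool: the function name `creating_hint` and the returned labels are hypothetical, and the inputs are assumed to come from the `Status` and `CreatedAt` fields that the Amazon MWAA GetEnvironment API returns. No AWS call is made here.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the article: beyond 30 minutes, suspect networking.
NETWORK_SUSPECT_AFTER = timedelta(minutes=30)


def creating_hint(status, created_at, now=None):
    """Return a hypothetical triage label for an environment.

    status:     environment status string, for example "CREATING"
    created_at: timezone-aware datetime when creation started
    now:        optional timezone-aware datetime (defaults to the current UTC time)
    """
    now = now or datetime.now(timezone.utc)
    if status != "CREATING":
        return "not-creating"
    if now - created_at > NETWORK_SUSPECT_AFTER:
        return "check-networking"
    return "check-iam-permissions"
```

The labels are only a starting point for investigation. Either cause can be present regardless of how long the environment has been in the "Creating" state.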
Your network configuration lacks the route to AWS services or the internet
To resolve this issue, based on the type of routing you choose, verify that the network configuration meets the respective prerequisites for the environment:
- Public routing: Be sure that your Amazon VPC infrastructure has two public and two private subnets. Public subnets get public IP addresses and have a default route to the internet gateway. Private subnets get only private IP addresses and have no route to the internet gateway. Instead, they have a default route to the NAT gateway. For more information, see Public routing over the internet. Typically, the network flow with public routing looks similar to the following:
Private subnet -> default route to NAT gateway -> NAT gateway associated with the public subnet -> public subnet -> default route to the internet gateway -> internet
- Private routing: An Amazon VPC without internet access needs additional VPC service endpoints to use Apache Airflow on Amazon MWAA. These Amazon VPC endpoints include Amazon S3, monitoring, ecr.dkr, ecr.api, logs, sqs, kms, airflow.api, airflow.env, and airflow.ops. For more information, see Creating the required VPC service endpoints in an Amazon VPC with private routing and Private routing without internet access. Be sure that the VPC endpoints have private DNS enabled. Verify that the endpoints are associated with the environment's subnets and security group. Also, be sure that the VPC endpoint policy for each endpoint is configured to allow full access to the endpoint.
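For private routing, the endpoint list above can be compared against the endpoints that already exist in the VPC. The following is a minimal sketch under stated assumptions: the helper name `missing_endpoints` is hypothetical, and its input is assumed to be the `ServiceName` values collected from the Amazon EC2 DescribeVpcEndpoints response for your VPC. The sketch only compares names. It does not check private DNS, subnet or security group association, or endpoint policies.

```python
# Service suffixes taken from the list in the article.
REQUIRED_SUFFIXES = [
    "s3", "monitoring", "ecr.dkr", "ecr.api", "logs",
    "sqs", "kms", "airflow.api", "airflow.env", "airflow.ops",
]


def missing_endpoints(region, existing_service_names):
    """Return the required endpoint service names not found in the VPC.

    region:                 AWS Region code, for example "us-east-1"
    existing_service_names: iterable of ServiceName strings already present
    """
    required = {f"com.amazonaws.{region}.{suffix}" for suffix in REQUIRED_SUFFIXES}
    return sorted(required - set(existing_service_names))
```

An empty result means that every listed service has an endpoint by name. The remaining checks in the bullet above still need to be verified separately.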
The security group or network access control list (ACL) restricts the network traffic
To resolve this issue, verify that the security group has a self-referencing inbound rule that allows either all traffic or, at a minimum, HTTPS on port 443 and TCP on port 5432. The security group must specify an outbound rule for all traffic. The network ACL must have an inbound rule and an outbound rule that allow all traffic. For an example, see Example ACLs.
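The inbound rule check can be sketched as follows. This is an illustration rather than a definitive validator: the function names are hypothetical, and the input is assumed to be one entry of the `SecurityGroups` list returned by the Amazon EC2 DescribeSecurityGroups API (fields `GroupId`, `IpPermissions`, `IpProtocol`, `FromPort`, `ToPort`, and `UserIdGroupPairs`). It reads the requirement as a self-referencing rule that covers ports 443 and 5432, either through an all-traffic rule or through TCP port ranges. It does not check outbound rules or network ACLs.

```python
def covers_port(rule, port):
    """True if an inbound rule allows TCP traffic on the given port."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return rule.get("FromPort", -1) <= port <= rule.get("ToPort", -1)


def is_self_referencing(rule, group_id):
    """True if the rule's source is the security group itself."""
    pairs = rule.get("UserIdGroupPairs", [])
    return any(pair.get("GroupId") == group_id for pair in pairs)


def inbound_ok(group):
    """True if self-referencing inbound rules cover ports 443 and 5432."""
    group_id = group["GroupId"]
    self_rules = [
        rule for rule in group.get("IpPermissions", [])
        if is_self_referencing(rule, group_id)
    ]
    return all(
        any(covers_port(rule, port) for rule in self_rules)
        for port in (443, 5432)
    )
```

A `False` result points to a missing or too-narrow self-referencing rule, which is one of the causes described above.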
Downloading the container image from Amazon ECR has failed
If you use an Amazon VPC without internet access, then be sure that you created an Amazon S3 gateway endpoint and granted the minimum required permissions to Amazon ECR to access Amazon S3 in that Region.
For more information about troubleshooting issues related to an Amazon VPC network with public or private routing, see I tried to create an environment and it's stuck in the "Creating" state.