How do I install custom packages in my Amazon MWAA environment?
Last updated: 2022-03-03
I want to install custom packages using plugins.zip in my Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment.
You can install Python libraries in Amazon MWAA using the requirements.txt and plugins.zip files. When you install packages using the requirements.txt file, the packages are installed from PyPI (pypi.org) by default. However, if you need to ship libraries (.whl files) with compiled artifacts, then you can install these Python wheels using the plugins.zip file.
Using the plugins.zip file, you can install custom Apache Airflow operators, hooks, sensors, or interfaces by adding their files to plugins.zip. The contents of this file are written to /usr/local/airflow/plugins/ on the backend Amazon ECS Fargate containers. You can also use plugins.zip to export environment variables and to ship authentication and configuration files, such as .crt and .yaml files.
Install libraries using Python Wheels
A Python wheel is a package file with compiled artifacts. You can install this package by placing the .whl file in plugins.zip and then referring to the file in requirements.txt. When you update the environment after adding the .whl file to plugins.zip, the .whl file is shipped to /usr/local/airflow/plugins/ in the underlying Amazon Elastic Container Service (Amazon ECS) Fargate containers. To install Python wheels, do the following:
1. Create the plugins.zip file:
Create a local Airflow plugins directory on your system by running the following command:
$ mkdir plugins
Copy the .whl file into the plugins directory that you created.
Change to your local Airflow plugins directory, and then set the permissions of its contents to 755 by running the following commands:
$ cd plugins
plugins$ chmod -R 755 .
Zip the contents of your plugins folder by running the following command:
plugins$ zip -r plugins.zip .
2. Include the path of the .whl file in the requirements.txt file (example: /usr/local/airflow/plugins/example_wheel.whl).
Note: Remember to turn on versioning for your Amazon Simple Storage Service (Amazon S3) bucket.
3. Upload the plugins.zip and requirements.txt files into an S3 bucket (example: s3://example-bucket/plugins.zip).
4. To edit the environment, open the Environments page on the Amazon MWAA console.
5. Select the environment from the list, and then choose Edit.
6. On the Specify details page, in the DAG code in Amazon S3 section, do either of the following based on your use case:
Note: If you're uploading the plugins.zip or requirements.txt file into your environment for the first time, then select the file and then choose the version. If you already uploaded the file and recently updated it, then you can skip selecting the file and choose only the version.
Choose Browse S3 under the Plugins file - optional field.
Select the plugins.zip file in your Amazon S3 bucket, and then choose Choose.
For Choose a version under Plugins file - optional, select the latest version of the file that you've uploaded.
Choose Browse S3 under Requirements file - optional.
Select the requirements.txt file in your Amazon S3 bucket, and then choose Choose.
For Choose a version under Requirements file - optional, select the latest version of the file that you've uploaded.
7. Choose Next, and then choose Save.
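The packaging part of the steps above (creating plugins.zip and the matching requirements.txt entry) can also be scripted. The following is a minimal sketch in Python that uses only the standard library. The helper name build_plugins_bundle and the dist output folder are assumptions made for illustration, and the empty example_wheel.whl is a placeholder, not an installable wheel. Uploading to Amazon S3 and updating the environment still happen as described in steps 3 through 7.

```python
import os
import tempfile
import zipfile
from pathlib import Path

# Location where the article says plugins.zip contents end up on the containers.
MWAA_PLUGINS_DIR = "/usr/local/airflow/plugins"


def build_plugins_bundle(wheel_paths, out_dir):
    """Write plugins.zip and requirements.txt into out_dir (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    zip_path = out_dir / "plugins.zip"
    req_path = out_dir / "requirements.txt"

    requirement_lines = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for wheel in map(Path, wheel_paths):
            # Same intent as the chmod step: make the file readable and executable.
            os.chmod(wheel, 0o755)
            # Store the wheel at the archive root so it sits directly under
            # MWAA_PLUGINS_DIR after the environment update.
            zf.write(wheel, arcname=wheel.name)
            requirement_lines.append(f"{MWAA_PLUGINS_DIR}/{wheel.name}")

    # One absolute wheel path per line, as in step 2.
    req_path.write_text("\n".join(requirement_lines) + "\n")
    return zip_path, req_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder standing in for a real wheel file.
        wheel = Path(tmp) / "example_wheel.whl"
        wheel.write_bytes(b"")
        zip_path, req_path = build_plugins_bundle([wheel], Path(tmp) / "dist")
        print(zipfile.ZipFile(zip_path).namelist())
        print(req_path.read_text().strip())
```

Keeping the wheel at the root of the archive is what makes the path in requirements.txt line up with /usr/local/airflow/plugins/example_wheel.whl.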
Install custom operators, hooks, sensors, or interfaces
Amazon MWAA supports Apache Airflow’s built-in plugin manager, which allows you to use custom Apache Airflow operators, hooks, sensors, or interfaces. You can place these custom plugins in the plugins.zip file using either a flat or a nested directory structure. For some examples of custom plugins, see Examples of custom plugins.
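To illustrate what a flat and a nested layout look like inside the archive, here is a small sketch that zips a local plugins folder while keeping paths relative to that folder. The helper zip_plugins_dir and the file names (my_plugin.py, operators/my_operator.py) are hypothetical, and the module contents are placeholders rather than working plugins.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_plugins_dir(src_dir, zip_path):
    """Zip every file under src_dir, keeping paths relative to src_dir.

    zip_path should live outside src_dir so the archive does not include itself.
    """
    src_dir = Path(src_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():
                zf.write(path, arcname=path.relative_to(src_dir).as_posix())
    return Path(zip_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plugins = Path(tmp) / "plugins"
        # Hypothetical layout: one flat module plus one nested package.
        for rel in ("my_plugin.py", "operators/__init__.py", "operators/my_operator.py"):
            target = plugins / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text("# placeholder module\n")
        archive = zip_plugins_dir(plugins, Path(tmp) / "plugins.zip")
        for name in zipfile.ZipFile(archive).namelist():
            print(name)
```

Because the archive names are relative to the plugins folder, the flat module and the nested package keep the same relative positions they had locally.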
Create a custom plugin to generate runtime environment variables
You can also create a custom plugin that generates environment variables at runtime on your Amazon MWAA environment. Then, you can use these environment variables in your DAG code. For more information, see Creating a custom plugin that generates runtime environment variables.
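As a rough sketch of this idea, a plugin module can set values in os.environ when Apache Airflow imports it, and DAG code can read them later. The variable name EXAMPLE_SERVICE_URL, its value, and the function read_service_url are hypothetical. The fallback class exists only so that the sketch also runs on a machine without Apache Airflow installed; it is not part of Airflow.

```python
import os

try:
    from airflow.plugins_manager import AirflowPlugin
except ImportError:
    # Assumption: stand-in so the sketch runs where Apache Airflow is not installed.
    class AirflowPlugin:
        name = None

# Hypothetical variable name and value, set when the plugin module is imported.
os.environ["EXAMPLE_SERVICE_URL"] = "https://example.com/api"


class EnvVarPlugin(AirflowPlugin):
    """Gives the plugin manager a plugin to register for this module."""

    name = "env_var_plugin"


def read_service_url():
    """What DAG code might do to read the value at run time."""
    return os.environ["EXAMPLE_SERVICE_URL"]


if __name__ == "__main__":
    print(read_service_url())
```

In this sketch the assignment sits at module level, so it takes effect in any process that loads the plugin file.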
Export PEM, .crt, and configuration files
If you don't need specific files to be continuously updated while the environment is running, you can use plugins.zip to ship these files. You can also use plugins.zip for files that users who write DAGs don't need access to (example: certificate (.crt), PEM, and configuration YAML files). After you zip these files into plugins.zip, upload plugins.zip to Amazon S3, and then update the environment. The files are replicated to /usr/local/airflow/plugins with the required access permissions.
You can zip custom CA certificates into the plugins.zip file by running the following command:
$ zip plugins.zip ca-certificates.crt
To zip the kube_config.yaml into the plugins.zip file, run the following command:
$ zip plugins.zip kube_config.yaml
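Once the environment is updated, code that needs one of these files has to locate it under /usr/local/airflow/plugins. The following sketch shows one way to resolve such a path and fail early with a clear message if the file is missing. The helper name shipped_file and its base_dir parameter are assumptions for illustration; the demo points base_dir at a temporary folder so that it runs outside Amazon MWAA.

```python
import tempfile
from pathlib import Path

# Location where the article says plugins.zip contents are replicated.
PLUGINS_DIR = Path("/usr/local/airflow/plugins")


def shipped_file(name, base_dir=PLUGINS_DIR):
    """Return the path of a file shipped in plugins.zip (hypothetical helper)."""
    path = Path(base_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"{name} not found under {base_dir}; check plugins.zip")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Placeholder content standing in for a real kube_config.yaml.
        (Path(tmp) / "kube_config.yaml").write_text("apiVersion: v1\n")
        print(shipped_file("kube_config.yaml", base_dir=tmp).name)
```

On an Amazon MWAA environment, the default base_dir would apply, for example shipped_file("ca-certificates.crt").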
Troubleshoot the installation process
If you have issues during the installation of these packages, you can test your DAGs, custom plugins, and Python dependencies locally using aws-mwaa-local-runner.
To troubleshoot issues with installation of Python packages using the plugins.zip file, you can view the log file (requirements_install_ip) from either the Apache Airflow Worker or Scheduler log groups.
Important: It's a best practice to test the Python dependencies and plugins.zip file using the Amazon MWAA CLI utility (aws-mwaa-local-runner) before installing the packages or plugins.zip file on your Amazon MWAA environment.