Amazon EMR announces support for runtime installation of external libraries with EMR Notebooks

Posted on: Aug 20, 2019

You can now install external Python libraries on EMR clusters at runtime using EMR Notebooks. Previously, to install libraries not packaged with the AMI, you had to use a bootstrap action or a custom AMI before launching the EMR cluster. With this feature, you can import your preferred libraries and use them to build your Spark application, analyze data, and visualize the results from within your notebook. The Python libraries you install using EMR Notebooks are isolated to the notebook session and do not interfere with existing libraries on the EMR cluster. You can install these libraries from either public or private PyPI repositories. To learn more about this feature, see Using Notebook-scoped Libraries.
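As a rough illustration of the workflow described above, the sketch below wraps the notebook-scoped install call in a small helper. It assumes a PySpark kernel in EMR Notebooks where the Spark context `sc` exposes `install_pypi_package`, as described in Using Notebook-scoped Libraries. The helper name `install_notebook_libraries`, its parameters, and the package specs are hypothetical and used only for illustration.

```python
def install_notebook_libraries(sc, packages, repo_url=None):
    """Install each package spec for the current notebook session.

    Assumption: `sc` is the SparkContext of an EMR Notebooks PySpark
    kernel and provides `install_pypi_package(spec[, repo_url])`.
    `repo_url` is an optional PyPI-compatible index, for example a
    private repository; when omitted, the default index is used.
    """
    installed = []
    for spec in packages:
        if repo_url is None:
            sc.install_pypi_package(spec)
        else:
            sc.install_pypi_package(spec, repo_url)
        installed.append(spec)
    return installed


# Example use inside a notebook cell (package specs are illustrative):
# install_notebook_libraries(sc, ["pandas==0.25.1", "matplotlib"])
```

Because the libraries are scoped to the notebook session, a helper like this would need to be rerun in each new session rather than once per cluster.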

This feature is available starting with EMR release 5.26.0.

EMR Notebooks is available in the US East (N. Virginia and Ohio), US West (N. California and Oregon), Canada (Central), EU (Frankfurt, Ireland, and London), and Asia Pacific (Mumbai, Seoul, Singapore, Sydney, and Tokyo) regions.