Posted On: Nov 19, 2018
Today we are announcing the general availability of EMR Notebooks, a managed environment based on Jupyter Notebooks that allows data scientists, analysts, and developers to prepare and visualize data, collaborate with peers, build applications, and perform interactive analysis using EMR clusters.
EMR Notebooks is pre-configured for Spark. It supports Sparkmagic kernels, allowing you to interactively run Spark jobs written in languages such as PySpark, Spark SQL, SparkR, and Scala on EMR clusters. The notebooks come packaged with open-source libraries found in Conda, so you can import these libraries and use them to manipulate data and visualize computational results in rich graphical plots. Further, each notebook has integrated Spark monitoring capabilities that let you monitor the progress of your jobs and debug code directly from the notebook.
You can create multiple notebooks directly from the console. There are no instances or software to manage, and notebooks spin up instantly. You can either attach a notebook to an existing cluster or provision a new cluster directly from the console. You can also attach multiple notebooks to a single cluster, detach notebooks, and re-attach them to new clusters.
EMR Notebooks saves your notebook files periodically to your Amazon S3 buckets. Saved notebooks can be retrieved from the EMR console or downloaded from your S3 bucket.
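Because EMR Notebooks is based on Jupyter, a notebook file downloaded from your S3 bucket can likely be inspected with ordinary JSON tooling. The following is a minimal sketch, assuming the saved file uses the standard Jupyter notebook JSON layout (a top-level "cells" list whose entries carry a "cell_type"); this announcement does not state the file format, and the function name and sample content are hypothetical.

```python
import json


def summarize_notebook(ipynb_text):
    """Count cells by type in a Jupyter-format notebook given as a JSON string."""
    notebook = json.loads(ipynb_text)
    counts = {}
    for cell in notebook.get("cells", []):
        cell_type = cell.get("cell_type", "unknown")
        counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts


# Hypothetical stand-in for the text of a notebook file downloaded from S3.
sample = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Sales analysis"]},
        {"cell_type": "code", "metadata": {}, "source": ["df.show()"],
         "outputs": [], "execution_count": None},
    ],
})

print(summarize_notebook(sample))
```

In practice you would read the downloaded file from disk and pass its contents to the function in place of the sample string.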
To learn more, please visit the EMR Notebooks page.
There is no additional cost for using EMR Notebooks. You only pay for the EMR cluster attached to the notebook. You can find out more about the pricing for your cluster by visiting Amazon EMR pricing.
EMR Notebooks is available in the US East (N. Virginia and Ohio), US West (N. California and Oregon), Canada (Central), EU (Frankfurt, Ireland, and London), and Asia Pacific (Mumbai, Seoul, Singapore, Sydney, and Tokyo) regions.