Amazon SageMaker Unified Studio Notebooks now support EMR Serverless
Amazon SageMaker Unified Studio Notebooks now support Amazon EMR Serverless with Apache Spark Connect, giving data engineers and analysts more flexibility in choosing a Spark runtime for interactive analytics and data engineering workloads. In addition to Amazon Athena Spark, you can now use Amazon EMR Serverless as your Spark runtime and choose the engine that best fits each workload.
With this launch, you can run PySpark and Spark SQL on an EMR Serverless Spark application in Notebook cells. You select the Spark runtime from the Notebook side panel, and the selected runtime applies to both Python and SQL cells. You can also use SageMaker Data Agent, the built-in AI assistant, to generate code and execution plans from natural language prompts, which speeds up Spark development with EMR Serverless. Pre-initialized capacity improves session start times, and unified Spark UI monitoring across all supported engines gives consistent visibility into job execution and performance. For workloads that require network isolation, EMR Serverless supports VPC connectivity.
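As a rough illustration of the pre-initialized capacity and VPC connectivity options mentioned above, the sketch below assembles request parameters in the shape accepted by the EMR Serverless CreateApplication API. It rests on assumptions: this announcement does not describe how SageMaker Unified Studio provisions the application, so the helper name `build_application_request`, the worker sizes, the release label placeholder, and the subnet and security group IDs are all hypothetical.

```python
from typing import Any, Dict, List


def build_application_request(
    name: str,
    release_label: str,
    subnet_ids: List[str],
    security_group_ids: List[str],
    executor_count: int = 2,
) -> Dict[str, Any]:
    """Assemble keyword arguments for the EMR Serverless CreateApplication API.

    Hypothetical helper for illustration; it only builds a dictionary and
    makes no AWS calls.
    """
    if executor_count < 1:
        raise ValueError("executor_count must be at least 1")

    # Assumed worker size; adjust to the workload.
    worker = {"cpu": "4 vCPU", "memory": "16 GB"}

    return {
        "name": name,
        "releaseLabel": release_label,
        "type": "SPARK",
        # Pre-initialized capacity: workers kept warm so sessions start faster.
        "initialCapacity": {
            "Driver": {"workerCount": 1, "workerConfiguration": dict(worker)},
            "Executor": {
                "workerCount": executor_count,
                "workerConfiguration": dict(worker),
            },
        },
        # VPC connectivity for workloads that need network isolation.
        "networkConfiguration": {
            "subnetIds": list(subnet_ids),
            "securityGroupIds": list(security_group_ids),
        },
    }


# All values below are placeholders, not real resources or a verified release.
request = build_application_request(
    name="notebook-spark",
    release_label="emr-7.x.x",
    subnet_ids=["subnet-EXAMPLE"],
    security_group_ids=["sg-EXAMPLE"],
)

# With AWS credentials configured, the request could be submitted like this:
# import boto3
# boto3.client("emr-serverless").create_application(**request)
```

Keeping the request as a plain dictionary makes it easy to review or version the capacity and network settings before anything is created. Check the EMR Serverless documentation for a release label that supports Apache Spark Connect before submitting it.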
This feature is available in all AWS Regions where Amazon SageMaker Unified Studio is available, and it works in both SageMaker Unified Studio notebooks and JupyterLab IDE environments. To get started, see the Amazon SageMaker Unified Studio User Guide.