Posted On: Oct 19, 2023

AWS Glue for Apache Spark now supports native connectivity to Google BigQuery, which enables users to efficiently read data from and write data to BigQuery without installing or managing the BigQuery connector for Apache Spark libraries. Users can now add BigQuery as a source or target within AWS Glue Studio's no-code, drag-and-drop visual interface, or use the connector directly in an AWS Glue ETL job script. Combined with AWS Glue's ETL (Extract, Transform, Load) capabilities, this new connector simplifies the creation of ETL pipelines, saving ETL developers time when building and maintaining data pipelines.

To get started, create a new Google BigQuery connection in the AWS Glue Data Catalog and add a BigQuery source or target to your AWS Glue ETL job. When reading from BigQuery, developers can choose a BigQuery table directly as a source or use BigQuery SQL to define a custom source. When writing to BigQuery, developers can reuse an existing BigQuery connection or create a new one for use as a target. These capabilities enable ETL developers to work with BigQuery and AWS Glue across a variety of scenarios.
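For readers using the connector from a job script rather than AWS Glue Studio, the following is a minimal sketch of what the read and write calls might look like. It is not taken from this announcement: the option keys (`connectionName`, `parentProject`, `sourceType`, `table`, `query`, `viewsEnabled`, `materializationDataset`, `writeMethod`) reflect our reading of the AWS Glue documentation and should be verified there, and every helper function, connection name, project, and table name below is hypothetical. The helpers take an already-created `GlueContext` as an argument, so the option-building logic can be checked outside a Glue job.

```python
from typing import Any, Dict, Optional


def bigquery_read_options(
    connection_name: str,
    parent_project: str,
    table: Optional[str] = None,
    query: Optional[str] = None,
    materialization_dataset: Optional[str] = None,
) -> Dict[str, str]:
    """Build connection options for reading a table or a custom SQL source.

    Option keys are assumptions based on the AWS Glue documentation.
    """
    if (table is None) == (query is None):
        raise ValueError("Provide exactly one of 'table' or 'query'")

    options = {"connectionName": connection_name, "parentProject": parent_project}
    if table is not None:
        # Read a BigQuery table directly as the source.
        options.update({"sourceType": "table", "table": table})
    else:
        # Custom SQL source; query results are assumed to be materialized
        # into a dataset that the caller names.
        if materialization_dataset is None:
            raise ValueError("'materialization_dataset' is required with 'query'")
        options.update(
            {
                "sourceType": "query",
                "query": query,
                "viewsEnabled": "true",
                "materializationDataset": materialization_dataset,
            }
        )
    return options


def bigquery_write_options(
    connection_name: str, parent_project: str, table: str
) -> Dict[str, str]:
    """Build connection options for writing to a BigQuery table."""
    return {
        "connectionName": connection_name,
        "parentProject": parent_project,
        "writeMethod": "direct",  # assumed option value; check the docs
        "table": table,
    }


def read_from_bigquery(glue_context: Any, options: Dict[str, str]) -> Any:
    """Read a DynamicFrame from BigQuery using an existing GlueContext."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="bigquery", connection_options=options
    )


def write_to_bigquery(glue_context: Any, frame: Any, options: Dict[str, str]) -> Any:
    """Write a DynamicFrame to BigQuery using an existing GlueContext."""
    return glue_context.write_dynamic_frame.from_options(
        frame=frame, connection_type="bigquery", connection_options=options
    )
```

Inside a Glue job, one would pass the job's `GlueContext` to `read_from_bigquery` along with, for example, `bigquery_read_options("my-bq-connection", "my-gcp-project", table="sales.orders")`, where the connection name refers to the BigQuery connection created in the Data Catalog.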

This feature is available in all commercial AWS Regions where AWS Glue is available.

To learn more, visit the AWS Glue documentation.