Posted On: Nov 17, 2023

AWS Glue for Apache Spark announces the launch of six new database connectors: Teradata, SAP HANA, Azure SQL, Azure Cosmos DB, Vertica, and MongoDB. These native connectors enable users to efficiently read data from and write data to these systems without installing or managing any connector libraries. Users can add these databases as a source or target within AWS Glue Studio's no-code, drag-and-drop visual interface, or use the connectors directly in an AWS Glue ETL script job.

For Teradata, SAP HANA, Azure SQL, and Vertica, users can specify a single table or enter a custom query to select their data. For MongoDB, they can specify the document collection. For Azure Cosmos DB, they can specify the container and optionally provide a custom query. When authoring visual ETL jobs, they can preview their source dataset to find the right data faster. Users can also use these databases as targets in their ETL pipelines to write the output from the transformation steps.
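For script jobs, the choice between a single table and a custom query can be sketched as a small helper that builds the connection options. This is a minimal illustration, not the connector's definitive usage. The helper names `build_source_options` and `read_source` are hypothetical, and the `"teradata"` connection type and the `connectionName`, `dbtable`, and `query` option keys are assumptions that should be checked against the AWS Glue documentation for each connector, since some connectors (Vertica, MongoDB, and Azure Cosmos DB, for example) take different options.

```python
from typing import Any, Dict, Optional


def build_source_options(
    connection_name: str,
    table: Optional[str] = None,
    query: Optional[str] = None,
) -> Dict[str, str]:
    """Build connection options for reading either one table or a custom query.

    Hypothetical helper. The option keys used here are assumptions;
    confirm them in the AWS Glue documentation for the target connector.
    """
    # Exactly one of table or query must be given.
    if (table is None) == (query is None):
        raise ValueError("Specify exactly one of 'table' or 'query'.")

    options = {"connectionName": connection_name}
    if table is not None:
        options["dbtable"] = table
    else:
        options["query"] = query
    return options


def read_source(glue_context: Any, connection_type: str, options: Dict[str, str]) -> Any:
    """Read a DynamicFrame from the source described by the options.

    glue_context is expected to be an awsglue.context.GlueContext, which is
    only available inside an AWS Glue job, so it is passed in rather than
    imported here.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type=connection_type,  # assumed value, e.g. "teradata"
        connection_options=options,
    )
```

Inside a Glue job, a call such as `read_source(glueContext, "teradata", build_source_options("my-teradata-connection", table="sales.orders"))` would then read the table through the named Glue connection. The connection and table names here are placeholders.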

These capabilities enable ETL developers to work with AWS Glue and the supported databases across a variety of data scenarios within a single interface. To get started, create a new connection within AWS Glue to your desired database and add it as a source or target in your AWS Glue ETL job.

This feature is available in all commercial AWS Regions where AWS Glue is available.

To learn more, visit the AWS Glue documentation.