Posted On: May 25, 2023

We’re pleased to announce the launch of AWS Glue 4.0 in the AWS GovCloud (US-West) Region. AWS Glue 4.0 is a new version of AWS Glue that accelerates data integration workloads in AWS. It upgrades the engines to Apache Spark 3.3.0 and Python 3.10, giving customers recent Spark and Python releases so they can develop, run, and scale their data integration workloads and get insights faster.

AWS Glue is a serverless, scalable data integration service that makes it simple to discover, prepare, move, and integrate data from multiple sources. AWS Glue 4.0 adds support for built-in Pandas APIs as well as support for the data lake frameworks Apache Hudi, Apache Iceberg, and Delta Lake, giving you more options for analyzing and storing your data. It upgrades connectors for native AWS Glue database sources such as RDS, MySQL, and SQL Server, which simplifies connections to common database sources. AWS Glue 4.0 also adds native support for the new Cloud Shuffle Storage Plugin for Apache Spark, which helps customers scale their disk usage during runtime. It enables Adaptive Query Execution, which dynamically optimizes your queries as they run. Finally, AWS Glue 4.0 improves the developer experience by adding more context to error messages. As with AWS Glue 3.0, customers only pay for the resources they use.
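
As a rough illustration of how a job might opt in to the version and data lake frameworks described above, the Python sketch below assembles the arguments for a Glue 4.0 Spark job. The helper name `build_glue4_job_request`, the job name, the role ARN, the S3 script path, and the worker settings are all placeholders assumed for this example and do not come from the announcement. The resulting dictionary is intended to be passed to the boto3 Glue client's `create_job` call; treat it as a starting point, not a definitive configuration.

```python
def build_glue4_job_request(name, role_arn, script_location,
                            datalake_formats=("hudi", "iceberg", "delta")):
    """Assemble keyword arguments for a Glue 4.0 Spark job.

    Hypothetical helper for illustration only; every value passed in
    is a placeholder chosen by the caller.
    """
    return {
        "Name": name,
        "Role": role_arn,
        # Selects the Glue 4.0 runtime (Spark 3.3.0, Python 3.10).
        "GlueVersion": "4.0",
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Job parameter listing the data lake frameworks to load.
            "--datalake-formats": ",".join(datalake_formats),
        },
        # Assumed sizing; adjust to the workload.
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
    }


# Placeholder values; replace with real resources before use.
request = build_glue4_job_request(
    name="example-glue4-job",
    role_arn="arn:aws-us-gov:iam::123456789012:role/ExampleGlueRole",
    script_location="s3://example-bucket/scripts/job.py",
)
```

With credentials configured, the request could then be submitted with `boto3.client("glue", region_name="us-gov-west-1").create_job(**request)`. Building the dictionary separately keeps the sketch runnable without an AWS account.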

To learn more, visit our documentation.