AWS Glue Streaming ETL jobs support schema detection and evolution

Posted on: Oct 8, 2020

Streaming extract, transform, and load (ETL) jobs in AWS Glue can now automatically detect the schema of incoming records and gracefully handle schema changes on a per-record basis. Previously, you needed to specify the schema of incoming data using the AWS Glue Data Catalog and update ETL scripts to handle schema changes. The AWS Glue job can now do both for you, saving time on reworking code and increasing the flexibility of your ETL jobs.

AWS Glue streaming ETL jobs continuously consume data from streaming sources, clean and transform the data in flight, and make it available for analysis in seconds. Automatic schema detection makes it easy to process data that may not have a static schema, such as IoT logs, without losing data. It also allows you to update output tables in the AWS Glue Data Catalog directly from the job as the schema of your streaming data evolves.
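To make the idea of per-record schema detection and evolution concrete, here is a minimal, conceptual sketch in plain Python. It is not AWS Glue's implementation and does not use the AWS Glue API. The function names (`infer_schema`, `merge_schemas`, `evolve`), the sample IoT records, and the rule of widening conflicting types to a string are all assumptions made for illustration.

```python
from typing import Any, Dict, Iterable

# A schema here is simply a mapping of column name -> type name.
Schema = Dict[str, str]


def infer_schema(record: Dict[str, Any]) -> Schema:
    """Detect the schema of a single record from its values."""
    return {name: type(value).__name__ for name, value in record.items()}


def merge_schemas(table: Schema, record: Schema) -> Schema:
    """Evolve the table schema so that it can hold the new record."""
    merged = dict(table)
    for name, type_name in record.items():
        if name not in merged:
            # New column: add it instead of dropping the data.
            merged[name] = type_name
        elif merged[name] != type_name:
            # Conflicting types: this sketch widens to a string
            # (an assumption, not AWS Glue's documented behavior).
            merged[name] = "str"
    return merged


def evolve(records: Iterable[Dict[str, Any]]) -> Schema:
    """Fold a stream of records into one evolving table schema."""
    schema: Schema = {}
    for record in records:
        schema = merge_schemas(schema, infer_schema(record))
    return schema


# Hypothetical IoT records whose shape changes over time.
iot_records = [
    {"device_id": "sensor-1", "temperature": 21.5},
    {"device_id": "sensor-2", "temperature": 19.0, "humidity": 40},
    {"device_id": "sensor-3", "temperature": "n/a"},
]

table_schema = evolve(iot_records)
print(table_schema)
```

In this sketch, the second record adds a `humidity` column and the third record changes the type of `temperature`, yet no record is rejected. The output table schema is adjusted as each record arrives, which mirrors the behavior described above.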

Automatic schema detection is available in all AWS Regions where AWS Glue is available.

To learn more, read about Streaming ETL in AWS Glue in our documentation.