AWS Glue Documentation

AWS Glue is a scalable, serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.

What is AWS Glue?
Learn about AWS Glue features, how to get started, and how to access AWS Glue.
What is the AWS Glue Data Catalog?
Learn about the AWS Glue Data Catalog, which is your persistent metadata store.

Discover and organize data

Get started with the AWS Glue Data Catalog
Use this tutorial to create your first AWS Glue Data Catalog with an Amazon S3 bucket as the data source.
Populate your Data Catalog with crawlers
Populate the AWS Glue Data Catalog with metadata tables from data stores that you define.
Add connections to your Data Catalog
Use AWS Glue connections to access certain types of data stores.
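
As a rough illustration of populating the Data Catalog with a crawler, the sketch below builds the request parameters for the boto3 Glue client's `create_crawler` call. The crawler name, IAM role ARN, database name, and S3 path are hypothetical placeholders, and the helper `build_crawler_request` is an illustrative function, not part of any AWS library.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 Glue client's create_crawler call.

    This is a hypothetical helper for illustration only.
    """
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # A crawler can have several targets; this sketch uses one S3 path.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


# All values below are placeholders, not real resources.
request = build_crawler_request(
    name="example-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="example_db",
    s3_path="s3://example-bucket/raw/",
)

# To run this against a real account (requires boto3 and AWS credentials):
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_crawler(**request)
#   glue.start_crawler(Name=request["Name"])
```

Keeping the request as a plain dictionary makes it possible to inspect or validate the parameters before any call reaches AWS.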

Author and run data integration jobs

Create AWS Glue jobs visually with AWS Glue Studio
Create jobs through AWS Glue Studio, a graphical interface for creating, running, and monitoring data integration jobs.
Write an AWS Glue ETL script
Follow this tutorial to write an AWS Glue extract, transform, and load (ETL) script and learn how scripts are used when building AWS Glue jobs.
Create AWS Glue jobs with notebooks
Author jobs interactively in AWS Glue Studio using a notebook interface based on Jupyter notebooks.
Develop AWS Glue jobs locally with interactive sessions
Programmatically build and test scripts for data preparation using interactive sessions.

Automate and monitor data integration pipelines

Automate with event-based triggers
Start crawlers or AWS Glue jobs with event-based triggers. You can also design a chain of dependent jobs and crawlers.
Run and monitor your jobs
Run your AWS Glue jobs, and then monitor them with automated monitoring tools, the Apache Spark UI, AWS Glue job run insights, and AWS CloudTrail.
Automate with workflows
Define workflows for ETL and integration activities for multiple crawlers, jobs, and triggers.
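
To make the idea of chaining dependent jobs more concrete, the sketch below builds parameters for the boto3 Glue client's `create_trigger` call, using a conditional trigger that starts one job after another succeeds. The trigger and job names are hypothetical, and `build_conditional_trigger` is an illustrative helper rather than an AWS-provided function.

```python
from typing import Any, Dict


def build_conditional_trigger(
    name: str, upstream_job: str, downstream_job: str
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 Glue client's create_trigger call.

    The resulting trigger starts downstream_job once upstream_job succeeds.
    This is a hypothetical helper for illustration only.
    """
    return {
        "Name": name,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "JobName": upstream_job,
                    "State": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": downstream_job}],
        # Activate the trigger as soon as it is created.
        "StartOnCreation": True,
    }


# Job and trigger names are placeholders, not real resources.
trigger = build_conditional_trigger(
    name="example-chain-trigger",
    upstream_job="example-extract-job",
    downstream_job="example-transform-job",
)

# To create the trigger in a real account (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("glue").create_trigger(**trigger)
```

A crawler can be used in place of a job on either side by supplying `CrawlerName` (and `CrawlState` in the condition) instead of `JobName` and `State`.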

User guides

AWS Glue Studio User Guide
Describes how to use the AWS Glue Studio console and the visual job editor interface to build and monitor ETL jobs.
AWS Glue Developer Guide
Provides a conceptual overview of AWS Glue, detailed instructions for using the various features, and a complete API reference for developers.
AWS Glue section of the AWS CLI Reference
Describes the AWS CLI commands that you can use with AWS Glue.
AWS Glue DataBrew Developer Guide
Describes how to prepare data visually with ready-made data transformations for analytics and machine learning. Also provides an API reference complete with instructions, syntax, and examples.

