AWS Glue Documentation

AWS Glue is a scalable, serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.
  • Learn about AWS Glue features, how to get started, and how to access AWS Glue.
  • Learn about the AWS Glue Data Catalog, which is your persistent metadata store.

Discover and organize data

  • Use this tutorial to create your first AWS Glue Data Catalog, which uses an Amazon S3 bucket as your data source.
  • Populate the AWS Glue Data Catalog with metadata tables from data stores that you define.
  • Use AWS Glue connections to access certain types of data stores.
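
As a rough illustration of populating the Data Catalog from an Amazon S3 data source, the sketch below assembles the arguments for a crawler and, optionally, submits them through the boto3 Glue client. The crawler name, IAM role ARN, database name, and bucket path are placeholders invented for this example, and the helper functions are hypothetical; only `create_crawler` and `start_crawler` are boto3 calls. Treat it as one possible starting point, not the tutorial's own code.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Return keyword arguments for the Glue CreateCrawler call.

    All argument values are supplied by the caller; nothing here
    contacts AWS.
    """
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # A single S3 location that the crawler scans for tables.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request):
    """Create the crawler, then start it (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Placeholder values, assumed for illustration only.
request = build_crawler_request(
    name="example-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="example_db",
    s3_path="s3://example-bucket/raw/",
)
```

Calling `create_and_start_crawler(request)` would submit the request; it is left uncalled here because it requires an AWS account and a role with permission to read the bucket.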

Author and run data integration jobs

  • Create jobs through AWS Glue Studio, a graphical interface that makes it easy to create, run, and monitor integration jobs.
  • Write an AWS Glue extract, transform, and load (ETL) script through this tutorial to understand how to use scripts when you're building AWS Glue jobs.
  • Author interactive jobs in a notebook interface based on Jupyter notebooks in AWS Glue Studio.
  • Programmatically build and test scripts for data preparation using interactive sessions.

Automate and monitor data integration pipelines

  • Start crawlers or AWS Glue jobs with event-based triggers. You can also design a chain of dependent jobs and crawlers.
  • Run your AWS Glue jobs, and then monitor them with automated monitoring tools, the Apache Spark UI, AWS Glue job run insights, and AWS CloudTrail.
  • Define workflows for ETL and integration activities for multiple crawlers, jobs, and triggers.
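
To make the idea of a chain of dependent crawlers and jobs more concrete, the sketch below builds the arguments for a conditional trigger that starts a job once a crawler has succeeded. A conditional trigger is assumed here as one way to express such a dependency. The trigger, crawler, and job names are placeholders, and the helper functions are hypothetical; `create_trigger` is the only boto3 call used.

```python
def build_conditional_trigger(name, upstream_crawler, downstream_job):
    """Return keyword arguments for the Glue CreateTrigger call.

    The trigger fires when the upstream crawler finishes successfully
    and then starts the downstream job.
    """
    return {
        "Name": name,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "CrawlerName": upstream_crawler,
                    "CrawlState": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": downstream_job}],
        # Activate the trigger as soon as it is created.
        "StartOnCreation": True,
    }


def create_trigger(request):
    """Submit the trigger definition (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above runs without boto3

    boto3.client("glue").create_trigger(**request)


# Placeholder names, assumed for illustration only.
trigger = build_conditional_trigger(
    name="run-etl-after-crawl",
    upstream_crawler="example-crawler",
    downstream_job="example-etl-job",
)
```

Longer chains could be expressed by adding further triggers, each conditioned on the job or crawler that precedes it.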

User guides

  • Describes how to use the AWS Glue Studio console and the visual job editor interface to build and monitor ETL jobs.
  • Provides a conceptual overview of AWS Glue, detailed instructions for using the various features, and a complete API reference for developers.
  • Describes the AWS CLI commands that you can use with AWS Glue.
  • Describes how to prepare data visually with ready-made data transformations for analytics and machine learning. Also provides an API reference complete with instructions, syntax, and examples.
© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.