Why AWS Glue?

Preparing your data to obtain quality results is the first step in an analytics or AI project. AWS Glue is a serverless service that makes data integration simpler, faster, and cheaper. You can discover and connect to more than 100 diverse data sources, manage your data in a centralized data catalog, and visually create, run, and monitor data pipelines to load data into your data lakes, data warehouses, and lakehouses. With built-in generative AI capabilities, you can modernize Apache Spark jobs and develop faster with intelligent assistance for ETL authoring and Spark troubleshooting.

Integrate your data with AWS Glue in the next generation of Amazon SageMaker

With AWS Glue in the next generation of Amazon SageMaker, you can manage and build your workloads in one place with cost-effective, serverless, and scalable data integration.

Use Cases

Simplify ETL pipeline management

Remove infrastructure management with automatic provisioning and worker management, and consolidate all your data integration needs into a single service.
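As a rough illustration of what worker management looks like from the caller's side, the sketch below builds the request for a Spark ETL job, where you state a worker type and count and the service provisions the capacity. It assumes the boto3 Glue `create_job` parameters; the helper name `build_glue_job_request`, the job name, role ARN, script path, and default values are hypothetical placeholders, not recommendations.

```python
from typing import Any, Dict


def build_glue_job_request(
    name: str,
    role_arn: str,
    script_location: str,
    worker_type: str = "G.1X",
    number_of_workers: int = 2,
) -> Dict[str, Any]:
    """Build keyword arguments for a Glue create_job call (hypothetical helper)."""
    if number_of_workers < 1:
        raise ValueError("number_of_workers must be positive")
    return {
        "Name": name,
        "Role": role_arn,
        # "glueetl" selects a Spark ETL job; the script is read from Amazon S3.
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",  # assumed version; pick one your account supports
        # Capacity is declared here; no cluster is created or managed by the caller.
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


request = build_glue_job_request(
    name="demo-job",  # placeholder
    role_arn="arn:aws:iam::123456789012:role/DemoGlueRole",  # placeholder
    script_location="s3://example-bucket/scripts/demo.py",  # placeholder
)
# With AWS credentials configured, the request could be submitted like this:
#   import boto3
#   boto3.client("glue").create_job(**request)
```

Keeping the request as a plain dictionary makes it easy to review or version before anything is sent to AWS.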

Interactively explore, experiment on, and process data

With AWS Glue interactive sessions, data engineers can explore and prepare data interactively in the integrated development environment (IDE) or notebook of their choice.

Discover data efficiently

Quickly identify data across AWS, on premises, and other clouds, and then make it instantly available for querying and transforming.
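Once data has been registered in the data catalog, its tables can be listed programmatically. The following is a minimal sketch, assuming a boto3 Glue client and its `get_tables` paginator; the function name `iter_catalog_tables` and the database name `sales` are hypothetical.

```python
from typing import Any, Dict, Iterator


def iter_catalog_tables(glue_client: Any, database: str) -> Iterator[Dict[str, Any]]:
    """Yield the name, storage location, and column names of each catalog table."""
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page.get("TableList", []):
            # Some tables carry no storage descriptor, so fall back to empty values.
            storage = table.get("StorageDescriptor", {})
            yield {
                "name": table["Name"],
                "location": storage.get("Location"),
                "columns": [col["Name"] for col in storage.get("Columns", [])],
            }


# With AWS credentials configured, usage could look like this:
#   import boto3
#   for info in iter_catalog_tables(boto3.client("glue"), "sales"):
#       print(info["name"], info["location"])
```

Passing the client in as an argument keeps the function easy to exercise with a stub, without network access.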

Support various processing frameworks and workloads

Support different data integration approaches, such as ETL and ELT, and different workloads, including batch, micro-batch, and streaming.
