AWS Glue DataBrew

Clean and normalize data up to 80% faster

Introducing AWS Glue DataBrew

AWS Glue DataBrew is a visual data preparation tool that makes it easier for data analysts and data scientists to clean and normalize data to prepare it for analytics and machine learning (ML). You can choose from over 250 prebuilt transformations to automate data preparation tasks, all without the need to write any code. For example, you can automate filtering anomalies, converting data to standard formats, and correcting invalid values. After your data is ready, you can immediately use it for analytics and ML projects. You pay only for what you use, with no upfront commitment.

Capabilities

Profile

Evaluate the quality of your data by profiling it to understand data patterns and detect anomalies; connect data directly from your data lake, data warehouses, and databases.

Clean and normalize

Choose from over 250 built-in transformations to visualize, clean, and normalize your data with an interactive point-and-click visual interface.

Map data lineage

Visually map the lineage of your data to understand the various data sources and transformation steps that the data has been through.

Automate

Automate data cleaning and normalization tasks by applying saved transformations directly to new data as it comes into your source system.

AWS Glue DataBrew provides a visual interface that enables both our technical and nontechnical users to analyze data quickly and easily. Its advanced data profiling capability helps us better understand our data and monitor the data quality.

Takashi Ito

General Manager of Marketing Platform Planning Department, NTT DOCOMO

AWS Glue DataBrew will allow our data analysts to visually inspect large datasets, clean and enrich data, and perform advanced transformations. It will empower our analysts and data scientists to perform advanced data engineering activities, giving them the freedom to explore their data while decreasing the time to derive new insights.

Tanner Gonzalez

Analytics and Cloud leader, INVISTA

AWS Glue DataBrew has sophisticated data profiling capabilities and a rich set of built-in transformations. This will enable our data engineers to easily explore new datasets in a visual interface, and allow analysts to shape the data for their analytics solutions.

John Maio

Director, Data & Analytics Platforms Architecture, bp