Short description
-----



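To filter partitions in the AWS Glue Data Catalog, use a [pushdown predicate](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-partitions.html#aws-glue-programming-etl-partitions-pushdowns). Unlike the **Filter** transform, a pushdown predicate lets you filter on partitions without listing and reading all the files in your dataset.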



Resolution
-----



[Create an AWS Glue job](https://docs.aws.amazon.com/glue/latest/dg/console-jobs.html) and specify the pushdown predicate in the **DynamicFrame**. In the following example, the job processes only the data in the **s3://awsexamplebucket/product\_category=Video** partition:





```python
# glueContext is the GlueContext created earlier in the job script.
datasource0 = glueContext.create_dynamic_frame.from_catalog(
    database="testdata",
    table_name="sampletable",
    transformation_ctx="datasource0",
    push_down_predicate="(product_category == 'Video')",
)
```



In the following example, the pushdown predicate filters by date. The job processes only the data in the **s3://awsexamplebucket/year=2019/month=08/day=02** partition:





```python
# glueContext is the GlueContext created earlier in the job script.
datasource0 = glueContext.create_dynamic_frame.from_catalog(
    database="testdata",
    table_name="sampletable",
    transformation_ctx="datasource0",
    push_down_predicate="(year == '2019' and month == '08' and day == '02')",
)
```
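If the job runs for a different date each time, the predicate string can be generated instead of hard-coded. The following is a minimal sketch, not part of the AWS Glue API: `date_predicate` is a hypothetical helper, and it assumes that the partition columns are named `year`, `month`, and `day` and store zero-padded string values, as in the preceding example.

```python
from datetime import date


def date_predicate(day: date) -> str:
    """Build a pushdown predicate for year/month/day partition columns.

    Hypothetical helper: assumes the partition values are zero-padded
    strings such as '2019', '08', and '02'.
    """
    return (
        f"(year == '{day.year:04d}' "
        f"and month == '{day.month:02d}' "
        f"and day == '{day.day:02d}')"
    )


# Produces the same predicate string as the example above.
predicate = date_predicate(date(2019, 8, 2))
print(predicate)
```

The resulting string can then be passed as the `push_down_predicate` argument of `create_dynamic_frame.from_catalog`.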



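In the following example, the pushdown predicate filters by date for partitions that are not Hive-style. The job processes only the data in the **s3://awsexamplebucket/2019/07/03** partition: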





```python
# glueContext is the GlueContext created earlier in the job script.
datasource0 = glueContext.create_dynamic_frame.from_catalog(
    database="testdata",
    table_name="sampletable",
    transformation_ctx="datasource0",
    push_down_predicate="(partition_0 == '2019' and partition_1 == '07' and partition_2 == '03')",
)
```
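For partitions that are not Hive-style, the predicate follows the order of the folders in the S3 path. The following sketch shows one way to derive it from a path prefix. `positional_predicate` is a hypothetical helper, not an AWS Glue function, and it assumes that the table's partition columns are named `partition_0`, `partition_1`, and so on, as in the preceding example, and that the values contain no single quotes.

```python
def positional_predicate(prefix: str) -> str:
    """Build a pushdown predicate from a path prefix such as '2019/07/03'.

    Hypothetical helper: assumes partition columns are named partition_0,
    partition_1, ... in the same order as the folders in the prefix.
    """
    # Split the prefix into folder names, ignoring empty segments.
    parts = [part for part in prefix.strip("/").split("/") if part]
    if not parts:
        raise ValueError("prefix must contain at least one folder name")
    clauses = [
        f"partition_{index} == '{value}'" for index, value in enumerate(parts)
    ]
    return "(" + " and ".join(clauses) + ")"


# Produces the same predicate string as the example above.
predicate = positional_predicate("2019/07/03")
print(predicate)
```

Check the actual partition column names in the Data Catalog table before relying on this naming scheme.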




---








I want to run an AWS Glue job on a specific partition in an Amazon Simple Storage Service (Amazon S3) location.

Run an AWS Glue job on a specific Amazon S3 partition

How do I run an AWS Glue job on a specific partition in Amazon S3?
