How do I configure a Data Pipeline precondition for Amazon S3?
Last updated: 2022-11-10
I want to use an AWS Data Pipeline precondition to stop a running pipeline when a specific Amazon Simple Storage Service (Amazon S3) bucket doesn't exist.
Create an S3PrefixNotEmpty precondition for the S3DataNode object. S3PrefixNotEmpty is a system-managed precondition that checks whether Amazon S3 objects with the specified prefix exist. Unless you specify a timeout (preconditionTimeout), system-managed preconditions keep running until their condition is true.
Note: On active pipelines, preconditions incur additional charges. For more information, see AWS Data Pipeline pricing.
You can't configure preconditions while a pipeline is active. To add a precondition to an active pipeline, clone the pipeline and then complete the following steps:
- Open the AWS Data Pipeline console.
- Select a deactivated pipeline, and then choose Actions, Edit.
- Choose an existing S3DataNode object. Or, create a new one by choosing the Add dropdown list, and then choosing S3DataNode.
- In the Add optional field dropdown list for the S3DataNode object, choose Precondition.
- In the Precondition dropdown list, choose Create new: Precondition.
- Open the Precondition section, and then find the precondition object that you just created.
- In the Type dropdown list, choose S3PrefixNotEmpty.
- In the S3 Prefix field, enter an Amazon S3 path. For an example of how this looks in the pipeline definition JSON, see S3PrefixNotEmpty.
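As a rough sketch, the console steps above correspond to a pipeline definition fragment like the following. The object IDs (MyS3Input, MyS3InputReady), the bucket path, and the role name are placeholders assumed for illustration. JSON doesn't allow comments, so they are listed here rather than inline. A complete definition also needs other objects, such as a schedule and an activity, that are omitted here.

```json
{
  "objects": [
    {
      "id": "MyS3Input",
      "name": "MyS3Input",
      "type": "S3DataNode",
      "directoryPath": "s3://amzn-s3-demo-bucket/input/",
      "precondition": { "ref": "MyS3InputReady" }
    },
    {
      "id": "MyS3InputReady",
      "name": "MyS3InputReady",
      "type": "S3PrefixNotEmpty",
      "role": "DataPipelineDefaultRole",
      "s3Prefix": "s3://amzn-s3-demo-bucket/input/"
    }
  ]
}
```

The precondition field on the data node refers to the precondition object by its id, so the two values must match.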
To specify a timeout, do the following:
- In the Add optional field dropdown list for the S3PrefixNotEmpty object, choose Precondition Timeout.
- Enter the timeout as a period (for example, 1 hour).
If the preconditionTimeout period elapses before the precondition is satisfied, the precondition is marked as failed and the dependent nodes fail with the status CASCADE_FAILED.
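In the pipeline definition JSON, a precondition object with a timeout might look like the following sketch. The object ID, bucket path, role name, and the 1 hour value are placeholders assumed for illustration, not recommended settings.

```json
{
  "id": "MyS3InputReady",
  "name": "MyS3InputReady",
  "type": "S3PrefixNotEmpty",
  "role": "DataPipelineDefaultRole",
  "s3Prefix": "s3://amzn-s3-demo-bucket/input/",
  "preconditionTimeout": "1 hour"
}
```

With a value like this, the precondition would stop waiting about an hour after it starts if no objects appear under the prefix.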