AWS Data Pipeline uses m3.xlarge Amazon Elastic Compute Cloud (Amazon EC2) instances even when I specify a different instance type in the EmrCluster object. How can I prevent my instance type choice from being overridden?

Data Pipeline overrides your instance type choices with m3.xlarge when both of the following are true:

  • EmrActivity uses DynamoDBDataNode as either an input or output data node.
  • resizeClusterBeforeRunning is set to true.

Data Pipeline can also override the instanceCount value that you specify, which could increase your monthly costs.
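
To check whether a pipeline meets both conditions, you can scan its definition. The following is a minimal sketch, assuming the definition is in the JSON pipeline definition file format (a top-level "objects" list whose entries point to other objects with "ref"). The function name and the object IDs in the sample are hypothetical. The sketch flags only activities where resizeClusterBeforeRunning is explicitly set to true.

```python
def find_overridden_activities(definition):
    """Return IDs of EmrActivity objects that meet both override conditions."""
    objects = definition.get("objects", [])
    # IDs of every DynamoDBDataNode defined in the pipeline
    ddb_ids = {o["id"] for o in objects if o.get("type") == "DynamoDBDataNode"}

    def refs(value):
        # A field can hold a single {"ref": ...} or a list of them
        items = value if isinstance(value, list) else [value]
        return {i["ref"] for i in items if isinstance(i, dict) and "ref" in i}

    hits = []
    for o in objects:
        if o.get("type") != "EmrActivity":
            continue
        # Assumption: a missing field is treated as not set to true
        resize = str(o.get("resizeClusterBeforeRunning", "false")).lower() == "true"
        nodes = refs(o.get("input", [])) | refs(o.get("output", []))
        if resize and nodes & ddb_ids:
            hits.append(o["id"])
    return hits


# Hypothetical sample definition, for illustration only
sample = {
    "objects": [
        {"id": "DDBInput", "type": "DynamoDBDataNode"},
        {"id": "S3Out", "type": "S3DataNode"},
        {
            "id": "ExportActivity",
            "type": "EmrActivity",
            "input": {"ref": "DDBInput"},
            "output": {"ref": "S3Out"},
            "resizeClusterBeforeRunning": "true",
        },
    ]
}

print(find_overridden_activities(sample))  # prints ['ExportActivity']
```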

To prevent the override, set the resizeClusterBeforeRunning parameter on the EmrActivity object to false:

  1. Open the Data Pipeline console.
  2. On the List Pipelines page, choose the ID of your pipeline, and then choose Edit Pipeline to open the Architect page.
  3. In the right pane, choose Activities, and then find the EmrActivity object.
  4. Set Resize Cluster Before Running to false.
  5. Choose Save.
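
If you keep the pipeline definition as a JSON file instead of editing it in the console, you can make the same change in the file. The following is a rough sketch, assuming the pipeline definition file format with a top-level "objects" list. The function name and the sample object ID are hypothetical.

```python
import copy


def disable_resize(definition):
    """Return a copy of the definition with resizing turned off on every EmrActivity."""
    updated = copy.deepcopy(definition)  # leave the caller's definition untouched
    for obj in updated.get("objects", []):
        if obj.get("type") == "EmrActivity":
            # Field values in definition files are strings, so use "false"
            obj["resizeClusterBeforeRunning"] = "false"
    return updated


# Hypothetical sample definition, for illustration only
original = {
    "objects": [
        {
            "id": "ExportActivity",
            "type": "EmrActivity",
            "resizeClusterBeforeRunning": "true",
        }
    ]
}

fixed = disable_resize(original)
print(fixed["objects"][0]["resizeClusterBeforeRunning"])  # prints false
```

One way to apply the result is to write it back to a file and upload it with the AWS CLI command aws datapipeline put-pipeline-definition. Verify the uploaded definition before you activate the pipeline.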

When you activate your pipeline, Data Pipeline uses the instance type and count that you specified in the EmrCluster object.

Published: 2019-02-14