How do I resize an Amazon Redshift cluster?
Last updated: 2021-01-06
I want to resize an Amazon Redshift cluster. How does that impact performance and billing?
There are three ways to resize an Amazon Redshift cluster:
- Elastic resize: If it's available as an option, use elastic resize to change the node type, number of nodes, or both. When you change only the number of nodes, queries are temporarily paused and connections are kept open. An elastic resize can take 10 to 15 minutes. During the resize operation, the cluster is read-only.
- Classic resize: Use classic resize to change the node type, number of nodes, or both. Choose this option when you are resizing to a configuration that isn't available through elastic resize. A resize operation can take two hours or more, or last up to several days depending on your data size. During the resize operation, the source cluster is read-only.
- Snapshot, restore, and resize: To keep your cluster available during a classic resize, make a copy of the existing cluster by taking a snapshot and restoring it. Then, resize the new cluster. If data is written to the source cluster after the snapshot is taken, that data must be manually copied over after the migration completes.
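The choice between elastic and classic resize can also be made programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (boto3), whose Redshift client provides a `resize_cluster` operation with a `Classic` flag. The helper names `build_resize_request` and `request_resize` and the cluster identifier `example-cluster` are hypothetical and used only for illustration.

```python
from typing import Any, Dict, Optional


def build_resize_request(
    cluster_id: str,
    number_of_nodes: int,
    node_type: Optional[str] = None,
    classic: bool = False,
) -> Dict[str, Any]:
    """Build the keyword arguments for a Redshift ResizeCluster call.

    classic=False requests an elastic resize; classic=True requests a
    classic resize. node_type is only included when it should change.
    """
    if number_of_nodes < 1:
        raise ValueError("number_of_nodes must be at least 1")
    params: Dict[str, Any] = {
        "ClusterIdentifier": cluster_id,
        "NumberOfNodes": number_of_nodes,
        "Classic": classic,
    }
    if node_type is not None:
        params["NodeType"] = node_type
    return params


def request_resize(client: Any, cluster_id: str, number_of_nodes: int,
                   node_type: Optional[str] = None,
                   classic: bool = False) -> Any:
    """Send the resize request through a Redshift client.

    `client` is assumed to be a boto3 Redshift client, for example
    boto3.client("redshift"). It is passed in so the sketch can be
    exercised without AWS credentials.
    """
    params = build_resize_request(cluster_id, number_of_nodes,
                                  node_type, classic)
    return client.resize_cluster(**params)
```

For example, `request_resize(boto3.client("redshift"), "example-cluster", 4)` would ask for an elastic resize to four nodes, and passing `classic=True` would ask for a classic resize instead. Whether a given target configuration is available through elastic resize depends on the cluster, so treat this as a starting point rather than a complete workflow.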
Resize operation speed
When a cluster is resized using elastic resize with the same node type, the operation doesn't create a new cluster. As a result, the operation completes quickly. The time required to complete a classic resize or a snapshot and restore operation might vary, depending on the following factors:
- The workload on the source cluster.
- The number and size of the tables being transferred.
- How evenly data is distributed across the compute nodes and slices.
- The node configuration in the source and target clusters.
To reduce the time required for a classic resize or a snapshot and restore operation:
- Run the table inspector script from AWS Labs to identify skewed tables. To fix skewed tables, choose an appropriate distribution key. For more information, see Amazon Redshift Engineering’s advanced table design playbook: Distribution styles and distribution keys.
- Remove unused tables. To identify unused tables, run the unscanned table summary script from AWS Labs.
Note: The unscanned table summary shows only recent history (the previous two to five days). Use the System object persistence utility to capture usage data over a longer period.
- Increase the speed of resize operations. For more information, see Top 10 performance tuning techniques for Amazon Redshift.
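To make the idea of a skewed table concrete, the following minimal sketch computes a row skew ratio: the number of rows in the fullest slice divided by the number of rows in the emptiest slice. This is an illustration only, not the table inspector script itself. The function names and the default threshold of 4.0 are assumptions chosen for the example, and the per-slice row counts would have to come from your own cluster.

```python
from typing import Dict, List, Sequence


def row_skew(rows_per_slice: Sequence[int]) -> float:
    """Return max rows per slice divided by min rows per slice.

    A value of 1.0 means the table is spread evenly across slices.
    Returns infinity when at least one slice holds no rows.
    """
    if not rows_per_slice:
        raise ValueError("rows_per_slice must not be empty")
    smallest = min(rows_per_slice)
    if smallest == 0:
        return float("inf")
    return max(rows_per_slice) / smallest


def skewed_tables(tables: Dict[str, Sequence[int]],
                  threshold: float = 4.0) -> List[str]:
    """Return the names of tables whose row skew exceeds the threshold.

    The threshold is an arbitrary illustrative value, not an AWS
    recommendation.
    """
    return sorted(
        name for name, counts in tables.items()
        if row_skew(counts) > threshold
    )
```

A table whose slices hold 400, 50, 50, and 100 rows has a skew of 8.0, while a table with 100 rows in every slice has a skew of 1.0. Tables with a high ratio are candidates for a different distribution key.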
To check the status of your resize operation using the Amazon Redshift console, choose the Status tab on the cluster details page. The Status tab shows the average rate of transfer, the elapsed time, and the remaining time. Keep the following in mind:
- During a resize operation, your tables might increase or decrease in size. This behavior is expected. For more information, see Why does a table in my Amazon Redshift cluster consume more disk storage space than expected?
- If the resize operation has a status of NONE in the AWS Command Line Interface (AWS CLI), then the target cluster is still being provisioned. While the target cluster is being provisioned, no data has been copied over yet. After the target cluster is provisioned, the status changes to IN_PROGRESS.
- If you receive an error message prompting you to "Please choose a larger target cluster," then your data does not fit into the target cluster. Resize your Amazon Redshift cluster with more nodes or a different node type.
- To cancel a resize operation before it completes, choose Cancel resize from the cluster list in the Amazon Redshift console. For more information, see Snapshot, restore, and resize.
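The same status information is available outside the console. The following is a minimal sketch, assuming the response shape of the Redshift `DescribeResize` API as returned by boto3's `describe_resize` (the `Status`, `AvgResizeRateInMegaBytesPerSecond`, `ElapsedTimeInSeconds`, and `EstimatedTimeToCompletionInSeconds` fields). The function name `summarize_resize` is hypothetical.

```python
from typing import Any, Dict, Optional


def _fmt(value: Optional[Any], unit: str) -> str:
    """Format an optional numeric field, tolerating missing values."""
    return "unknown" if value is None else f"{value} {unit}"


def summarize_resize(response: Dict[str, Any]) -> str:
    """Turn a DescribeResize-style response into a one-line summary."""
    status = response.get("Status", "UNKNOWN")
    if status == "NONE":
        # Target cluster is still being provisioned.
        return "NONE: target cluster is still being provisioned"
    if status == "IN_PROGRESS":
        rate = _fmt(response.get("AvgResizeRateInMegaBytesPerSecond"), "MB/s")
        elapsed = _fmt(response.get("ElapsedTimeInSeconds"), "s")
        remaining = _fmt(
            response.get("EstimatedTimeToCompletionInSeconds"), "s")
        return (f"IN_PROGRESS: average rate {rate}, "
                f"elapsed {elapsed}, remaining {remaining}")
    # Any other status is reported as-is.
    return f"{status}: no transfer in progress"
```

With a real client, the call would look like `summarize_resize(boto3.client("redshift").describe_resize(ClusterIdentifier="example-cluster"))`, where `example-cluster` is a placeholder identifier.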
Billing for resized clusters
- During the resize operation, you're billed for the clusters that are available to you, which means the source configuration. After the resize is complete, you're no longer billed for the source configuration. Billing starts for the target configuration as soon as the cluster status changes to Available. When you use the snapshot and restore method, you temporarily have an additional cluster. You are billed for the additional cluster until you clean up your environment.
- When you resize from smaller node types (large, xlarge) to larger node types (8xlarge), each node in your cluster has more storage. The more storage you have per node, the more metadata is written when you run a COMMIT. Therefore, the base cost of a single commit operation is higher for larger nodes. If you run several small commit operations concurrently, you might see a decrease in performance. For better performance, group multiple changes into a single commit operation.
- If you purchased Reserved Instances, then your billing depends on the resized cluster configuration, reserved node types, and reserved node count. For more information, see How reserved nodes work.
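The advice above about grouping multiple changes into a single commit operation can be sketched as follows. This example uses Python's built-in `sqlite3` module purely as a stand-in so that it runs anywhere. With Amazon Redshift you would use a DB-API driver connected to your cluster, and the placeholder syntax might differ. The table `events` and the function name `insert_in_one_commit` are hypothetical.

```python
import sqlite3
from typing import Iterable, Tuple


def insert_in_one_commit(conn: sqlite3.Connection,
                         rows: Iterable[Tuple[int, str]]) -> int:
    """Insert all rows in one transaction and commit once.

    Committing once per batch, rather than once per row, avoids paying
    the fixed per-commit cost for every small change.
    """
    batch = list(rows)
    cur = conn.cursor()
    # All inserts run inside the same implicit transaction.
    cur.executemany("INSERT INTO events (id, label) VALUES (?, ?)", batch)
    # A single COMMIT covers the whole batch.
    conn.commit()
    return len(batch)
```

The anti-pattern this replaces is a loop that inserts one row and calls `commit()` on every iteration, which issues as many commits as there are rows.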