With Amazon Redshift, you can start small at $0.25 per hour and scale up to petabytes of data and thousands of concurrent users. Choose what is right for your business needs, with the ability to grow storage without over-provisioning compute, and the flexibility to grow compute capacity without increasing storage costs.
What to expect
First, learn more about node types to choose the best cluster configuration for your needs. You can quickly scale your cluster, pause and resume the cluster, and switch between node types with a single API call or a few clicks in the Redshift console. You’ll see on-demand pricing before making your selection, and later you may choose to purchase reserved nodes for significant discounts.
Once you make your selection, you may wish to use Elastic Resize to easily adjust the amount of provisioned compute capacity within minutes for steady-state processing. With Resize Scheduler, you can add and remove nodes on a daily or weekly basis to optimize cost and get the best performance. For dynamic workloads, you can use Concurrency Scaling to automatically provision additional compute capacity and only pay for what you use on a per-second basis after exhausting the free credits (see Concurrency Scaling pricing).
Amazon Redshift node types
Redshift capabilities with pay-as-you-go pricing
- Amazon Redshift node types: Choose the best cluster configuration and node type for your needs, and pay for capacity by the hour with Amazon Redshift on-demand pricing. When you choose on-demand pricing, you can use the pause and resume feature to suspend on-demand billing when a cluster is not being used. You can also choose Reserved Instances instead of on-demand instances for steady-state workloads and get significant discounts over on-demand pricing.
- Amazon Redshift Spectrum pricing: Run SQL queries directly against the data in your Amazon S3 data lake, out to exabytes—you simply pay for the number of bytes scanned.
- Concurrency Scaling pricing: Each cluster earns up to one hour of free Concurrency Scaling credits per day, which is sufficient for 97 percent of customers. This enables you to provide consistently fast performance, even with thousands of concurrent queries and users. You simply pay a per-second on-demand rate for usage that exceeds the free credits.
- RMS pricing: Pay only for the data you store in RA3 clusters, independent of number of compute nodes provisioned. You simply pay hourly for the total amount of data in managed storage.
- Redshift ML: Use SQL to create, train, and deploy machine learning (ML) models. After you exhaust the free tier for Amazon SageMaker, you will incur costs for creating your model and for storage.
AWS Free Tier
As part of the AWS Free Tier, if your organization has never created a Redshift cluster, you’re eligible for a two-month free trial of our DC2 large node. Your organization gets 750 hours per month for free, enough hours to continuously run one DC2 large node with 160 GB of compressed SSD storage. Once your two-month free trial expires or your usage exceeds 750 hours per month, you can shut down your cluster to avoid any charges, or keep it running at our standard on-demand rate.
Amazon Redshift on-demand pricing allows you to pay for capacity by the hour with no commitments and no upfront costs. Simply pay an hourly rate based on the type and number of nodes in your cluster. Partial hours are billed in one-second increments following a billable status change such as creating, deleting, pausing, or resuming the cluster. The pause and resume feature allows you to suspend on-demand billing during the time the cluster is paused. During the time that a cluster is paused you only pay for backup storage. This frees you from planning and purchasing data warehouse capacity ahead of your needs, and enables you to cost-effectively manage environments for development or test purposes.
*Total addressable storage capacity in the managed storage with each RA3 node.
Calculating your effective on-demand price per TB per year
For on-demand, the effective price per TB per year is the hourly price for the instance, times the number of hours in a year, divided by the number of TB per instance. For RA3, data stored in managed storage is billed separately based on actual data stored in the RA3 node types; effective price per TB per year is calculated for only the compute node costs.
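The on-demand formula above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, a 365-day year is assumed, and pairing the $0.25 hourly price with 160 GB (0.16 TB) of storage per node is an assumption made for the sake of the example.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: a non-leap year


def on_demand_price_per_tb_year(hourly_price: float, tb_per_node: float) -> float:
    """Hourly node price x hours in a year / TB per node."""
    return hourly_price * HOURS_PER_YEAR / tb_per_node


# Illustrative inputs only: $0.25 per hour and 0.16 TB per node.
print(on_demand_price_per_tb_year(0.25, 0.16))
```

For RA3 nodes the same calculation would cover compute only, since managed storage is billed separately.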
Amazon Redshift Spectrum pricing
Amazon Redshift Spectrum allows you to directly run SQL queries against exabytes of data in Amazon S3. You are charged for the number of bytes scanned by Redshift Spectrum, rounded up to the next megabyte, with a 10 MB minimum per query. There are no charges for Data Definition Language (DDL) statements like CREATE/ALTER/DROP TABLE for managing partitions, or for failed queries.
You can improve query performance and reduce costs by storing data in a compressed, partitioned, and columnar data format. If you compress data using one of Redshift Spectrum’s supported formats, your costs will go down because less data is scanned. Similarly, if you store data in a columnar format, such as Apache Parquet or Optimized Row Columnar (ORC) format, your charges will also go down because Redshift Spectrum only scans columns needed by the query.
You are charged for the Amazon Redshift cluster used to query data with Redshift Spectrum. Redshift Spectrum queries data directly in Amazon S3. You are charged standard S3 rates for storing objects in your S3 buckets, and for requests made against your S3 buckets. For details, refer to Amazon S3 rates.
If you use the AWS Glue Data Catalog with Amazon Redshift Spectrum, you are charged standard AWS Glue Data Catalog rates. For details, refer to AWS Glue pricing.
When using Amazon Redshift Spectrum to query AWS Key Management Service (KMS) encrypted data in Amazon S3, you are charged standard AWS KMS rates. For details, refer to AWS KMS pricing.
Redshift Spectrum pricing examples based on US East (N. Virginia) pricing
Consider a table with 100 equally sized columns stored in Amazon S3 as an uncompressed text file with a total size of 4 terabytes. Running a query to get data from a single column of the table requires Redshift Spectrum to scan the entire file, because text formats cannot be split. This query would scan 4 terabytes and cost $20. ($5/TB * 4TB = $20)
If you compress your file using GZIP, you may see a 4:1 compression ratio. In this case, you would have a compressed file size of 1 terabyte. Redshift Spectrum has to scan the entire file, but since it is one-fourth the size, you pay one-fourth the cost, or $5. ($5/TB * 1TB = $5)
If you compress your file and convert it to a columnar format like Apache Parquet, you may see a 4:1 compression ratio and have a compressed file size of 1 terabyte. Using the same query as above, Redshift Spectrum needs to scan only one column in the Parquet file. The cost of this query would be $0.05. ($5/TB * 1TB file size * 1/100 columns, or a total of 10 gigabytes scanned = $0.05).
Note: The above pricing examples are for illustration purposes only. The compression ratio of different files and columns may vary.
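The three Spectrum examples can be reproduced with a small sketch. It assumes the $5 per TB rate used in the examples, binary units (1 TB = 1024^4 bytes), and the rounding rule stated earlier (round up to the next megabyte, 10 MB minimum per query). The function and constant names are hypothetical.

```python
PRICE_PER_TB = 5.00      # USD per TB scanned, as used in the examples
MB = 1024 ** 2           # assumption: binary units
TB = 1024 ** 4
MIN_MB_PER_QUERY = 10    # 10 MB minimum per query


def spectrum_query_cost(bytes_scanned: int) -> float:
    """Cost of one query: bytes rounded up to whole MB, 10 MB minimum."""
    mb_scanned = -(-bytes_scanned // MB)          # integer ceiling division
    mb_billed = max(mb_scanned, MIN_MB_PER_QUERY)
    return mb_billed * MB / TB * PRICE_PER_TB


print(spectrum_query_cost(4 * TB))                 # uncompressed text, 4 TB
print(spectrum_query_cost(1 * TB))                 # GZIP at 4:1, 1 TB
print(round(spectrum_query_cost(TB // 100), 2))    # Parquet, 1 of 100 columns
```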
Concurrency Scaling pricing
Redshift automatically adds transient capacity to provide consistently fast performance, even with thousands of concurrent users and queries. There are no resources to manage, no upfront costs, and you are not charged for the startup or shutdown time of the transient clusters. You can accumulate one hour of Concurrency Scaling cluster credits every 24 hours while your main cluster is running. You are charged the per-second on-demand rate for a Concurrency Scaling cluster used in excess of the free credits—only when it's serving your queries—with a one-minute minimum charge each time a Concurrency Scaling cluster is activated. The per-second on-demand rate is based on the type and number of nodes in your Redshift cluster.
Concurrency Scaling credits
Redshift clusters earn up to one hour of free Concurrency Scaling credits per day. Credits are earned on an hourly basis for each active cluster in your AWS account, and can be consumed by the same cluster only after credits are earned. You can accumulate up to 30 hours of free Concurrency Scaling credits for each active cluster. Credits do not expire as long as your cluster is not terminated.
Pricing example for Concurrency Scaling
A 10-node DC2.8XL Redshift cluster in US East costs $48 per hour. Consider a scenario where two transient clusters are utilized for five minutes beyond the free Concurrency Scaling credits. The per-second on-demand rate for Concurrency Scaling is $48 * 1/3600 = approximately $0.0133 per second. The additional cost for Concurrency Scaling in this case is $0.0133 per second * 300 seconds * 2 transient clusters = $8. Therefore, the total cost of running the Amazon Redshift cluster for one hour plus the two transient clusters is $56 ($48 + $8).
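The arithmetic of this example can be checked with a short sketch. It assumes all usage passed in is already beyond the free credits, and it applies the one-minute minimum per activation described above. The function name and its arguments are hypothetical.

```python
def concurrency_scaling_cost(cluster_hourly_rate: float,
                             seconds_per_activation: list[int]) -> float:
    """Per-second rate x billable seconds, with a 60-second minimum per activation."""
    per_second_rate = cluster_hourly_rate / 3600
    billable_seconds = sum(max(s, 60) for s in seconds_per_activation)
    return per_second_rate * billable_seconds


# Two transient clusters, five minutes (300 seconds) each, on a $48/hour cluster.
extra = concurrency_scaling_cost(48.0, [300, 300])
print(round(extra, 2))          # additional Concurrency Scaling cost
print(round(48.0 + extra, 2))   # one hour of the main cluster plus the extra cost
```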
Amazon Redshift managed storage pricing
You pay for data stored in managed storage at a fixed GB-month rate for your region. Managed storage comes exclusively with RA3 node types, and you pay the same low rate for Redshift managed storage regardless of data size. Usage of managed storage is calculated hourly based on the total data present in the managed storage (see example below converting usage in GB-Hours to charges in GB-Month). You can monitor the amount of data in your RA3 cluster via Amazon CloudWatch or the AWS Management Console. You do not pay for any data transfer charges between RA3 nodes and managed storage. Managed storage charges do not include backup storage charges due to automated and manual snapshots (see Backup Storage). Once the cluster is terminated, you continue to be charged for the retention of your manual backups.
Pricing example for managed storage pricing
Assume that you store 100 GB of data in managed storage with RA3 node types for the first 15 days of April, and 100 TB of data for the last 15 days of April.
Let’s first calculate the usage in GB-Hours for the above scenario. For the first 15 days, you will have the following usage in GB-Hours: 100 GB x 15 days x (24 hours/day) = 36,000 GB-Hours. For the last 15 days, you will have the following usage in GB-Hours: 100 TB x 1,024 GB/TB x 15 days x (24 hours/day) = 36,864,000 GB-Hours.
At the end of April, all usage in GB-Hours adds to: 36,000 GB-Hours + 36,864,000 GB-Hours = 36,900,000 GB-Hours
Let's convert this to GB-Months: 36,900,000 GB-Hours / 720 hours per month in April = 51,250 GB-Month.
If this data was stored in the US East (Northern Virginia) Region, managed storage will be charged at $0.024/GB-Month. Monthly storage charges for 51,250 GB-Month will be: 51,250 GB-Month x $0.024 per GB-month = $1,230
Total Managed Storage Fee for April = $1,230
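The GB-Hours to GB-Month conversion above can be expressed as a short sketch. The function name and the (GB stored, hours) tuple layout are hypothetical. The rate and the 720-hour month are the ones used in the example.

```python
def managed_storage_charge(usage, hours_in_month, rate_per_gb_month):
    """usage: list of (gb_stored, hours) periods within one month."""
    gb_hours = sum(gb * hours for gb, hours in usage)
    gb_months = gb_hours / hours_in_month
    return gb_hours, gb_months, gb_months * rate_per_gb_month


april_usage = [
    (100, 15 * 24),          # 100 GB for the first 15 days
    (100 * 1024, 15 * 24),   # 100 TB for the last 15 days
]
gb_hours, gb_months, charge = managed_storage_charge(april_usage, 720, 0.024)
print(gb_hours, gb_months, round(charge, 2))
```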
Redshift ML pricing
When you use Amazon Redshift ML, the prediction functions run within your Redshift cluster, and you do not incur additional expense. However, the CREATE MODEL request uses Amazon SageMaker for model training and Amazon S3 for storage and incurs additional expense. The expense is based on the number of cells in your training data, where the number of cells is the product of the number of records (in the training query or table) times the number of columns. For example, if the SELECT query of the CREATE MODEL produces 10,000 records for training and each record has five columns, then the number of cells in the training data is 50,000.
Amazon SageMaker charges
When you get started with Redshift ML, you qualify for the Amazon SageMaker free tier if you haven’t previously used Amazon SageMaker. This includes two free CREATE MODEL requests per month for two months with up to 100,000 cells per request. Your free tier starts from the first month when you create your first model in Redshift ML.
Amazon S3 charges
The CREATE MODEL request also incurs small Amazon S3 charges. S3 costs should be less than $1 per month, since the amount of S3 data generated by CREATE MODEL is on the order of a few gigabytes and, when garbage collection is on, that data is quickly removed. Amazon S3 is used first to store the training data produced by the SELECT query of the CREATE MODEL, and then to store various model-related artifacts needed for prediction. The default garbage collection mode removes both the training data and the model-related artifacts at the end of CREATE MODEL.
Cost control options
You can control the training cost by setting MAX_CELLS. If you do not, the default value of MAX_CELLS is one million, which in the vast majority of cases will keep your cost of training below $20. When the training data set is above one million cells, the pricing increases as follows:
|Number of cells|Price|
|---|---|
|First 10M cells|$20 per million cells|
|Next 90M cells|$15 per million cells|
|Over 100M cells|$7 per million cells|
Note, real pricing will often be less than the upper bounds shared above.
Examples of CREATE MODEL cost:
- 100,000 cells is $20 (= 1 x 20)
- 2,000,000 cells is $40 (= 2 x 20)
- 23,000,000 cells is $395 (= 10 x 20 + 13 x 15)
- 99,000,000 cells is $1,535 (= 10 x 20 + 89 x 15)
- 211,000,000 cells is $2,327 (= 10 x 20 + 90 x 15 + 111 x 7)
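The tiered upper bound behind these examples can be sketched as follows. The function name is hypothetical, and rounding the cell count up to a whole million is an assumption inferred from the 100,000-cell example, not a documented billing rule.

```python
CELLS_PER_MILLION = 1_000_000
# (tier size in millions of cells, USD per million); None means "no upper limit"
TIERS = [(10, 20), (90, 15), (None, 7)]


def create_model_cost_upper_bound(cells: int) -> int:
    """Upper bound on CREATE MODEL training cost in USD for a given cell count."""
    # Assumption: partial millions are rounded up (100,000 cells is priced as 1M).
    millions = -(-cells // CELLS_PER_MILLION)
    cost = 0
    for tier_size, price_per_million in TIERS:
        in_tier = millions if tier_size is None else min(millions, tier_size)
        cost += in_tier * price_per_million
        millions -= in_tier
        if millions == 0:
            break
    return cost


for cells in (100_000, 2_000_000, 23_000_000, 99_000_000, 211_000_000):
    print(cells, create_model_cost_upper_bound(cells))
```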
If the training data produced by the SELECT query of the CREATE MODEL exceeds the MAX_CELLS limit you provided (or the default of one million, if you did not provide one), CREATE MODEL will randomly choose approximately MAX_CELLS divided by the number of columns records from the training dataset and will train using these randomly chosen records. (The random choice is designed so that the reduced training dataset is not biased.) Thus, by setting MAX_CELLS, you can keep your cost within bounds.
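As a rough illustration of the sampling rule, the sketch below estimates how many records would be used for training. The function name is hypothetical, and the result is only approximate because the text describes the selection as choosing "approximately" that many records.

```python
def approx_training_records(records: int, columns: int,
                            max_cells: int = 1_000_000) -> int:
    """Approximate number of records used for training under a MAX_CELLS limit."""
    if records * columns <= max_cells:
        return records              # under the limit: all records are used
    return max_cells // columns     # over the limit: about MAX_CELLS / columns


print(approx_training_records(10_000, 5))      # 50,000 cells, under the default limit
print(approx_training_records(1_000_000, 5))   # 5,000,000 cells, sampled down
```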
Reserved Instance pricing
Reserved Instances are appropriate for steady-state production workloads, and offer significant discounts over on-demand pricing. Customers typically purchase Reserved Instances after running experiments and proof-of-concepts to validate production configurations.
You can benefit from significant savings over on-demand rates by committing to use Amazon Redshift for a one- or three-year term. Reserved Instance pricing is specific to the node type purchased, and remains in effect until the reservation term ends. Prices include two additional copies of data - one on the cluster nodes and one in Amazon S3. We take care of backup, durability, availability, security, monitoring, and maintenance for you.
There are three options for Reserved Instance pricing:
No Upfront – You pay nothing upfront, and commit to pay monthly over the course of one year.
Partial Upfront – You pay a portion of the Reserved Instance upfront, and the remainder over a one- or three-year term.
All Upfront – You pay for the entire Reserved Instance term (one or three years) with one upfront payment.
Reserved Instances are a billing concept and are not used to create data warehouse clusters. When you make a purchase, you will be charged the associated upfront and monthly fees even if you are not currently running a cluster, or if an existing cluster is paused. To purchase Reserved Instances, visit the Reserved Nodes tab in our Console.
We may terminate the Reserved Instance pricing program at any time. In addition to being subject to Reserved Instance pricing, Reserved Instances are subject to all data transfer and other fees applicable under the AWS Customer Agreement or other agreement with us governing your use of our services.
* The Monthly rate below is the actual hourly rate multiplied by the average number of hours per month.
** The Effective Hourly rate below is the amortized hourly cost of the instance over the entire term, including any upfront payment.
Calculating your effective price per TB per year for Reserved Instances
For Reserved Instances, add the upfront payment to the hourly rate times the number of hours in the term, and divide by the number of years in the term and number of TB per node. For RA3, data stored in managed storage is billed separately based on actual data stored in the RA3 node types; effective price per TB per year is calculated for only the compute node costs.
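The Reserved Instance formula can be sketched the same way. The function name is hypothetical, a 365-day year is assumed, and the inputs in the usage line are made-up numbers chosen only to exercise the formula, not actual Redshift prices.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: a non-leap year


def reserved_price_per_tb_year(upfront: float, hourly_rate: float,
                               term_years: int, tb_per_node: float) -> float:
    """(upfront + hourly rate x hours in term) / years in term / TB per node."""
    term_total = upfront + hourly_rate * HOURS_PER_YEAR * term_years
    return term_total / term_years / tb_per_node


# Hypothetical inputs: $3,000 upfront, $0.50 per hour, 3-year term, 2 TB per node.
print(reserved_price_per_tb_year(3000.0, 0.50, 3, 2.0))
```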
Backup storage
Backup storage is the storage associated with the snapshots taken for your data warehouse. Increasing your backup retention period or taking additional snapshots increases the backup storage consumed by your data warehouse. Redshift charges for manual snapshots you take using the console, application programming interface (API), or command-line interface (CLI). Automated snapshots, which are created using Redshift's snapshot scheduling feature, are offered at no charge. Data stored on RA3 clusters is part of Redshift Managed Storage (RMS) and is billed at RMS rates, but manual snapshots taken for RA3 clusters are billed as backup storage at standard Amazon S3 rates outlined on this page.
For example, if your RA3 cluster has 10 TB of data and 30 TB of manual snapshots, you would be billed for 10 TB of RMS and 30 TB of backup storage. With dense compute (DC) and dense storage (DS) clusters, storage is included on the cluster and is not billed for separately, but backups are stored externally in Amazon S3. Backup storage beyond the provisioned storage size on DC and DS clusters is billed as backup storage at standard S3 rates. Snapshots are billed until they expire or are deleted, including when the cluster is paused or deleted.
There is no charge for data transferred between Amazon Redshift and Amazon S3 within the same AWS Region for backup, restore, load, and unload operations. For all other data transfers into and out of Amazon Redshift, you will be billed at standard AWS data transfer rates. In particular, if you run your Amazon Redshift cluster in Amazon Virtual Private Cloud (VPC), you will see standard AWS data transfer charges for data transfers over JDBC/ODBC to your Amazon Redshift cluster endpoint. In addition, when you use Enhanced VPC Routing and unload data to Amazon S3 in a different region, you will incur standard AWS data transfer charges. For more information about AWS data transfer rates, see the Amazon Elastic Compute Cloud (Amazon EC2) pricing page.
You use four ra3.4xlarge nodes and 40 TB of RMS for a month. During the month, you also scan 20 TB of data using Redshift Spectrum. You use on-demand pricing.
Your charges would be calculated as follows:
- Redshift RA3 instance cost = 4 instances x $3.26 USD per hour x 730 hours in a month = $9,519.20 USD
- RMS cost = 40 TB x 1,024 GB per TB x $0.024 USD per GB-month = $983.04 USD
- Redshift Spectrum cost = 20 TB x $5.00 USD per TB = $100.00 USD
Total monthly cost: $10,602.24 USD
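The monthly total above can be reproduced with a few lines of arithmetic. The function and parameter names are hypothetical. The rates are the ones quoted in the example.

```python
def monthly_cost(nodes: int, node_hourly: float, hours: int,
                 rms_tb: float, rms_rate_gb_month: float,
                 spectrum_tb: float, spectrum_rate_tb: float) -> dict:
    """Break down one month of on-demand compute, RMS, and Spectrum charges."""
    compute = nodes * node_hourly * hours
    rms = rms_tb * 1024 * rms_rate_gb_month      # 1,024 GB per TB
    spectrum = spectrum_tb * spectrum_rate_tb
    return {
        "compute": round(compute, 2),
        "rms": round(rms, 2),
        "spectrum": round(spectrum, 2),
        "total": round(compute + rms + spectrum, 2),
    }


print(monthly_cost(4, 3.26, 730, 40, 0.024, 20, 5.00))
```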