Amazon Redshift Capabilities

Deliver unmatched price performance at scale with SQL for your data lakehouse

Achieve exceptional price performance, scalability, and security

RA3 instances maximize speed for performance-intensive workloads that require large amounts of compute capacity, and give you the flexibility to pay for compute resources separately from storage by specifying the number of instances you need.

Columnar storage, data compression, and zone maps reduce the amount of I/O needed to perform queries. Along with industry-standard encodings such as LZO and Zstandard, Amazon Redshift also offers AZ64, a purpose-built compression encoding for numeric and date/time types, that provides both storage savings and optimized query performance.
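
As a minimal sketch (table and column names are hypothetical), compression encodings can be set per column at table creation; Amazon Redshift can also choose encodings automatically when none are specified:

```sql
-- Hypothetical table: AZ64 for numeric and timestamp columns, Zstandard for text.
CREATE TABLE web_events (
    event_id   BIGINT       ENCODE az64,
    event_time TIMESTAMP    ENCODE az64,
    user_agent VARCHAR(512) ENCODE zstd
);
```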

Supports virtually unlimited concurrent users and concurrent queries with consistent service levels by adding transient capacity in seconds as concurrency increases. Scale with minimal cost impact, as each cluster earns up to one hour of free concurrency scaling credits per day. These free credits are sufficient for the concurrency needs of 97% of customers.

Amazon Redshift materialized views allow you to achieve significantly faster query performance for iterative or predictable analytics workloads, such as dashboarding and queries from business intelligence (BI) tools and extract, transform, and load (ETL) data-processing jobs. You can use materialized views to store and manage precomputed results of a SELECT statement that can reference one or more tables, including data lake, zero-ETL, and data sharing tables. With incremental refresh, Amazon Redshift identifies changes in the base table or tables that happened after the previous refresh and updates only the corresponding records in the materialized view. Incremental refresh runs more quickly than a full refresh and improves workload performance.
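
As a rough sketch (table, view, and column names are hypothetical), a materialized view over a base table is created and refreshed with standard SQL; where eligible, Amazon Redshift applies incremental refresh automatically:

```sql
-- Precompute daily revenue from a hypothetical sales table.
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT sale_date, SUM(amount) AS total_revenue
FROM sales
GROUP BY sale_date;

-- Refresh picks up changes to the base table; eligible views refresh incrementally.
REFRESH MATERIALIZED VIEW daily_revenue;
```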

Deliver subsecond response times for repeat queries. Dashboard, visualization, and BI tools that run repeat queries experience a significant performance boost. When a query runs, Amazon Redshift searches the cache to see if there is a cached result from a prior run. If a cached result is found and the data has not changed, the cached result is immediately returned instead of rerunning the query.

A powerful new table-sorting mechanism improves the performance of repetitive queries by automatically sorting data based on incoming query filters (for example, sales in a specific region). This method significantly accelerates table scans compared to traditional sorting methods.

Expand recovery capabilities by reducing recovery time and guaranteeing capacity to recover automatically with no data loss. An Amazon Redshift Multi-AZ data warehouse maximizes performance and value by delivering high availability without having to use standby resources, elevating your availability to a 99.99% SLA.

Amazon Redshift lets you configure firewall rules to control network access to your data warehouse cluster. You can run Amazon Redshift inside Amazon Virtual Private Cloud (Amazon VPC) to isolate your data warehouse cluster in your own virtual network and connect it to your existing IT infrastructure using an industry-standard encrypted IPsec VPN.

With just a few parameter settings, you can set up Amazon Redshift to use TLS to secure data in transit, and hardware-accelerated AES-256 encryption for data at rest. If you choose to enable encryption of data at rest, all data written to disk will be encrypted, as well as any backups. Amazon Redshift takes care of key management by default.

Integration with IAM Identity Center allows organizations to support trusted identity propagation between Amazon Redshift, Amazon QuickSight, and AWS Lake Formation. You can use your organization’s identity to access Amazon Redshift in a single sign-on experience using third-party identity providers (IdP), such as Microsoft Entra ID, Okta, Ping, or OneLogin, from QuickSight and Amazon Redshift Query Editor and third-party BI tools and SQL editors. Administrators can use third-party IdP users and groups to manage fine-grained access to data across services and audit user-level access in AWS CloudTrail. With trusted identity propagation, a user’s identity is passed seamlessly between QuickSight, Amazon Redshift, and Lake Formation, reducing time to insights and enabling a friction-free analytics experience.

Granular row- and column-level security controls ensure that users see only the data they should have access to. Amazon Redshift is integrated with Lake Formation, ensuring that column-level access controls in Lake Formation are also enforced for Amazon Redshift queries on the data in the data lake. Amazon Redshift data sharing supports centralized access control with Lake Formation to simplify governance of data shared from Amazon Redshift. Lake Formation is a service that makes it easier to set up secure data lakes, to centrally manage granular access to data across all consuming services, and to apply row-level and column-level controls. With dynamic data masking, protect your sensitive data by limiting how much identifiable data is visible to users. Define multiple levels of permissions on these fields so different users and groups can have varying levels of data access without having to create multiple copies of data, all through the familiar SQL interface of Amazon Redshift.
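
For illustration, a hedged sketch of a dynamic data masking policy (policy, table, column, and role names are hypothetical):

```sql
-- Mask all but the last four characters of a hypothetical credit card column.
CREATE MASKING POLICY mask_credit_card
WITH (credit_card VARCHAR(32))
USING ('XXXX-XXXX-XXXX-' || SUBSTRING(credit_card, 16, 4));

-- Apply the policy to members of a hypothetical analyst role.
ATTACH MASKING POLICY mask_credit_card
ON customers (credit_card)
TO ROLE analyst
PRIORITY 10;
```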

Unlock insights with SQL across unified data in the lakehouse

Analyze all of your unified data using SQL with the Amazon Redshift integration with SageMaker Lakehouse. Query Amazon Simple Storage Service (Amazon S3) data in open formats, removing data movement between lakes and warehouses. Open your Amazon Redshift data in SageMaker Lakehouse to enable access across AWS and Apache Iceberg analytics tools, supporting comprehensive data analysis and machine learning (ML).

Amazon Redshift supports read-only queries using familiar ANSI SQL on Apache Iceberg, Apache Hudi, and Delta Lake table formats, as well as queries on open file formats, including Apache Parquet, ORC, Avro, JSON, and CSV, directly in Amazon S3. Apache Iceberg is an example of an open source table format that provides transactional consistency and enhanced organization of data lakes through its table structure. Amazon Redshift Spectrum lets you read tables and data in open formats such as Parquet directly in your data lake while keeping up to exabytes of structured, semistructured, and unstructured data in Amazon S3. You can also export data to your data lake using the Amazon Redshift UNLOAD command, including the option to export to Parquet. Exporting data from Amazon Redshift back to your data lake lets you analyze the data further with AWS services such as Amazon Athena, Amazon EMR, and SageMaker.
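
A minimal sketch of both directions, assuming an AWS Glue Data Catalog database for Redshift Spectrum; schema, table, bucket, and role names are placeholders:

```sql
-- Register a data lake schema from the AWS Glue Data Catalog (Redshift Spectrum).
CREATE EXTERNAL SCHEMA lake
FROM DATA CATALOG
DATABASE 'my_glue_database'
IAM_ROLE 'arn:aws:iam::111122223333:role/MySpectrumRole';

-- Query Parquet data in Amazon S3 with standard SQL.
SELECT event_type, COUNT(*) AS events
FROM lake.clickstream
GROUP BY event_type;

-- Export query results back to the data lake in Parquet format.
UNLOAD ('SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date')
TO 's3://my-bucket/exports/daily_revenue_'
IAM_ROLE 'arn:aws:iam::111122223333:role/MyUnloadRole'
FORMAT AS PARQUET;
```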

Use SQL to make your Amazon Redshift data and data lake more accessible to data analysts, data engineers, and other SQL users with a web-based analyst workbench for data exploration and analysis. Query Editor lets you visualize query results in a single step, create schemas and tables, visually load data, and browse database objects. It also provides an intuitive editor for authoring SQL queries, analyses, visualizations, and annotations and securely sharing them with your team.

Use the built-in SQL editor powered by Amazon Redshift in SageMaker Unified Studio, one data and AI development environment, to query data stored in data lakes, data warehouses, databases, and applications.

Accelerate decision-making with near real-time analytics

No-code integration between Amazon Aurora, Amazon Relational Database Service (Amazon RDS), Amazon DynamoDB, enterprise applications, and Amazon Redshift enables immediate analytics and ML on petabytes of data across databases and applications. For example, Aurora zero-ETL integrations with Amazon Redshift seamlessly make data written to operational, transactional, or enterprise application sources available in Amazon Redshift, minimizing the need for you to build and maintain complex ETL data pipelines.

Simplify and automate data ingestion from Amazon S3, reducing the time and effort to build custom solutions or manage third-party services. With this feature, Amazon Redshift removes the need for manually and repeatedly running copy procedures by automating file ingestion and taking care of continual data-loading steps under the hood. Support for auto-copy makes it easier for line-of-business users and data analysts without any data engineering knowledge to create ingestion rules and configure the location of the data they wish to load from Amazon S3.
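
As a hedged sketch of the idea (bucket path, table, and job names are hypothetical), a copy job continues loading new files as they arrive under the prefix:

```sql
-- Create a copy job that automatically ingests new files landing in the S3 prefix.
COPY public.orders
FROM 's3://my-bucket/incoming/orders/'
IAM_ROLE 'arn:aws:iam::111122223333:role/MyLoadRole'
FORMAT AS CSV
JOB CREATE orders_auto_ingest
AUTO ON;
```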

Use SQL to connect to and directly ingest data from Amazon Kinesis Data Streams and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Amazon Redshift Streaming Ingestion also makes it easier to create and manage downstream pipelines by letting you directly create materialized views on top of streams. The materialized views can also include SQL transformations as part of your ELT pipeline. You can manually refresh defined materialized views to query the most recent streaming data.
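
A rough sketch against a hypothetical Kinesis data stream (schema, stream, column, and role names are placeholders):

```sql
-- Map a Kinesis data stream into Amazon Redshift.
CREATE EXTERNAL SCHEMA kinesis_schema
FROM KINESIS
IAM_ROLE 'arn:aws:iam::111122223333:role/MyStreamingRole';

-- Materialized view defined directly on the stream; JSON payload parsed on ingestion.
CREATE MATERIALIZED VIEW clickstream_mv AS
SELECT approximate_arrival_timestamp,
       JSON_PARSE(kinesis_data) AS payload
FROM kinesis_schema."my-click-stream";

-- Manually refresh to query the most recent streaming data.
REFRESH MATERIALIZED VIEW clickstream_mv;
```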

Query live data across one or more Amazon RDS and Amazon Aurora databases, including Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, and Amazon Aurora MySQL-Compatible Edition, to get instant visibility into full business operations without requiring data movement.
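
For instance, a hedged sketch of a federated query setup against a hypothetical Aurora PostgreSQL endpoint (all names, the URI, and the ARNs are placeholders):

```sql
-- Attach a live operational database as an external schema (federated query).
CREATE EXTERNAL SCHEMA ops
FROM POSTGRES
DATABASE 'orders_db' SCHEMA 'public'
URI 'my-aurora-cluster.cluster-abc123.us-east-1.rds.amazonaws.com'
IAM_ROLE 'arn:aws:iam::111122223333:role/MyFederatedRole'
SECRET_ARN 'arn:aws:secretsmanager:us-east-1:111122223333:secret:ops-credentials';

-- Query live operational rows with no data movement.
SELECT status, COUNT(*) AS open_orders
FROM ops.orders
WHERE created_at > CURRENT_DATE - 1
GROUP BY status;
```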

Get easy SQL analytics without managing infrastructure

Run analytics in seconds and scale without the need to set up and manage data warehouse infrastructure. AI-driven scaling and optimization technology (available in preview) enables Amazon Redshift Serverless to automatically and proactively provision and scale data warehouse capacity, delivering fast performance for even the most demanding workloads. The system uses AI techniques to learn customer workload patterns across key dimensions, such as concurrent queries, query complexity, data volume influx, and ETL patterns. It then continually adjusts resources throughout the day and applies tailored performance optimizations. You can set a desired performance target, and the data warehouse automatically scales to maintain consistent performance.

Sophisticated algorithms predict and classify incoming queries based on their run times and resource requirements to dynamically manage performance and concurrency while also helping you prioritize your business-critical workloads. Short query acceleration (SQA) sends short queries from applications such as dashboards to an express queue for immediate processing rather than being starved behind large queries. Automatic workload management (WLM) uses ML to dynamically manage memory and concurrency, helping maximize query throughput. In addition, you can now set the priority of your most important queries, even when hundreds of queries are being submitted.

Amazon Redshift Advisor makes recommendations when an explicit user action is needed to further turbocharge Amazon Redshift performance. For dynamic workloads where query patterns are not predictable, automated materialized views improve throughput of queries, lower query latency, and shorten execution time through automatic refresh, auto query rewrite, incremental refresh, and continual monitoring of Amazon Redshift clusters.

Automatic table optimization selects sort and distribution keys to optimize performance for the cluster’s workload. If Amazon Redshift determines that applying a key will improve cluster performance, tables will be automatically altered without requiring administrator intervention. The additional features automatic vacuum delete, automatic table sort, and automatic analyze remove the need for manual maintenance and tuning of Amazon Redshift clusters to get the best performance for new clusters and production workloads.
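
As a small sketch (the table name is hypothetical), tables can opt in to automatic table optimization by using AUTO sort and distribution settings, which Amazon Redshift then tunes for the observed workload:

```sql
-- Let Amazon Redshift choose and adjust distribution and sort keys over time.
CREATE TABLE sales_fact (
    sale_id   BIGINT,
    sale_date DATE,
    amount    DECIMAL(12,2)
)
DISTSTYLE AUTO
SORTKEY AUTO;

-- Existing tables can be switched to automatic optimization as well.
ALTER TABLE sales_fact ALTER DISTSTYLE AUTO;
ALTER TABLE sales_fact ALTER SORTKEY AUTO;
```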

Use a straightforward API to interact with Amazon Redshift: Amazon Redshift lets you painlessly access data from all types of traditional, cloud-native, containerized, serverless web services–based, and event-driven applications. The Amazon Redshift Data API simplifies data access, ingestion, and egress from programming languages and platforms supported by the AWS SDK, such as Python, Go, Java, Node.js, PHP, Ruby, and C++. The Data API removes the need for configuring drivers and managing database connections. Instead, you can run SQL commands on an Amazon Redshift cluster by calling a secured API endpoint provided by the Data API. The Data API takes care of managing database connections and buffering data. The Data API is asynchronous, so you can retrieve your results later. Your query results are stored for 24 hours.

Run queries within the console or connect SQL client tools, libraries, or data science tools including QuickSight, Tableau, Microsoft Power BI, Alteryx, Querybook, Jupyter Notebook, Informatica, dbt, MicroStrategy, and Looker.

Contextualize applications and boost user productivity with generative AI

Write queries in plain English in Amazon Redshift Query Editor to securely generate accurate SQL code recommendations within the scope of your data access permissions.

Amazon Redshift seamlessly integrates with Amazon Bedrock, enabling direct generative AI capabilities through standard SQL commands. This integration allows data teams to use foundation models like Anthropic Claude and Amazon Titan for tasks such as text analysis, translation, and sentiment detection without additional infrastructure complexity. Users can seamlessly invoke AI models within their existing data analytics workflows, transforming how insights are extracted from enterprise data.

Amazon Redshift ML makes it easier for data analysts, data scientists, BI professionals, and developers to create, train, and deploy SageMaker models using SQL. With Amazon Redshift ML, you can use SQL statements to create and train SageMaker models on your data in Amazon Redshift and then use those models for predictions such as churn detection, financial forecasting, personalization, and risk scoring directly in your queries and reports. Bring large language models into Amazon Redshift for advanced natural language processing tasks like text summarization, entity extraction, and sentiment analysis, to gain deeper insights from your data using SQL.
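
A hedged sketch of the flow (table, column, model, function, bucket, and role names are hypothetical): CREATE MODEL trains a SageMaker model from query results and exposes it as a SQL function for inference:

```sql
-- Train a churn model in SageMaker from data already in Amazon Redshift.
CREATE MODEL customer_churn
FROM (
    SELECT age, tenure_months, monthly_charges, churned
    FROM customer_history
)
TARGET churned
FUNCTION predict_churn
IAM_ROLE 'arn:aws:iam::111122223333:role/MyRedshiftMLRole'
SETTINGS (S3_BUCKET 'my-redshift-ml-bucket');

-- Use the trained model directly in queries and reports.
SELECT customer_id,
       predict_churn(age, tenure_months, monthly_charges) AS churn_risk
FROM current_customers;
```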