What is a Materialized View?
A materialized view is a duplicate data table created by combining data from multiple existing tables for faster data retrieval. For example, consider a retail application with two base tables for customer and product data. The customer table contains information like the customer’s name and contact details, while the product table contains information about product details and cost. The customer table only stores the product IDs of the items an individual customer purchases. You have to cross-reference both tables to obtain product details of items purchased by specific customers. Instead, you can create a materialized view that stores customer names and the associated product details in a single temporary table. You can build index structures on the materialized view for improved data read performance.
What are the benefits of materialized views?
Materialized views are a fast and efficient method of accessing relevant data. They help with query optimization in data-intensive applications. We go through some of the major benefits next.
Speed
Read queries scan through different tables and rows of data to gather the necessary information. With materialized views, you can query data directly from your new view instead of having to compute new information every time. The more complex your query is, the more time you will save using a materialized view.
Data storage simplicity
Materialized views allow you to consolidate complex query logic in one table. This makes data transformations and code maintenance easier for developers. It can also help make complex queries more manageable. You can also use data subsetting to decrease the amount of data you need to replicate in the view.
Consistency
Materialized views provide a consistent view of data captured at a specific moment. You can configure read consistency in materialized views and make data accessible even in multi-user environments where concurrency control is essential.
Materialized views also provide data access even if the source data changes or is deleted. Over time, this means that you can use materialized views to report on time-based data snapshots. The level of isolation from source tables ensures that you have a greater degree of consistency across your data.
Improved access control
You can use a materialized view to control who has access to specific data. You can filter information for users without giving them access to the source tables. This approach is practical if you want to control who has access to what data and how much of it they can see and interact with.
What are the use cases of materialized views?
You can benefit from materialized views in many different scenarios.
Distribute filtered data
If you need to distribute recent data across many locations, like for a remote workforce, materialized views help. You replicate and distribute data to many sites using materialized views. The people needing access to data interact with the replicated data store closest to them geographically.
This system allows for concurrency and decreases network load. It’s an effective approach with read-only databases.
Analyze time series data
Materialized views provide timestamped snapshots of datasets, so you can model information changes over time. You can store precomputed aggregations of data, like monthly or weekly summaries. These uses are helpful for business intelligence and reporting platforms.
Remote data interaction
In distributed database systems, you can use materialized views to optimize queries involving data from remote servers. Rather than repeatedly fetching data from a remote source, you can fetch and store data in a local materialized view. This reduces the need for network communication and improving performance.
For example, if you receive data from an external database or through an API, a materialized view consolidates and helps process it.
Periodic batch processing
Materialized views are helpful for situations where periodic batch processing is required. For instance, a financial institution might use materialized views to store end-of-day balances and interest calculations. Or they might store portfolio performance summaries, which can be refreshed at the end of each business day.
How do materialized views work?
Materialized views work by precomputing and storing the results of a specific query as a physical table in the database. The database performs the precomputation at regular intervals, or users can trigger it by specific events. Administrators monitor the performance and resource utilization of materialized views to ensure they continue to meet their intended purpose.
Here's a general overview of how materialized views work.
Create materialized view
You define a query that retrieves the desired data from one or more source tables for creating materialized views. This query may include filtering, aggregations, joins, and other operations as needed.
The database initially populates the materialized view by running the defined query against the source data. The result of the query is stored as a physical table in the database, and this table represents the materialized view.
Update materialized view
The data in a materialized view needs to be periodically updated to reflect changes in the underlying data in the source tables. The data refresh frequency depends on the use case and requirements.
Next, we explain a few common approaches for data refresh.
Full refresh
The materialized view is completely recomputed and overwritten with the latest query results. It is the simplest approach but can be resource-intensive, especially for large materialized views.
Incremental refresh
Only the changes in the underlying data are applied to the materialized view. It can be more efficient than a full refresh when dealing with large datasets and frequent updates.
On-demand refresh
Some systems allow materialized views to be refreshed on demand, triggered by specific events or user requests. This gives more control over when the data is updated, but it requires careful management to ensure the materialized view remains up-to-date.
Technical variations in different systems
Each database management system has distinct methods for creating a materialized view.
Database management system |
How materialized views work |
PostgreSQL |
With PostgreSQL, you have to manually refresh the materialized view, recomputing the entire view. You populate the materialized view with data at the exact moment you create it. |
MySQL |
MySQL doesn’t support materialized views. |
Oracle |
Oracle automatically refreshes materialized views, but you also have the option to refresh on demand. You can also write a SQL statement that prompts the views to refresh before delivering results. |
SQL Server |
SQL Server uses the name “indexed views,” as materialization is a step of creating an index of a regular view. You can only perform basic SQL queries with their indexed views. They update automatically for the user. |
MongoDB |
MongoDB uses aggregation functions to deliver a similar capability to materialized views but for a NoSQL environment. |
What’s the difference between views and materialized views?
In relational databases, a view is a temporary table created by transforming and combining the data across multiple base tables. It’s a virtual table that does not store any data itself. Instead, it’s defined by a query against one or more source tables.
Whenever a user queries the view, the database engine dynamically computes the results by running the underlying query against the source tables. The data in a view is always up-to-date because it’s derived directly from the source tables each time it’s accessed.
A materialized view, on the other hand, stores the results of a specific query as a physical table in the database. The data in the materialized view is precomputed and stored, meaning that the results are already available without the need to recompute the query each time the view is accessed.
However, the data in materialized views is not always up-to-date. You have to configure the update frequency to balance between data freshness and query performance.
What are the challenges with materialized views?
As materialized views are another database component to consider, you add another layer of complexity in terms of maintenance. You must balance the query and efficiency benefits with potential storage costs and data consistency issues.
You have to create effective rules that trigger updates to ensure your materialized views remain beneficial. Frequently updating your materialized views may impact system performance, especially if you are already in a peak period. Additionally, materialized views also take up a significant amount of space as they replicate data. If you have a large database that constantly updates, the storage demands of materialized views will likely be significant.
If you are going to use a materialized view, you need to set clear refresh rules and schedules. You must also understand how to deal with data inconsistencies, refresh failures, and the added storage strain.
How can AWS help with your materialized view requirements?
Materialized views are a powerful tool to improve query performance in Amazon Redshift.
Amazon Redshift continually monitors the workload using machine learning and creates new materialized views when they are beneficial. This Automated Materialized Views (AutoMV) feature in Redshift provides the same performance benefits of user-created materialized views.
The AutoMV feature can benefit you in many ways:
- Balance the costs of creating and keeping materialized views up-to-date against expected benefits to query latency
- Monitor previously created AutoMVs and drop them when they are no longer beneficial
- Refresh automatically and incrementally, using the same criteria and restrictions as user-created materialized views
Additionally, developers don't need to revise queries to take advantage of AutoMV. Automatic query rewriting to use materialized views identifies queries that can benefit from system-created AutoMVs. It automatically rewrites those queries to use the AutoMVs, improving query efficiency.
Get started with materialized views on AWS by creating an account today.