Data and analytics have become indispensable for businesses that want to stay competitive. Businesses use reports, dashboards, and analytics tools to extract insights from their data, monitor business performance, and support decision making. These reports, dashboards, and analytics tools are powered by data warehouses, which store data efficiently to minimize I/O and deliver query results at blazing speeds to hundreds or thousands of concurrent users.
The data warehouse functions as a central repository of information coming from one or more data sources. Data flows into a data warehouse from transactional systems and other relational databases, and typically includes structured, semi-structured, and unstructured data. This data is processed, transformed, and ingested at a regular cadence. Users including data scientists, business analysts, and decision-makers access the processed data in the data warehouse through business intelligence tools, SQL clients, and spreadsheets.
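As a rough illustration of the process-transform-ingest flow described above, the following minimal sketch loads one batch of records into a warehouse-style table. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for a real data warehouse; the `source_orders` records, the `sales` table, and the `transform` and `load_batch` helpers are hypothetical names chosen for this example.

```python
import sqlite3

# Hypothetical records exported from a transactional system (assumed data).
source_orders = [
    {"order_id": 1, "region": " eu ", "amount_cents": 12000},
    {"order_id": 2, "region": "US", "amount_cents": 7550},
    {"order_id": 3, "region": "eu", "amount_cents": 3000},
]


def transform(order):
    """Clean the region code and convert cents to currency units."""
    region = order["region"].strip().upper()
    return (order["order_id"], region, order["amount_cents"] / 100)


def load_batch(conn, orders):
    """Insert one batch of transformed rows inside a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO sales (order_id, region, amount) VALUES (?, ?, ?)",
            [transform(o) for o in orders],
        )


# In-memory SQLite database standing in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
load_batch(conn, source_orders)

# An analyst-style aggregate query over the loaded data.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

In practice the batch would be scheduled at a regular cadence and the load step would target the warehouse's own bulk-loading mechanism rather than row-by-row inserts.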
| Characteristics | Data Warehouse | Transactional Database |
|---|---|---|
| Suitable workloads | Analytics, big data | Transaction processing |
| Types of operations | Optimized for batched write operations and reading high volumes of data to minimize I/O and maximize data throughput | Optimized for continuous write operations and high volumes of small read operations to maximize transaction throughput |
| Data normalization | Employs denormalized schemas like the Star schema and Snowflake schema | Employs highly normalized schemas, which are better suited to high transaction throughput requirements |
| Storage | Requires columnar or other specialized storage | Row-oriented databases that store whole rows in a physical block |
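To make the storage row of the comparison more concrete, here is a minimal sketch of why a columnar layout can reduce the amount of data read by an analytic query. It is a toy model, not how any particular warehouse is implemented: the sample `rows`, the `to_columnar` helper, and the "values touched" counters are all assumptions made for illustration.

```python
# Hypothetical sample table: each dict is one row, as a row store would keep it.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 75.5},
    {"order_id": 3, "region": "EU", "amount": 30.0},
]


def to_columnar(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def values_touched_row_store(rows):
    """A row store reads whole rows, so every field of every row is touched."""
    return sum(len(row) for row in rows)


def values_touched_column_store(columns, column):
    """A column store reads only the values of the requested column."""
    return len(columns[column])


columns = to_columnar(rows)

# Aggregate over a single column, as an analytic query typically does.
total_amount = sum(columns["amount"])
row_reads = values_touched_row_store(rows)
column_reads = values_touched_column_store(columns, "amount")
print(total_amount, row_reads, column_reads)
```

With three columns, summing `amount` touches every stored value in the row layout but only one third of them in the columnar layout; the gap widens as tables gain more columns.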
AWS allows you to take advantage of all of the core benefits associated with on-demand computing, such as access to seemingly limitless storage and compute capacity, and the ability to scale your system in parallel with the growing amount of data collected, stored, and queried, paying only for the resources you provision. Further, AWS offers a broad set of managed services that integrate seamlessly with each other so that you can quickly deploy an end-to-end analytics and data warehousing solution.
The following illustration shows the key steps of an end-to-end analytics process chain and the managed services available on AWS for each step:
Amazon Redshift is a fast, easy-to-use, fully managed data warehousing solution. It automates infrastructure provisioning and administrative tasks such as backups, replication, and patching. It integrates seamlessly with third-party BI and ETL tools, so you can get to your first report in just a few minutes. There is also no limit to the amount of data you can load and analyze. As your data grows, you don’t have to worry about expensive system upgrades or slow performance. Redshift is fast at any scale because it uses columnar storage and a range of other optimizations. Amazon Redshift is also cost-effective: you pay only for what you use. You can have an unlimited number of users doing unlimited analytics on all your data for just $1000 per terabyte per year.