Analytics on AWS

Fastest way to get answers from all your data to all your users
Scalable Data Lakes
Tens of thousands of customers run their data lakes on AWS. Setting up and managing a data lake typically involves many manual, time-consuming tasks. AWS Lake Formation automates these tasks so you can build and secure your data lake in days instead of months. For your data lake storage, Amazon S3 is the best place to build a data lake because of its unmatched 11 nines of durability and 99.99% availability; the best security, compliance, and audit capabilities, with object-level audit logging and access control; the most flexibility, with five storage tiers; and the lowest cost, with pricing that starts at less than $1 per TB per month.
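To make the durability and availability figures concrete, the arithmetic can be sketched as below. This is a back-of-the-envelope illustration, not an AWS formula: it assumes the "11 nines" figure is an annual, per-object durability with independent losses, and that the availability figure applies uniformly across a 365-day year. The function names and the 10-million-object workload are hypothetical.

```python
# Rough arithmetic for "11 nines of durability" and "99.99% availability".
# Assumptions (not from the source): durability is annual and per object,
# losses are independent, and a year has 365 days.

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_annual_downtime_minutes(availability: float) -> float:
    """Expected minutes per year a service is unavailable at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def expected_objects_lost_per_year(object_count: int, nines: int) -> float:
    """Expected objects lost per year when durability is `nines` nines (e.g. 11)."""
    annual_loss_probability = 10.0 ** -nines
    return object_count * annual_loss_probability


if __name__ == "__main__":
    downtime = expected_annual_downtime_minutes(0.9999)
    lost = expected_objects_lost_per_year(10_000_000, nines=11)
    print(f"Expected downtime: {downtime:.1f} minutes per year")
    print(f"Expected loss: {lost:.6f} objects per year "
          f"(about one object every {1 / lost:,.0f} years)")
```

Under these assumptions, 99.99% availability corresponds to a little under an hour of unavailability per year, and storing 10 million objects at 11 nines of durability corresponds to an expected loss of roughly one object every 10,000 years.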
Purpose-built Analytics Services
AWS gives you the broadest and deepest portfolio of purpose-built analytics services optimized for your unique analytics use cases. These services are all designed to be best in class, which means you never have to compromise on performance, scale, or cost when using them. For example, Amazon Redshift is 3x faster and at least 50% less expensive than other cloud data warehouses. Spark on Amazon EMR runs 1.7x faster than standard Apache Spark 3.0, and you can run petabyte-scale analysis at less than half the cost of traditional on-premises solutions.
Seamless Data Movement
As the data in your data lakes and purpose-built data stores continues to grow, you often need to move a portion of that data from one data store to another. AWS makes it easy for you to combine, move, and replicate data across multiple data stores and your data lake. For example, AWS Glue provides comprehensive data integration capabilities that make it easy to discover, prepare, and combine data for analytics, machine learning, and application development, while Amazon Redshift can easily query data in your S3 data lake. No other analytics provider makes it as easy for you to move your data, at scale, to where you need it the most.
Unified Governance
One of the most important pieces of a modern analytics architecture is the ability to authorize, manage, and audit access to data. This can be challenging, because managing security, access control, and audit trails across all of the data stores in your organization is complex, time-consuming, and error-prone. With capabilities like centralized access control and policies, and column-level filtering of data, no other analytics provider gives you the governance capability to manage access to all of your data across your data lake and your purpose-built data stores from a single place.
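As a conceptual illustration of centralized, column-level filtering, the sketch below keeps one policy in a single place and applies it to rows before they reach a reader. It is a plain-Python model of the idea only and does not use or represent the AWS Lake Formation API; the policy structure, principal names, table name, and the `filter_columns` helper are all hypothetical.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]

# Hypothetical central policy: principal -> table -> columns that principal may read.
POLICY: Dict[str, Dict[str, Set[str]]] = {
    "analyst": {"orders": {"order_id", "total"}},
    "auditor": {"orders": {"order_id", "total", "customer_email"}},
}


def filter_columns(principal: str, table: str, rows: List[Row]) -> List[Row]:
    """Return rows containing only the columns the principal is allowed to read."""
    allowed = POLICY.get(principal, {}).get(table)
    if not allowed:
        # No grant for this principal on this table: deny rather than return data.
        raise PermissionError(f"{principal!r} has no access to table {table!r}")
    return [{col: val for col, val in row.items() if col in allowed} for row in rows]


if __name__ == "__main__":
    orders = [{"order_id": 1, "total": 9.5, "customer_email": "a@example.com"}]
    print(filter_columns("analyst", "orders", orders))
    print(filter_columns("auditor", "orders", orders))
```

Because every read goes through the same policy table, changing a grant in one place changes what each principal sees across all callers, which is the property the paragraph above describes.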
Performant and Cost-effective
AWS is committed to providing the best performance at the lowest cost across all analytics services, and we are continually innovating to improve the price-performance of our services. In addition to industry-leading price-performance for analytics services, S3 Intelligent-Tiering saves you up to 70% on storage costs for data stored in your data lake, and Amazon EC2 provides access to an industry-leading choice of over 200 instance types, up to 100 Gbps of network bandwidth, and the ability to choose between On-Demand, Reserved, and Spot Instances.

AWS Analytics services

Category | Use cases | AWS service
Analytics | Interactive analytics | Amazon Athena: Query data in S3 using SQL.
Analytics | Big data processing | Amazon EMR: Hosted Hadoop framework.
Analytics | Data warehousing | Amazon Redshift: Fast, simple, cost-effective data warehousing.
Analytics | Real-time analytics | Amazon Kinesis: Analyze real-time video and data streams.
Analytics | Operational analytics | Amazon Elasticsearch Service: Run and scale Elasticsearch clusters.
Analytics | Dashboards and visualizations | Amazon QuickSight: Fast business analytics service.
Analytics | Visual data preparation | AWS Glue DataBrew: Clean and normalize data up to 80% faster.
Data movement | Real-time data movement | Amazon Managed Streaming for Apache Kafka (MSK): Fully managed, highly available, and secure Apache Kafka service.
Data movement | Real-time data movement | Amazon Kinesis Video Streams: Capture, process, and store video streams for analytics and machine learning.
Data movement | Real-time data movement | Amazon Kinesis Data Firehose: Prepare and load real-time data streams into data stores and analytics tools.
Data movement | Real-time data movement | Amazon Kinesis Data Streams: Collect streaming data, at scale, for real-time analytics.
Data lake | Object storage | Amazon S3: Object storage built to store and retrieve any amount of data from anywhere.
Data lake | Object storage | AWS Lake Formation: Build a secure data lake in days.
Data lake | Backup and archive | Amazon S3 Glacier: Low-cost archive storage in the cloud.
Data lake | Backup and archive | AWS Backup: Centralized backup across AWS services.
Data lake | Data catalog | AWS Glue: Prepare and load data.
Data lake | Data catalog | AWS Lake Formation: Build a secure data lake in days.
Data lake | Third-party data | AWS Data Exchange: Find and subscribe to third-party data in the cloud.
Predictive analytics and machine learning | Frameworks and interfaces | AWS Deep Learning AMIs: Deep learning on Amazon EC2.
Predictive analytics and machine learning | Platform services | Amazon SageMaker: Build, train, and deploy machine learning models at scale.

Use cases

Data warehousing

Run SQL and complex, analytic queries against structured and unstructured data in your data warehouse and data lake, without the need for unnecessary data movement.

Try Amazon Redshift »
Big data processing

Quickly and easily process vast amounts of data in your data lake or on-premises for data engineering, data science development, and collaboration.

Try Amazon EMR »
Real-time analytics

Collect, process, and analyze streaming data, and load data streams directly into your data lakes, data stores, and analytics services so you can respond in real time.

Try Amazon MSK » Try Amazon Kinesis »
Operational analytics

Search, explore, filter, aggregate, and visualize your data in near real time for application monitoring, log analytics, and clickstream analytics.

Try Amazon Elasticsearch Service »

Customers

"We built a 120TB data lake in Amazon S3, with 1500 different schemes and use AWS analytics services like Glue, Redshift, and Athena extensively. We couldn’t get these insights from a bunch of siloed databases and warehouses - we needed an S3 scale data lake."

- Bernardo Rodriguez
Chief Digital Officer, J.D. Power

Netflix
Chick-fil-A
3M Company
Georgia-Pacific
Pinterest
T-Mobile
Epic
Adobe
Pfizer
View all customers »

Additional resources

AWS Data Lab

Create tangible deliverables that accelerate your data and analytics modernization initiatives. AWS Data Lab is a four-day intensive engagement between your team of builders and AWS technical resources.

Learn more »

Newsletter

Want to stay in the loop on educational content, upcoming events, and other innovations from AWS Analytics?

Subscribe to the AWS Analytics Newsletter »