AWS Big Data Blog
Category: Analytics
How Comcast uses AWS to rapidly store and analyze large-scale telemetry data
This blog post is co-written by Russell Harlin from Comcast Corporation. Comcast Corporation creates incredible technology and entertainment that connects millions of people to the moments and experiences that matter most. At the core of this is Comcast’s high-speed data network, providing tens of millions of customers across the country with reliable internet connectivity. This […]
Use ML predictions over Amazon DynamoDB data with Amazon Athena ML
Today’s modern applications use multiple purpose-built database engines, including relational, key-value, document, and in-memory databases. This purpose-built approach improves the way applications use data by providing better performance and reducing cost. However, the approach raises some challenges for data teams that need to provide a holistic view on top of these database engines, and especially […]
Secure connectivity patterns to access Amazon MSK across AWS Regions
AWS customers often segment their workloads across accounts and Amazon Virtual Private Cloud (Amazon VPC) to streamline access management while being able to expand their footprint. As a result, in some scenarios you, as an AWS customer, need to make an Amazon Managed Streaming for Apache Kafka (Amazon MSK) cluster accessible to Apache Kafka clients […]
Effective data lakes using AWS Lake Formation, Part 2: Securing data lakes with row-level access control
Increasingly, customers are looking at data lakes as a core part of their strategy to democratize data access across the organization. Data lakes enable you to handle petabytes and exabytes of data coming from a multitude of sources in varying formats, and gives users the ability to access it from their choice of analytics and […]
Benchmark the performance of the new Auto WLM with adaptive concurrency in Amazon Redshift
With Amazon Redshift, you can run a complex mix of workloads on your data warehouse clusters. For example, frequent data loads run alongside business-critical dashboard queries and complex transformation jobs. We also see more and more data science and machine learning (ML) workloads. Each workload type has different resource needs and different service level agreements. […]
How GE Healthcare modernized their data platform using a Lake House Architecture
GE Healthcare (GEHC) operates as a subsidiary of General Electric. The company is headquartered in the US and serves customers in over 160 countries. As a leading global medical technology, diagnostics, and digital solutions innovator, GE Healthcare enables clinicians to make faster, more informed decisions through intelligent devices, data analytics, applications, and services, supported by […]
Visualize data using Apache Spark running on Amazon EMR with Amazon QuickSight
Organizations often need to process large volumes of data before serving to business stakeholders. In this blog, we will learn how to leverage Amazon EMR to process data using Apache Spark, the go-to platform for in-memory analytics of large data volume, and connect business intelligence (BI) tool Amazon QuickSight to serve data to end-users. QuickSight […]
Build a centralized granular access control to manage assets and data access in Amazon QuickSight
A large business intelligence (BI) project with many users and teams and sensitive information demands a multi-faceted security architecture. Such architecture should provide BI administrators and architects with the capability to minimize the amount of information accessible to users. For a straightforward solution to manage Amazon QuickSight user and asset access permissions, you can use […]
Calculated fields, level-aware aggregations, and evaluation order in Amazon QuickSight
Amazon QuickSight is a fast, cloud-native, serverless, business intelligence service that makes it easy to deliver insights to everyone. QuickSight has carefully designed concepts and features that enable analysis builders, such as QuickSight authors, to design content-rich, interactive, and dynamic dashboards to share with dashboard viewers. As authors build an analysis, QuickSight transforms, filters, and […]
Access Amazon Location Service from Amazon Redshift
Organizations typically store business and customer data in databases like Amazon Relational Database Service (Amazon RDS) and Amazon Redshift, and often want to enrich this data by integrating with external services. One such enrichment is to add spatial attributes such as location coordinates for an address. With the introduction of Amazon Location Service, you now […]