AWS Big Data Blog

Ingest streaming data into Amazon Elasticsearch Service within the privacy of your VPC with Amazon Kinesis Data Firehose

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. Today we are adding a new Amazon Kinesis Data Firehose feature to set up VPC delivery to your Amazon OpenSearch Service domain from Kinesis Data Firehose. If you have been managing a custom application on Amazon Kinesis Data Streams […]

Achieve finer-grained data security with column-level access control in Amazon Redshift

Amazon Redshift is the most popular cloud data warehouse because it provides fast insights at a low cost. Customers can confidently run mission-critical workloads, even in highly regulated industries, because Amazon Redshift comes with out-of-the-box security and compliance. The security features, combined with the ability to easily analyze data in place and […]

Speed up your ELT and BI queries with Amazon Redshift materialized views

The Amazon Redshift materialized views feature helps you achieve significantly faster query performance on repeated or predictable workloads, such as dashboard queries from business intelligence (BI) tools like Amazon QuickSight. It also speeds up and simplifies extract, load, and transform (ELT) data processing. You can use materialized views to store frequently used precomputations and […]

Query, visualize, and forecast TruFactor web session intelligence with AWS Data Exchange

This post showcases TruFactor Intelligence-as-a-Service data on AWS Data Exchange. TruFactor’s anonymization platform and proprietary AI ingest, filter, and transform more than 85 billion high-quality raw signals daily from wireless carriers, OEMs, and mobile apps into a unified phygital consumer graph across physical and digital dimensions. TruFactor intelligence is application-ready for use within any AWS analytics or ML service to power your models and applications running on AWS, with no additional processing required.

Build a Simplified ETL and Live Data Query Solution using Redshift Federated Query

You may have heard the saying that the best ETL is no ETL. Amazon Redshift now makes this possible with Federated Query. In its initial release, this feature lets you query data in Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL using Amazon Redshift external schemas. Federated Query also exposes the metadata from these source databases through system views and driver APIs, which allows business intelligence tools like Tableau and Amazon QuickSight to connect to Amazon Redshift and query data in PostgreSQL without having to make local copies.

Build a cloud-native network performance analytics solution on AWS for wireless service providers

This post demonstrates a serverless, cloud-based approach to building a network performance analytics solution with pay-per-use AWS services, which provide flexibility and performance while keeping costs under control. Without good network performance, you may struggle to face the challenges of real-time and low-latency services and the increase of the total […]

A public data lake for analysis of COVID-19 data

As the COVID-19 pandemic continues to threaten and take lives around the world, we must work together across organizations and scientific disciplines to fight this disease. Innumerable healthcare workers, medical researchers, scientists, and public health officials are already on the front lines caring for patients, searching for therapies, educating the public, and helping to set […]

Simplify your Spark dependency management with Docker in EMR 6.0.0

Apache Spark is a powerful data processing engine that gives data analyst and engineering teams easy-to-use APIs and tools to analyze their data, but it can be challenging for teams to manage their Python and R library dependencies. Installing every dependency that a job may need before it runs and dealing with library […]