AWS Big Data Blog

Category: Analytics

The following diagram illustrates the architecture of these multi-tenant storage strategies.

Implementing multi-tenant patterns in Amazon Redshift using data sharing

Software service providers offer subscription-based analytics capabilities in the cloud with Analytics as a Service (AaaS), and increasingly customers are turning to AaaS for business insights. A multi-tenant storage strategy allows the service providers to build a cost-effective architecture to meet increasing demand. Multi-tenancy means a single instance of software and its supporting infrastructure is […]

The following diagram shows the solution architecture for the Vertica custom connector when deployed to AWS.

Querying a Vertica data source in Amazon Athena using the Athena Federated Query SDK

The ability to query data and perform ad hoc analysis across multiple platforms and data stores with a single tool brings immense value to the big data analytical arena. As organizations build out data lakes with increasing volumes of data, there is a growing need to combine that data with large amounts of data in […]

In the following tree diagram, we’ve outlined what the bucket path may look like as logs are delivered to your S3 bucket

Automating AWS service logs table creation and querying them with Amazon Athena

I was working with a customer who was just getting started using AWS, and they wanted to understand how to query their AWS service logs that were being delivered to Amazon Simple Storage Service (Amazon S3). I introduced them to Amazon Athena, a serverless, interactive query service that allows you to easily analyze data in […]

Following a remote planning phase in which we defined our requirements and laid out the basic design.

How Baqend built a real-time web analytics platform using Amazon Kinesis Data Analytics for Apache Flink

August 30, 2023: Amazon Kinesis Data Analytics has been renamed to Amazon Managed Service for Apache Flink. Read the announcement in the AWS News Blog and learn more. This is a customer post written by the engineers from German startup Baqend and the AWS EMEA Prototyping Labs team. Baqend is one of the fastest-growing software […]

The following diagram shows the architecture EMX uses.

How EMX reduced data pipeline costs by 85% with Amazon Athena

This is a guest blog post by Gary Bouton and Louis Ashner from EMX. In their own words, “ENGINE Media Exchange (EMX) is a leading marketing technology company, leveraging a patented, end-to-end tech stack purpose-built to meet the demands of today’s digital marketplace. The company creates both open- and closed-loop solutions designed to unify advertisers, […]

Detecting anomalous values by invoking the Amazon Athena machine learning inference function

Amazon Athena has released a new feature that allows you to easily invoke machine learning (ML) models for inference directly from your SQL queries. Inference is the stage in which a trained model is used to infer and predict the testing samples and comprises a similar forward pass as training to predict the values. Unlike […]

The following diagram illustrates the workflow.

Orchestrating analytics jobs on Amazon EMR Notebooks using Amazon MWAA

May 2024: This post was reviewed and updated with a new dataset. In a previous post, we introduced the Amazon EMR notebook APIs, which allow you to programmatically run a notebook on Amazon EMR Studio (preview) without accessing the AWS web console. With the APIs, you can schedule running EMR notebooks with cron scripts, chain multiple notebooks, […]

The following diagram illustrates this architecture.

How Edmunds GovTech unifies data and analytics data for municipalities with Amazon QuickSight

This is a guest post from an Amazon QuickSight customer, Edmunds GovTech Over the past 30 years, Edmunds GovTech has grown to provide enterprise resource planning (ERP) solutions to thousands of East Coast municipalities. We also serve cities and towns in 25 other states. In this blog, I’ll talk about how we used Amazon QuickSight […]

Data monetization and customer experience optimization using telco data assets: Part 2

Part 1 of this series explains the importance of building and implementing a customer experience (CX) management and data monetization strategy for telecom service providers (TSPs), and the major challenges driving these initiatives. It also includes an AWS CloudFormation template to set up a demonstration of the solution using AWS services. It covers transforming and enriching […]