AWS Big Data Blog

Category: Analytics

Build a Concurrent Data Orchestration Pipeline Using Amazon EMR and Apache Livy

In this post, we explore orchestrating a Spark data pipeline on Amazon EMR using Apache Livy and Apache Airflow. We create a simple Airflow DAG to demonstrate how to run Spark jobs concurrently, and we see how Livy hides the complexity of submitting Spark jobs via REST while making optimal use of EMR resources.

Encrypt data in transit using a TLS custom certificate provider with Amazon EMR

Many enterprises have highly regulated policies around cloud security. Those policies might be even more restrictive for Amazon EMR where sensitive data is processed. EMR provides security configurations that allow you to set up encryption for data at rest stored on Amazon S3 and local Amazon EBS volumes. It also allows the setup of Transport […]

Analyze Amazon Connect records with Amazon Athena, AWS Glue, and Amazon QuickSight

In this blog post, we focus on how to get analytics out of the rich set of data published by Amazon Connect. We make use of an Amazon Connect data stream and create an end-to-end workflow to offer an analytical solution that can be customized based on need.
