AWS Big Data Blog

Tag: Zeppelin

Analyze Realtime Data from Amazon Kinesis Streams Using Zeppelin and Spark Streaming

Manjeet Chayel is a Solutions Architect with AWS. Streaming data is everywhere. This includes clickstream data, data from sensors, data emitted from billions of IoT devices, and more. Not surprisingly, data scientists want to analyze and explore these data streams in real time. This post shows you how you can use Spark Streaming to process […]

Running an External Zeppelin Instance using S3 Backed Notebooks with Spark on Amazon EMR

Dominic Murphy is an Enterprise Solution Architect with Amazon Web Services. Apache Zeppelin is an open source GUI that creates interactive and collaborative notebooks for data exploration using Spark. You can use Scala, Python, SQL (using Spark SQL), or HiveQL to manipulate data and quickly visualize results. Zeppelin notebooks can be shared among several users, […]

Building a Recommendation Engine with Spark ML on Amazon EMR using Zeppelin

Guy Ernest is a Solutions Architect with AWS. Many developers want to implement the famous Amazon model that was used to power the “People who bought this also bought these items” feature on Amazon.com. This model is based on a method called Collaborative Filtering. It takes items such as movies, books, and products that were […]
