Month in Review (January 2016)
January brought plenty for big data enthusiasts on the AWS Big Data Blog. Take a look!
Learn how to set up Zeppelin running “off-cluster” on a separate EC2 instance. You’ll be able to submit Spark jobs to an EMR cluster directly from your Zeppelin instance.
Work with key tools available in the Apache Spark application ecosystem for streaming analytics. This post covers how Spark Streaming, Spark SQL, and HiveServer2 can work together to deliver a data stream as a temporary table that you can query with SQL.
What makes outstanding business intelligence (BI)? It needs to be accurate and up to date, but these qualities alone won’t differentiate a solution. Perhaps a better measure is the reaction you get when your latest report or metric is released to the business. Good BI excites. This post shows how your Amazon Redshift data warehouse can be agile.
Customers have used Campanile to migrate petabytes of data from one account to another, run periodic sync jobs and large Amazon Glacier restores, enable server-side encryption (SSE), create indexes, and sync data before enabling cross-region replication (CRR).
From the Archive (June 11, 2015):
Use Amazon Machine Learning and Amazon Redshift to predict the likelihood that a specific user will click on a specific ad.