AWS Big Data Blog
Category: AWS Big Data
Will Spark Power the Data behind Precision Medicine?
Christopher Crosbie is a Healthcare and Life Science Solutions Architect with Amazon Web Services. This post was co-authored by Ujjwal Ratan, a Solutions Architect with Amazon Web Services. “And that’s the promise of precision medicine — delivering the right treatments, at the right time, every time to the right person.” (President Obama, 2015 State […]
Crunching Statistics at Scale with SparkR on Amazon EMR
Christopher Crosbie is a Healthcare and Life Science Solutions Architect with Amazon Web Services. This post is co-authored by Gopal Wunnava, a Senior Consultant with AWS Professional Services. SparkR is an R package that allows you to integrate complex statistical analysis with large datasets. In this blog post, we introduce you to running R with the […]
AWS Big Data Meetup March 31 in San Francisco: Intro to SparkR and breakout discussions
Join and RSVP! Guest Speaker: Cory Dolphin from Twitter. Learn how Answers, Fabric’s realtime analytics product, processes billions of events in real time using Twitter’s new stream processing engine, Heron. Cory will explain some of the challenges the team faced while scaling Storm, and how Heron has helped them fly faster. Specifically, Cory will describe how Heron’s […]
Anomaly Detection Using PySpark, Hive, and Hue on Amazon EMR
Veronika Megler, Ph.D., is a Senior Consultant with AWS Professional Services. We are surrounded by more and more sensors – some of which we’re not even consciously aware of. As sensors become cheaper and easier to connect, they create an increasing flood of data that’s getting cheaper and easier to store and process. However, sensor readings […]
Import Zeppelin notes from GitHub or JSON in Zeppelin 0.5.6 on Amazon EMR
Jonathan Fritz is a Senior Product Manager for Amazon EMR. Many Amazon EMR customers use Zeppelin to create interactive notebooks to run workloads with Spark using Scala, Python, and SQL. These customers have found Amazon EMR to be a great platform for running Zeppelin because of strong integration with other AWS services and the ability […]
AWS Big Data Meetup March 22 in Seattle: Intro to SparkR and breakout discussions
Join and RSVP! AWS Speaker: Christopher Crosbie, Healthcare and Life Sciences Partner Solutions Architect for Amazon Web Services. For a long time, R users have sliced and diced their computational problems into smaller pieces so that they can run them in smaller chunks. But what if you want to compute on a huge dataframe with […]
Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams
This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: “We are one of the world’s largest social platforms for giving that’s helped 26.1 million registered users in 196 countries raise $3.8 billion for over 27,000 good causes.” Introduction: As more devices, sensors […]
AWS Partner Post Spotlight: Attunity
Partners are a vital part of the AWS ecosystem, and AWS Partners have made important contributions to the AWS Big Data Blog. This month’s Partner Post Spotlight is on Attunity, who co-authored the post “Using Attunity CloudBeam at UMUC to Replicate Data to Amazon RDS and Amazon Redshift.” Their post explains how UMUC used Attunity […]
Big Data Website Gets a Big Makeover at AWS
Jorge A. Lopez is responsible for Big Data Solutions Marketing at AWS. The big data ecosystem is evolving at a tremendous pace, giving rise to a plethora of tools, use cases, and applications. The new AWS Big Data website is now the ideal starting point to learn about new and existing capabilities, and the services […]
Analyze Your Data on Amazon DynamoDB with Apache Spark
Manjeet Chayel is a Solutions Architect with AWS. Every day, tons of customer data is generated, such as website logs, gaming data, advertising data, and streaming videos. Many companies capture this information as it’s generated and process it in real time to understand their customers. Amazon DynamoDB is a fast and flexible NoSQL database service […]