AWS Big Data Blog

Monitor Your Application for Processing DynamoDB Streams

Asmita Barve-Karandikar is an SDE with DynamoDB

DynamoDB Streams can handle requests at scale, but you risk losing stream records if your processing application lags: DynamoDB Streams records are unavailable after 24 hours. Therefore, when you maintain multiregion read replicas of your DynamoDB table, you might be afraid of losing data. In this post, I […]

Writing SQL on Streaming Data with Amazon Kinesis Analytics – Part 1

Ryan Nienhuis is a Senior Product Manager for Amazon Kinesis

This is the first of two AWS Big Data blog posts on Writing SQL on Streaming Data with Amazon Kinesis Analytics. In this post, I provide an overview of streaming data and key concepts like the basics of streaming SQL, and complete a walkthrough using […]

Building and Deploying Custom Applications with Apache Bigtop and Amazon EMR

Hernan Vivani is a Hadoop Systems Engineer for Amazon Web Services

When you launch a cluster, Amazon EMR lets you choose applications that will run on your cluster. But what if you want to deploy your own custom application? This post shows you how to build a custom application for EMR for Apache Bigtop-based releases 4.x and greater. EMR […]

Readmission Prediction Through Patient Risk Stratification Using Amazon Machine Learning

Ujjwal Ratan is a Solutions Architect with Amazon Web Services

The Hospital Readmissions Reduction Program (HRRP) was included as part of the Affordable Care Act to improve quality of care and lower healthcare spending. A patient visit to a hospital may be counted as a readmission if the patient in question is admitted to a […]

Month in Review: July 2016

by Derek Young

July was a busy month of big data solutions on the Big Data Blog. The month started with our most popular story yet, Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE. It was a great post to start a spectacular month. Take a look at our summaries below. Learn, comment, and share. […]

Use Spark 2.0, Hive 2.1 on Tez, and the latest from the Hadoop ecosystem on Amazon EMR release 5.0

Jonathan Fritz is a Senior Product Manager for Amazon EMR

We are excited to launch Amazon EMR release 5.0 today, giving customers the latest versions of 16 supported open-source applications in the big data ecosystem, including new major versions of Spark and Hive. Almost exactly a year ago, we shipped release 4.0, which brought significant […]

Installing and Running JobServer for Apache Spark on Amazon EMR

Derek Graeber is a senior consultant in big data analytics for AWS Professional Services

Working with customers who are running Apache Spark on Amazon EMR, I often run into the scenario where data loaded into a SparkContext can and should be shared across multiple use cases. They ask a very valid question: “Once I load the […]

Learn about Amazon Redshift in our new Data Warehousing on AWS Class

by Janna Pellegrino

As our customers look to use their data to drive their missions forward, finding a way to simply and cost-effectively use analytics is increasingly important. Training and Certification now offers Data Warehousing on AWS, a new training course to help you learn how to leverage the AWS Cloud as a platform for data warehousing solutions. […]

Process Large DynamoDB Streams Using Multiple Amazon Kinesis Client Library (KCL) Workers

Asmita Barve-Karandikar is an SDE with DynamoDB

Imagine you own a popular mobile health app, with millions of users worldwide, that continuously records new information. It sends over one million updates per second to its master data store and needs the updates to be relayed to various replicas across different regions in real time. […]

AWS re:Invent 2016 Registration is Now Open

by Andy Werth

Register now for the fifth annual AWS re:Invent, the largest gathering of the global cloud computing community. Join us in Las Vegas for opportunities to connect, collaborate, and learn about AWS solutions. There will be many opportunities for developers and data scientists working in big data to sharpen their skills and learn what’s coming next […]
