AWS Big Data Blog

Join us This Week at Strata + Hadoop World in New York City

by Jorge A. Lopez

We’re back in Manhattan for the Strata + Hadoop World conference, from Tuesday, September 27, through September 29. Come see the AWS Big Data team at Booth #738, where big data experts will be happy to answer your questions, hear about your specific requirements, and help you with your big data initiatives. Catch a presentation. Get technical details […]

Amazon EMR-DynamoDB Connector Repository on AWSLabs GitHub

Mike Grimes is a Software Development Engineer with Amazon EMR. Amazon Web Services is excited to announce that the Amazon EMR-DynamoDB Connector is now open source. The EMR-DynamoDB Connector is a set of libraries that lets you access data stored in DynamoDB with Spark, Hadoop MapReduce, and Hive jobs. These libraries are currently shipped with EMR […]

Encrypt Data At-Rest and In-Flight on Amazon EMR with Security Configurations

Customers running analytics, stream processing, machine learning, and ETL workloads on personally identifiable information, health information, and financial data have strict requirements for encryption of data at rest and in transit. The Apache Spark and Hadoop ecosystems lend themselves to these big data use cases, and customers have asked us to provide a quick and easy way […]

Real-time Clickstream Anomaly Detection with Amazon Kinesis Analytics

Chris Marshall is a Solutions Architect for Amazon Web Services. Analyzing web log traffic to gain insights that drive business decisions has historically been performed using batch processing. While effective, this approach results in delayed responses to emerging trends and user activities. There are solutions to deal with processing data in real time using streaming […]

Writing SQL on Streaming Data with Amazon Kinesis Analytics – Part 2

Ryan Nienhuis is a Senior Product Manager for Amazon Kinesis. This is the second of two AWS Big Data posts on Writing SQL on Streaming Data with Amazon Kinesis Analytics. In the last post, I provided an overview of streaming data and key concepts, such as the basics of streaming SQL, and completed a walkthrough […]

Month in Review: August 2016

by Andy Werth

Another month of big data solutions on the Big Data Blog. Take a look at our summaries below and learn, comment, and share. Thanks for reading! Readmission Prediction Through Patient Risk Stratification Using Amazon Machine Learning: With this post, learn how to apply advanced analytics concepts like pattern analysis and machine learning to do risk […]

Integrating IoT Events into Your Analytic Platform

Veronika Megler, Ph.D., is a Senior Consultant with AWS Professional Services. “We have a fleet of vehicles, with GPS and a bunch of other sensors,” said Bob, the VP at a delivery company. “Today they send their update ‘breadcrumbs’ to another IoT service. We’re planning to have them send their breadcrumbs to AWS IoT instead; […]

Processing VPC Flow Logs with Amazon EMR

Michael Wallman is a senior consultant with AWS ProServ. It’s easy to understand network patterns in small AWS deployments where software stacks are well defined and managed. But as teams and usage grow, it gets harder to understand which systems communicate with each other, and on what ports. This often results in overly permissive security […]

Data Lake Ingestion: Automatically Partition Hive External Tables with AWS

Songzhi Liu is a Professional Services Consultant with AWS. The data lake concept has become more and more popular among enterprise customers because it collects data from different sources and stores it where it can be easily combined, governed, and accessed. On the AWS cloud, Amazon S3 is a good candidate for a data lake […]

Seattle AWS Big Data Meetup: Building Smart Healthcare Applications on AWS

by Andy Werth

Please join us at the upcoming Seattle AWS Big Data Meetup on Wednesday, August 31. The topic is “Building Smart Healthcare Apps on AWS,” with a spotlight on machine learning. Join now and get details on the Meetup page. Lisa McFerrin, PhD, Bioinformatics is a Project Manager for Seattle Translational Tumor Research at Fred Hutchinson […]