AWS Big Data Blog
Implement Serverless Log Analytics Using Amazon Kinesis Analytics
Applications log a large amount of data that—when analyzed in real time—provides significant insight into your applications. Real-time log analysis can be used to ensure security compliance, troubleshoot operation events, identify application usage patterns, and much more. Ingesting and analyzing this data in real time can be accomplished by using a variety of open source […]
Month in Review: January 2017
Another month of big data solutions on the Big Data Blog! Take a look at our summaries below and learn, comment, and share. Thank you for reading! NEW POSTS Decreasing Game Churn: How Upopa used ironSource Atom and Amazon ML to Engage Users Ever wondered what it takes to keep a user from leaving your […]
Secure Amazon EMR with Encryption
In the last few years, there has been a rapid rise in enterprises adopting the Apache Hadoop ecosystem for critical workloads that process sensitive or highly confidential data. Due to the highly critical nature of the workloads, the enterprises implement certain organization/industry wide policies and certain regulatory or compliance policies. Such policy requirements are designed […]
Run Mixed Workloads with Amazon Redshift Workload Management
This blog post has been translated into Japanese. Mixed workloads run batch and interactive workloads (short-running and long-running queries or reports) concurrently to support business needs or demand. Typically, managing and configuring mixed workloads requires a thorough understanding of access patterns, how the system resources are being used and performance requirements. It’s common for mixed […]
Converging Data Silos to Amazon Redshift Using AWS DMS
Organizations often grow organically, and so does their data in individual silos. Such systems are often powered by traditional RDBMSs and they grow orthogonally in size and features. To gain intelligence across heterogeneous data sources, you have to join the data sets. However, this imposes new challenges, as joining data over dblinks or into a […]
Call for Papers! DEEM: 1st Workshop on Data Management for End-to-End Machine Learning
Amazon and Matroid will hold the first workshop on Data Management for End-to-End Machine Learning (DEEM) on May 14th, 2017 in conjunction with the premier systems conference SIGMOD/PODS 2017 in Raleigh, North Carolina. For more details about the workshop focus, see Challenges and opportunities in machine learning below. DEEM brings together researchers and practitioners at […]
Create a Healthcare Data Hub with AWS and Mirth Connect
As anyone visiting their doctor may have noticed, gone are the days of physicians recording their notes on paper. Physicians are more likely to enter the exam room with a laptop than with paper and pen. This change is the byproduct of efforts to improve patient outcomes, increase efficiency, and drive population health. Pushing for […]
Decreasing Game Churn: How Upopa used ironSource Atom and Amazon ML to Engage Users
This is a guest post by Tom Talpir, Software Developer at ironSource. ironSource is an Advanced AWS Partner Network (APN) Technology Partner and an AWS Big Data Competency Partner. Ever wondered what it takes to keep a user from leaving your game or application after all the hard work you put in? Wouldn’t it be great […]
Month in Review: December 2016
Another month of big data solutions on the Big Data Blog. Take a look at our summaries below and learn, comment, and share. Thank you for reading! Implementing Authorization and Auditing using Apache Ranger on Amazon EMR Apache Ranger is a framework to enable, monitor, and manage comprehensive data security across the Hadoop platform. Features […]
Powering Amazon Redshift Analytics with Apache Spark and Amazon Machine Learning
Air travel can be stressful due to the many factors that are simply out of airline passengers’ control. As passengers, we want to minimize this stress as much as we can. We can do this by using past data to make predictions about how likely a flight will be delayed based on the time of […]