In the healthcare field, data comes in all shapes and sizes. Despite efforts to standardize terminology, some concepts (e.g., blood glucose) are still often depicted in different ways. This post demonstrates how to convert an openly available dataset called MIMIC-III, which consists of de-identified medical data for about 40,000 patients, into an open source data […]
In this post, I discuss some of the practical challenges faced during the implementation of the data lake for Eliza and the corresponding details of the ways we solved these issues with AWS. The challenges we faced involved the variety of data and a need for a common view of the data.
In this post, I walk you through an architectural approach and a sample implementation showing how to collect, process, and analyze data for event-driven applications in AWS.
In this post, I show you how to gain valuable insight into your network by using Amazon EMR and Amazon VPC Flow Logs. The walkthrough implements a pattern often found in network equipment called ‘Top Talkers’, an ordered list of the heaviest network users, but the model can also be used for many other types of network analysis.
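As a rough illustration of the "Top Talkers" idea, the sketch below totals bytes per source address and returns an ordered list of the heaviest senders. It is plain Python rather than the Amazon EMR job the post describes, and it assumes records in the default VPC Flow Logs format (14 space-separated fields, with `srcaddr` in the fourth position and `bytes` in the tenth). The function name `top_talkers` and the sample records are made up for the example.

```python
from collections import Counter


def top_talkers(lines, n=3):
    """Return the n source addresses that sent the most bytes, heaviest first."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Assumption: default-format flow log records have 14 fields.
        if len(fields) != 14:
            continue
        srcaddr, byte_count = fields[3], fields[9]
        # Header rows and NODATA/SKIPDATA records have no numeric byte count.
        if not byte_count.isdigit():
            continue
        totals[srcaddr] += int(byte_count)
    return totals.most_common(n)


# Hypothetical sample records, for illustration only.
SAMPLE = [
    "2 123456789010 eni-0a1b2c3d 10.0.0.5 10.0.1.9 443 49152 6 10 8400 1470000000 1470000060 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d 10.0.0.7 10.0.1.9 22 49153 6 4 500 1470000000 1470000060 REJECT OK",
    "2 123456789010 eni-0a1b2c3d 10.0.0.5 10.0.1.8 443 49154 6 2 600 1470000060 1470000120 ACCEPT OK",
    "2 123456789010 eni-0a1b2c3d - - - - - - - 1470000120 1470000180 - NODATA",
]

if __name__ == "__main__":
    for address, total in top_talkers(SAMPLE, n=2):
        print(address, total)
```

The same grouping can be keyed on `dstaddr` or on a port field to answer other questions about the traffic.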
In this post, I introduce a simple data ingestion and preparation framework based on AWS Lambda, Amazon DynamoDB, and Apache Hive on EMR for data from different sources landing in S3. This solution lets Hive pick up new partitions as data is loaded into S3 because Hive by itself cannot detect new partitions as data lands.
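One way to picture the partition problem: when a new object lands in S3, something has to tell Hive that the partition exists. The sketch below builds the `ALTER TABLE ... ADD IF NOT EXISTS PARTITION` statement for a given object key. It is not the post's framework; it assumes keys laid out in Hive's `name=value` directory style, and the function name, table name, bucket, and key are hypothetical.

```python
def partition_ddl(table, bucket, key):
    """Build a HiveQL statement that registers the partition an S3 key belongs to.

    Assumes Hive-style key segments such as 'dt=2016-01-01/hour=05'.
    Returns None when the key contains no partition segments.
    """
    directories = key.split("/")[:-1]  # drop the object name itself
    spec = [segment.split("=", 1) for segment in directories if "=" in segment]
    if not spec:
        return None
    columns = ", ".join(f"{name}='{value}'" for name, value in spec)
    prefix = "/".join(directories)
    location = f"s3://{bucket}/{prefix}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({columns}) LOCATION '{location}'"
    )


if __name__ == "__main__":
    # Hypothetical bucket and key, for illustration only.
    print(partition_ddl("weblogs", "my-bucket",
                        "raw/weblogs/dt=2016-01-01/hour=05/part-0000.gz"))
```

A function like this could be called from whatever component reacts to new S3 objects, with the resulting statement then submitted to Hive. `IF NOT EXISTS` keeps the statement safe to run again when several files land in the same partition.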
NOTE: Amazon Redshift now supports creating an automatic snapshot schedule using the snapshot scheduler. For more information, please review this “What’s New” post. Ian Meyers is a Solutions Architecture Senior Manager with AWS. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data […]
Babu Elumalai is a Solutions Architect with AWS. Organizations are generating tremendous amounts of data, and they increasingly need tools and systems that help them use this data to make decisions. The data has both immediate value (for example, trying to understand how a new promotion is performing in real time) and historic value (trying […]
Bob Strahan is a Senior Consultant with AWS Professional Services. AWS Lambda has emerged as an excellent compute platform for modern microservices architectures, driving dramatic advancements in flexibility, resilience, scale, and cost effectiveness. Many customers can take advantage of this transformational technology from within their existing relational database applications. In this post, we explore how to […]
This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: “We are one of the world’s largest social platforms for giving that’s helped 26.1 million registered users in 196 countries raise $3.8 billion for over 27,000 good causes.” Introduction: As more devices, sensors […]
September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. Assaf Mentzer is a Senior Consultant for AWS Professional Services. In the spirit of the U.S. presidential election of 2016, in this post I use Twitter public streams to analyze the candidates’ performance, both Republican and Democrat, in a near […]