AWS Big Data Blog

Real-time in-memory OLTP and Analytics with Apache Ignite on AWS

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Babu Elumalai is a Solutions Architect with AWS. Organizations are generating tremendous amounts of data, and they increasingly need tools and systems that help them use this data to make decisions. The […]

From SQL to Microservices: Integrating AWS Lambda with Relational Databases

Bob Strahan is a Senior Consultant with AWS Professional Services. AWS Lambda has emerged as an excellent compute platform for modern microservices architectures, driving dramatic advancements in flexibility, resilience, scale, and cost effectiveness. Many customers can take advantage of this transformational technology from within their existing relational database applications. In this post, we explore how to […]

Month in Review: April 2016

Lots to see on the Big Data Blog in April! Please take a look at the summaries below for something that catches your interest. Exploring Geospatial Intelligence using SparkR on Amazon EMR: The number of data sources that use location, such as smartphones and sensory devices used in IoT (Internet of Things), is expanding rapidly. […]

Sharpen your Skill Set with Apache Spark on the AWS Big Data Blog

The AWS Big Data Blog has a large community of authors who are passionate about Apache Spark and who regularly publish content that helps customers use Spark to build real-world solutions. You’ll see content on a variety of topics, including deep-dives on Spark’s internals, building Spark Streaming applications, creating machine learning pipelines using MLlib, and ways […]

Combine NoSQL and Massively Parallel Analytics Using Apache HBase and Apache Hive on Amazon EMR

Ben Snively is a Solutions Architect with AWS. Jon Fritz, a Senior Product Manager for Amazon EMR, co-authored this post. With today’s launch of Amazon EMR release 4.6, you can now quickly and easily provision a cluster with Apache HBase 1.2. Apache HBase is a massively scalable, distributed big data store in the Apache Hadoop ecosystem. It is […]

AWS at Strata+Hadoop 2016: Building a Scalable Architecture on AWS to Process Streaming Data

Gone are the days when big data was confined to batch processing. To remain competitive, companies must be able to analyze real-time data streams in areas such as video streaming, real-time recommendation engines, preventive maintenance, and fraud detection applications. Last month, Siva Raghupathy and Manjeet Chayel presented “Building a scalable architecture for processing streaming data […]