AWS Big Data Blog

Tag: AWS Lambda

Data Lake Ingestion: Automatically Partition Hive External Tables with AWS

In this post, I introduce a simple data ingestion and preparation framework based on AWS Lambda, Amazon DynamoDB, and Apache Hive on EMR for data from different sources landing in S3. This solution lets Hive pick up new partitions as data is loaded into S3 because Hive by itself cannot detect new partitions as data lands.

Simplify Management of Amazon Redshift Snapshots using AWS Lambda

NOTE: Amazon Redshift now supports creating an automatic snapshot schedule using the snapshot scheduler. For more information, please review this “What’s New” post. ———————————- Ian Meyers is a Solutions Architecture Senior Manager with AWS Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data […]

Real-time in-memory OLTP and Analytics with Apache Ignite on AWS

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Babu Elumalai is a Solutions Architect with AWS Organizations are generating tremendous amounts of data, and they increasingly need tools and systems that help them use this data to make decisions. The […]

From SQL to Microservices: Integrating AWS Lambda with Relational Databases

Bob Strahan is a Senior Consultant with AWS Professional Services AWS Lambda has emerged as excellent compute platform for modern microservices architecture, driving dramatic advancements in flexibility, resilience, scale and cost effectiveness. Many customers can take advantage of this transformational technology from within their existing relational database applications. In this post, we explore how to […]

Analyze a Time Series in Real Time with AWS Lambda, Amazon Kinesis and Amazon DynamoDB Streams

This is a guest post by Richard Freeman, Ph.D., a solutions architect and data scientist at JustGiving. JustGiving in their own words: “We are one of the world’s largest social platforms for giving that’s helped 26.1 million registered users in 196 countries raise $3.8 billion for over 27,000 good causes.” Introduction As more devices, sensors […]

Building a Near Real-Time Discovery Platform with AWS

September 8, 2021: Amazon Elasticsearch Service has been renamed to Amazon OpenSearch Service. See details. February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Assaf Mentzer is a Senior Consultant for AWS Professional Services In the spirit of the U.S presidential […]

Persist Streaming Data to Amazon S3 using Amazon Data Firehose and AWS Lambda

February 9, 2024: Amazon Kinesis Data Firehose has been renamed to Amazon Data Firehose. Read the AWS What’s New post to learn more. Streaming data analytics is becoming main-stream (pun intended) in large enterprises as the technology stacks have become more user-friendly to implement. For example, Spark-Streaming connected to an Amazon Kinesis stream is a […]

Building and Maintaining an Amazon S3 Metadata Index without Servers

Mike Deck is a Solutions Architect with AWS Amazon S3 is a simple key-based object store whose scalability and low cost make it ideal for storing large datasets. Its design enables S3 to provide excellent performance for storing and retrieving objects based on a known key. Finding objects based on other attributes, however, requires doing […]