Uncategorized | AWS Big Data Blog

Month in Review: February 2016

Lots for big data enthusiasts in February on the AWS Big Data Blog. Take a look! Submitting User Applications with spark-submit: Learn how to set spark-submit flags to control the memory and compute resources available to your application submitted to Spark running on EMR. Learn when to use the maximizeResourceAllocation configuration option and dynamic allocation […]

Join us at the AWS Big Data Meetup on February 24th in Palo Alto

Join and RSVP! Guest Speaker: Cory Dolphin from Twitter. Learn how Answers, Fabric’s realtime analytics product, processes billions of events in realtime using Twitter’s new stream processing engine, Heron. Cory will explain some of the challenges the team faced while scaling Storm, and how Heron has helped them fly faster. Specifically, Cory will describe how Heron’s […]

Month in Review (January 2016)

Lots for big data enthusiasts in January on the AWS Big Data Blog. Take a look! Running an External Zeppelin Instance using S3 Backed Notebooks with Spark on Amazon EMR: Learn how to set up Zeppelin running “off-cluster” on a separate EC2 instance. You’ll be able to submit Spark jobs to an EMR cluster directly […]

Join us at the AWS Big Data Meetup on January 13th in San Francisco

The AWS Big Data Meetup brings Big Data developers and enthusiasts together to discuss Big Data solutions with each other and AWS team members. At the event you will hear speakers from AWS and the wider community who are pushing the boundaries of Big Data. We are committed to maintaining a technical focus, and invite […]

Month in Review: December 2015

Lots for big data enthusiasts in December on the AWS Big Data Blog. Take a look! Top 10 Performance Tuning Techniques for Amazon Redshift: “This post takes you through the most common issues that customers find as they adopt Amazon Redshift, and gives you concrete guidance on how to address each.” Migrating Metadata when Encrypting […]

Big Data AWS Training Course Gets Big Update

Michael Stroh is Communications Manager for AWS Training & Certification. AWS offers a number of in-depth technical training courses, which we’re regularly updating in response to student feedback and changes to the AWS platform. Today I want to tell you about some exciting changes to Big Data on AWS, our most comprehensive training course on […]

Videos now available for AWS re:Invent 2015 Big Data Analytics sessions

For those of you who were able to attend AWS re:Invent 2015 last week or watched sessions through our live stream, thanks for participating in the conference. We hope you left feeling inspired to tackle your big data projects with tools in the AWS ecosystem and partner solutions. Also, we were excited for our customers […]

AWS Big Data Analytics Sessions at re:Invent 2015

Roy Ben-Alta is a Business Development Manager – Big Data & Analytics. If you will be attending re:Invent 2015 in Las Vegas next week, you know that you’ll have many opportunities to learn more about Big Data & Analytics on AWS at the conference–and this year we have over 20 sessions! The following breakout sessions compose this […]

Building and Running a Recommendation Engine at Any Scale

This is a guest post by K Young, co-founder and CEO of Mortar Data. Mortar Data is an AWS advanced technology partner. UPDATE: MortarData has transitioned into Datadog and has wound down the public Mortar service. The tutorial below no longer works. To learn more about building a recommendation engine on AWS, see Building a […]

The Impact of Using Latest-Generation Instances for Your Amazon EMR Job

Nick Corbett is a Big Data Consultant for AWS Professional Services. Amazon Elastic MapReduce (Amazon EMR) is a web service that makes it easy to process large amounts of data efficiently. Amazon EMR uses the popular open source framework Apache Hadoop combined with several other AWS products to do such tasks as web indexing, data […]
