AWS Big Data Blog

Month in Review: July 2016

by Derek Young

July was a busy month of big data solutions on the Big Data Blog. The month started with our most popular story yet, Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE. It was a great post to start a spectacular month. Take a look at our summaries below. Learn, comment, and share. […]

Use Spark 2.0, Hive 2.1 on Tez, and the latest from the Hadoop ecosystem on Amazon EMR release 5.0

Jonathan Fritz is a Senior Product Manager for Amazon EMR. We are excited to launch Amazon EMR release 5.0 today, giving customers the latest versions of 16 supported open-source applications in the big data ecosystem, including new major versions of Spark and Hive. Almost exactly a year ago, we shipped release 4.0, which brought significant […]

Learn about Amazon Redshift in our new Data Warehousing on AWS Class

by Janna Pellegrino

As our customers look to use their data to drive their missions forward, finding a way to simply and cost-effectively use analytics is increasingly important. Training and Certification now offers Data Warehousing on AWS, a new training course to help you learn how to leverage the AWS Cloud as a platform for data warehousing solutions. […]

Process Large DynamoDB Streams Using Multiple Amazon Kinesis Client Library (KCL) Workers

Asmita Barve-Karandikar is an SDE with DynamoDB. Imagine you own a popular mobile health app, with millions of users worldwide, that continuously records new information. It sends over one million updates per second to its master data store and needs the updates to be relayed to various replicas across different regions in real time. […]

AWS re:Invent 2016 Registration is Now Open

by Andy Werth

Register now for the fifth annual AWS re:Invent, the largest gathering of the global cloud computing community. Join us in Las Vegas for opportunities to connect, collaborate, and learn about AWS solutions. There will be many opportunities for developers and data scientists working in big data to sharpen their skills and learn what’s coming next […]

Simplify Management of Amazon Redshift Snapshots using AWS Lambda

Ian Meyers is a Solutions Architecture Senior Manager with AWS. Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. A cluster is automatically backed up to Amazon S3 by default, and three automatic snapshots of the cluster […]
