Read-replica support for Apache HBase on S3 and Apache Flink 1.3.0 in Amazon EMR release 5.7.0

Posted on: Jul 14, 2017

You can now create read-replica Apache HBase clusters that point to the same underlying HBase tables in Amazon S3 on Amazon EMR release 5.7.0. Apache HBase is a distributed, non-relational database built for random, strictly consistent, real-time access to tables with billions of rows and millions of columns. By using read-replicas, you can increase availability by creating HBase clusters in different Amazon EC2 Availability Zones that read from the same dataset in Amazon S3.

Additionally, you can now use new versions of Apache Flink (1.3.0), Apache Zeppelin (0.7.2), and Apache Phoenix (4.11.0) on Amazon EMR release 5.7.0. Flink 1.3.0 adds new features, including enhancements to state handling and recovery, extended support for aggregations in the Table API & SQL, and improvements to the Amazon Kinesis consumer and Elasticsearch connector. Zeppelin 0.7.2 and Phoenix 4.11.0 contain bug fixes and minor improvements.

You can create an Amazon EMR cluster with release 5.7.0 by choosing release label “emr-5.7.0” from the AWS Management Console, AWS CLI, or SDK. To configure HBase as a read-replica, enable the read-replica setting in the HBase configuration and set the HBase root directory to the Amazon S3 location used by your read/write cluster. Please visit the Amazon EMR documentation for more information about release 5.7.0, Flink 1.3.0, Zeppelin 0.7.2, Phoenix 4.11.0, and HBase on S3 read-replicas.
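As a rough illustration of the read-replica configuration described above, the following Python sketch builds the configuration list you might supply when creating the cluster. The bucket path and the helper name read_replica_configurations are hypothetical. The property names (hbase.emr.storageMode, hbase.emr.readreplica.enabled, hbase.rootdir) reflect our understanding of the EMR configuration classifications for HBase on S3 and should be checked against the Amazon EMR documentation before use.

```python
import json


def read_replica_configurations(root_dir: str) -> list:
    """Build EMR configuration objects for an HBase read-replica cluster.

    root_dir is the S3 location already used as the HBase root directory
    by the read/write cluster (a hypothetical path in the example below).
    """
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")
    return [
        {
            # Point the replica at the same root directory as the primary.
            "Classification": "hbase-site",
            "Properties": {"hbase.rootdir": root_dir},
        },
        {
            # Assumed property names: store HBase data in S3 and
            # mark this cluster as a read-replica.
            "Classification": "hbase",
            "Properties": {
                "hbase.emr.storageMode": "s3",
                "hbase.emr.readreplica.enabled": "true",
            },
        },
    ]


# Hypothetical bucket and prefix, for illustration only.
configurations = read_replica_configurations("s3://my-example-bucket/hbase-root")
print(json.dumps(configurations, indent=2))
```

The printed JSON could be saved to a file and passed to the AWS CLI with the --configurations option of aws emr create-cluster, or the list could be supplied as the Configurations argument when launching a cluster through an SDK, together with release label emr-5.7.0 and HBase in the application list.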

Amazon EMR release 5.7.0 is available in all regions where Amazon EMR is supported.