Reviews from AWS Marketplace
0 AWS reviews
5 star: 0
4 star: 0
3 star: 0
2 star: 0
1 star: 0
External reviews
External reviews are not included in the AWS star rating for the product.
Big Data made easy with Cloudera
What do you like best about the product?
Cloudera spares you from needing to know the depths of a Hadoop cluster when you are just starting out with analytics. It is very easy to deploy servers with HDFS already installed and connected, which gives you out-of-the-box support for running Hive or Pig queries.
What do you dislike about the product?
There is very little to dislike about Cloudera until it fails for some (usually) unknown reason. Crashes are not common, but they are very annoying when they happen during a very long job. Failures are common in distributed systems, however, so the product needs better feedback from its logs and error logs.
What problems is the product solving and how is that benefiting you?
If you want to deploy a relatively small cluster to cut batch processing from days to hours, or to cut queries over a large data set from hours to minutes, then using a familiar SQL-like language such as Hive is perfect.
Recommendations to others considering the product:
Cloudera and Hortonworks are the products to start with if you don't know absolutely everything that is involved in a Hadoop cluster. Cloudera is more popular and has been around for a long time, but Hortonworks is also a good option. Cloudera certifications are valuable in the industry (although Hortonworks certifications are cheaper). It depends on your focus: if you prefer a better-known and more widely used product for getting started with a Big Data cluster and intend to get certified, try one of the virtual machines that Cloudera offers to start playing with. If you simply want to learn a bit about "Big Data" for yourself, I would give Hortonworks a try, as all of its architecture is also open to read and learn from.
I love Cloudera
What do you like best about the product?
Cloudera is the LEADING managed Hadoop platform. Previous organizations I've worked at have used Hortonworks or Pivotal, but Cloudera emerged as the leading stack. Their platform offers everything – Hadoop, HDFS, YARN, MapReduce, Pig, Hive, HBase, Oozie, Impala, Spark, Sqoop, and so much more. Their release cycle is pretty quick, and their developers and specialists are experts and contributors to the Apache open source projects. I love Cloudera. Take a look at their blog!
What do you dislike about the product?
The release cycle for CDH is a bit slow, but it's a trade-off for stability and security. For example, Spark runs out of the box as an app on YARN. But Spark's rapid release cycle fell into Cloudera's longer cycle, and we wound up with Spark 1.3 for months while 1.4 and 1.5 were released.
The training is a bit repetitive & shallow at times, and could go more in depth, especially with so many talented developers on staff.
What problems is the product solving and how is that benefiting you?
We have a couple of data lakes that we migrated and built out on Cloudera. Primarily business intelligence and analytics.
Cloudera Easy Install
What do you like best about the product?
Easy to install and simple to configure. The dashboard and API dashboard are easy to use. A lot of services are available.
What do you dislike about the product?
Sometimes the manager host needs a lot of RAM. The Cloudera client is sometimes unresponsive.
What problems is the product solving and how is that benefiting you?
Hadoop management made easy
Recommendations to others considering the product:
Yes, it's better than other vendors, IMHO.
Cloudera is a great Hadoop environment
What do you like best about the product?
Ease of use and setup. You can easily diagnose problems with the cluster through the GUI. Spark integration as well as HBase is great for our needs. Kafka integration has helped us test a new feature in our application, thereby increasing performance. All the metrics related to the environment give us a good idea of our cluster's health, thereby reducing surprises.
What do you dislike about the product?
A couple of small setup steps that are tightly integrated with our old system were hard to figure out. A little more documentation is needed. SparkSQL is not fully supported, and there is no way for us to upgrade an individual component ourselves. The change in the location of libraries between the provided VirtualBox image and the production environment caused small issues. It would be better if the VM replicated the production environment as closely as possible.
What problems is the product solving and how is that benefiting you?
Storing large amounts of data and processing it in a reasonable time frame. Being able to use our old code base with small changes to libraries as opposed to rewriting our entire code. The option of having MapReduce or YARN is great, as our code does not work with YARN. Installing Cloudera 5.4 reduced our time to deployment from 4 days to 1, which is great.
Recommendations to others considering the product:
Add Storm integration and support for SparkSQL and its newer components. Allow users to upgrade individual components to match the open source releases. The result may not be compatible, but it would give us users a chance to fix and learn in the meantime.