Cloudera on AWS

Cloudera

Reviews from AWS customer

2 AWS reviews

External reviews

46 reviews

External reviews are not included in the AWS star rating for the product.


    reviewer2776239

Uses handwritten notes and voice files to perform text analytics and gain real-time insights

  • November 12, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Cloudera Data Platform is dealing with large volumes of data and primarily handling unstructured data by combining structured and unstructured data on this platform.

I use Cloudera Data Platform primarily for handling unstructured data in a healthcare company, where there are many research notes that are handwritten. One use case is PDF extraction: we store the PDFs on the platform and then extract their data there. The second use case is voice files, mainly from call centers. We store the voice files, convert voice to text, and then perform text analytics on the result.

How has it helped my organization?

Cloudera Data Platform has impacted my organization positively in many ways. I belong to the service industry, and many of my customers are using this platform. Those customers are predominantly from the banking domain.

It has made things better for those banking customers by providing all of the above.

What is most valuable?

The best features Cloudera Data Platform offers become clear when you compare the latest version with the earlier one: there is significant change. It is much more end-user friendly, and many user interfaces have been added. A single pane for administration makes things easy from a data engineering perspective. You can do more through drag and drop in the UI, and there are good dashboards for understanding the performance of your platform. Ready-made metrics are available, so administration is very easy from a data platform standpoint. Other data principles, including lineage and data security, also come out of the box with this platform.

The dashboards and drag-and-drop tools have helped my team because the metrics are already available. As an administrator of the platform, certain key metrics are already available as a dropdown. You can select and pick whichever you want, and based on that, you will be able to see memory utilization and disk utilization. Based on that, you can make a decision such as whether you need to do some performance tweaks or add more hardware to your clusters. Those sorts of insights and early alerts help you to do that. That is also another feature available within the platform. From the administration perspective, it is really helpful for the data administrator or a platform administrator.
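The decision described above (tune performance or add hardware, based on memory and disk utilization) can be sketched roughly as follows. This is only an illustration: the thresholds, the `NodeMetrics` type, and the `recommend` function are hypothetical and are not part of Cloudera Manager or any Cloudera API.

```python
from dataclasses import dataclass

# Hypothetical alert thresholds; real values depend on the cluster and workload.
MEMORY_ALERT_PCT = 85.0
DISK_ALERT_PCT = 80.0


@dataclass
class NodeMetrics:
    host: str
    memory_used_pct: float
    disk_used_pct: float


def recommend(nodes):
    """Return a coarse recommendation from average utilization across nodes."""
    if not nodes:
        return "no data"
    avg_mem = sum(n.memory_used_pct for n in nodes) / len(nodes)
    avg_disk = sum(n.disk_used_pct for n in nodes) / len(nodes)
    # Full disks cannot be tuned away, so disk pressure is checked first.
    if avg_disk >= DISK_ALERT_PCT:
        return "add hardware"
    if avg_mem >= MEMORY_ALERT_PCT:
        return "tune performance"
    return "ok"


cluster = [
    NodeMetrics("node-1", memory_used_pct=91.0, disk_used_pct=55.0),
    NodeMetrics("node-2", memory_used_pct=88.0, disk_used_pct=60.0),
]
print(recommend(cluster))
```

In practice the inputs would come from whatever metrics the administrator selects in the dashboard; the sketch only shows the shape of the decision.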

What needs improvement?

Cloudera Data Platform can be improved in several areas. I recently attended their roadmap session. The limitations they have identified involve moving data from on-premises to cloud with a single-pane view, and better lineage. They have also made some recent acquisitions to overcome their product limitations. They are on the right track: they are analyzing the product themselves, identifying its weaknesses, and then using mergers or acquisitions to overcome them.

I would like to add that, beyond the platform itself, they should provide more training to systems integrators so that they can have a more ready workforce to use Cloudera Data Platform.

For how long have I used the solution?

I have been using Cloudera Data Platform for almost ten years.

What do I think about the stability of the solution?

Cloudera Data Platform is pretty stable in my experience; there are not any downtime or reliability issues.

In large environments or with growing data needs, I have seen hundred-node clusters running fine, dealing with petabytes of data. I have not seen any issues. When we add nodes or rebalance nodes, there are sometimes issues, but they are usually dealt with. It is not a major issue per se; it is more about how you deal with that particular situation.

What do I think about the scalability of the solution?

I manage scalability with Cloudera Data Platform, and the features currently available are better than before. There is a cloud burst feature: if the on-premises capacity is not sufficient at a point in time, you can run that Spark job in the cloud instead. This recently added feature allows better scalability and also lets you make use of a cloud provider's ecosystem.

How are customer service and support?

My experience with customer support for Cloudera Data Platform is good. I have not majorly dealt with them, but whatever I have heard from my various team members indicates that customer support is good. They provide good pre-sale support and overall handholding to identify the right use case and technologies. Overall, they provide good support from the company.

Customer support is responsive and knowledgeable, but since I have not actually dealt with them extensively, I will not be able to provide a rating on a scale of one to ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I did not use a different solution before Cloudera Data Platform; we used to use only structured databases for our data warehousing solution. It is a move from only structured data or on-premises appliance-based solutions to Cloudera Data Platform.

What was our ROI?

I have seen a return on investment. Licensing costs were saved when we decommissioned some of the older data platforms and moved to this platform. Time has also been saved by implementing the right data quality solution, because the team used to spend more time correcting data. It also cuts the time business analysts used to spend searching through Excel files to understand data definitions; that information is now readily available as part of the data catalog. Such things save both license cost and money, since the time business analysts spent finding information about the data dictionary is no longer needed.

What's my experience with pricing, setup cost, and licensing?

My experience with pricing, setup cost, and licensing varies based on your relationship and the size of the cluster. So far, I would say that it is competitive pricing that we have received.

Which other solutions did I evaluate?

Before choosing Cloudera Data Platform, I did evaluate other options. Earlier, it was Cloudera, Hortonworks, and MapR, but nowadays, with Hortonworks and Cloudera merging, it is predominantly Cloudera Data Platform for big data on-premises.

What other advice do I have?

My advice for others looking into using Cloudera Data Platform is to consider the fact that it has been around for more than a decade, making it a very stable solution. If you want to go with an on-premises solution, this is the way you should go. If you are looking for a solution to deal with large volume, variety, and velocity of data, including real-time data processing, this is the platform you should select. Their use case manual lists various use cases by industry, some of which will be more suitable for the customer's industry; Cloudera can also help you select the right services or the right product stack. It is all good, and you should leverage their professional services to get a better and more suitable product architecture. I would rate this product an eight out of ten.


    reviewer2774499

Has improved data analysis workflows and centralized sensitive information but needs faster adoption of latest technologies

  • November 04, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Cloudera Data Platform is to host in-house data which is sensitive and very guard-railed for compliance.

A quick specific example of the type of sensitive data I'm hosting is related to personally identifiable information as well as data which is financial and transactional in nature, and Cloudera Data Platform helps with compliance by giving us a uniform approach to this. We have implemented the compliance-based entitlements using toolkits provided by Cloudera Data Platform and have our own implementation for each region where we are hosting the data.

What is most valuable?

The most unique thing about my setup with Cloudera Data Platform is having loads and loads of high-volume data, even for specific geographies, and using Cloudera Data Platform has helped us modernize that in a way where the tech is simpler for us.

The best features Cloudera Data Platform offers include their Kafka and Spark offerings, which we are using majorly, along with the Sqoop offering and a bit of Airflow here and there. The standout offering we have used from them is the Spark engine and the Impala engine to query our data, making the Impala cluster the best thing we have used from them.

Using the Spark and Impala engines makes my daily work easier and more efficient because Impala gives us a way to easily do analysis on the data, which simplifies the work of a business analyst as well as a PM when they are doing the initial analysis before the actual development begins for Spark, helping reduce the overall development cycle time.

Cloudera Data Platform has impacted my organization positively by providing cost-saving benefits, which is the North Star because of which we have shifted to it. We had data distributed across many platforms before starting this, and now the entire data strategy is designed around Cloudera Data Platform because it is very simple and very configurable.

What needs improvement?

A major drawback that I see with Cloudera Data Platform is that it tends to be one or two years behind the latest open-source releases of the Apache toolkits such as Spark, Hive, and everything else. This relates to the guardrails of following compliance, which is the main driving factor, but it is something that could be done better on Cloudera Data Platform's side.

For how long have I used the solution?

I have been using Cloudera Data Platform for the last five years.

What do I think about the stability of the solution?

Cloudera Data Platform is stable most of the time.

What do I think about the scalability of the solution?

The scalability of Cloudera Data Platform is the best part about it. You can just add nodes, and the horizontal scaling is very fast; vertical scaling takes a bit of time, but that works as well.

How are customer service and support?

The customer support for Cloudera Data Platform is okay. I would not say they are the best in terms of technical ability; however, they are able to address most of the issues.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Before choosing Cloudera Data Platform, we were using an IBM Netezza solution, which was working for the last ten years, but the costing was getting pretty high as we were scaling.

What was our ROI?

The major improvement we have is the consolidation of all the data models that we had, as now we have everything designed for Cloudera Data Platform, which saves a lot of time when we are doing exchanges between various teams.

Which other solutions did I evaluate?

I was not part of the exercise to evaluate other options before choosing Cloudera Data Platform, so I cannot comment on that.

What other advice do I have?

I would like to add that we are using Kafka in some cases across geographies as well, if the compliances allow us, and it has helped us maintain data pipelines more easily.

While I do not have exact numbers, I can tell you that we are processing more than a million data points every day on Cloudera Data Platform.

I don't have anything else to add about needed improvements regarding support, documentation, or other features.

My advice for others looking into using Cloudera Data Platform is to start with using the cloud and see if you can stay on the cloud if your domain allows, because being on the cloud gives you faster adoption to new technologies as well as distributed technical implementation, which provides more stability and flexibility.

I would rate this product a seven out of ten.


    Sajid Mehmood

Have managed data services efficiently while ensuring fast performance and reliability

  • October 30, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Cloudera Data Platform is administration; I am a certified administrator. In my daily work I manage Cloudera Data Platform as a whole at a telco company. I am responsible for its services, which are currently up and running, and I handle the daily administrative tasks.

What is most valuable?

In my experience, the best features Cloudera Data Platform offers are that all the services provided are excellent.

A particular service that stands out to me in Cloudera Data Platform is the performance, which runs very fast. I also find very good features in data security, data reliability, and data lineage.

Cloudera Data Platform's Manager UI and other UIs are very useful and helpful for managing operations.

Cloudera Data Platform has positively impacted my organization, as it comes in very handy when working with big data and handling large files.

What needs improvement?

I think Cloudera Data Platform is good enough to run now, and I do not see areas for significant improvement.

I wish Cloudera Data Platform would add a service, apart from Ozone, to handle small files with faster performance, so as not to use Ozone or add extra hard disk capacity to the cluster.

For how long have I used the solution?

I have been using Cloudera Data Platform for approximately five years.

What do I think about the stability of the solution?

Cloudera Data Platform is very stable in my experience.

What do I think about the scalability of the solution?

The scalability of Cloudera Data Platform is very good in public cloud. However, it is not as scalable on an on-premises private cloud, where scaling adds considerable cost.

How are customer service and support?

I have interacted with the customer support team extensively, and they are very useful and helpful in resolving issues. I would rate the customer support of Cloudera Data Platform ten out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Before choosing Cloudera Data Platform, my organization was using Teradata, and we did not evaluate other options.


    Review4321

Has supported multi-source data integration and enabled real-time analytics across hybrid environments

  • October 27, 2025
  • Review from a verified AWS customer

What is our primary use case?

The main use case for Cloudera Data Platform is to support a multi-source system with a multi-data structure. We have streaming services, Kafka services, RDBMS systems, and semi-structured data in the form of CSV and JSON files where we used to have everything in place and centralized.

Cloudera Data Platform also supports a hybrid data warehouse, which is similar to a relational database management system: business users can run analytical queries, such as a SELECT * query. Cloudera Data Platform also supports PySpark, where a user can create a DataFrame and then transform and load the data to get insights.

What is most valuable?

The best features of Cloudera Data Platform are that it supports hybrid types of environments, real-time streaming analytics, secure data and governance, machine learning and AI workloads, data warehousing and BI, and edge-to-edge AI use cases.

In the hybrid environment, we can have a private cloud as well as a public cloud, which helps us enable both types of workloads. We have data that keeps coming through a pipeline, and then we just ingest our data. The data engineer transforms and loads it to a data lake, which is Amazon S3. Once the data is ready, it's on the downstream, and it's available for the consumer end to consume the data.

The most important feature of Cloudera Data Platform is Ranger, which provides a granular level of security. It allows you to apply column-level security and decide which columns you want to expose to the consumer, not just access at the table level.
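As a conceptual illustration of column-level versus table-level access, the sketch below filters a row down to the columns a role is allowed to see. It does not use Ranger's policy format or API; the `COLUMN_POLICY` mapping, the role names, and the `visible_columns` helper are all hypothetical.

```python
# Hypothetical policy: role -> set of columns that role may read.
# This only illustrates the idea; it is not Ranger's policy model.
COLUMN_POLICY = {
    "analyst": {"customer_id", "country"},
    "auditor": {"customer_id", "country", "account_balance"},
}


def visible_columns(row, role):
    """Return only the columns of `row` that `role` is allowed to read."""
    allowed = COLUMN_POLICY.get(role, set())  # unknown roles see nothing
    return {col: val for col, val in row.items() if col in allowed}


row = {"customer_id": 7, "country": "DE", "account_balance": 1250.0}
print(visible_columns(row, "analyst"))
print(visible_columns(row, "auditor"))
```

A table-level rule would be all-or-nothing for the whole row; the column-level rule lets the same table serve consumers with different entitlements.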

Cloudera Data Platform has a great impact on my organization as it supports the business demand and business requirements, making me happy with the business use case. It depends on what the business demands and the business use case, which allows for an evaluation of what the business wants. Based on that, they can make a decision on where to go and where to migrate a workload.

What needs improvement?

I would definitely want to see more on the invention part of Cloudera Data Platform to provide a full-fledged AI and ML workload, as AI is supported currently, but I'm interested in having ML and LLM also supported in a full-fledged manner.

For how long have I used the solution?

I have been working in the current field for almost six to eight years.

What do I think about the stability of the solution?

Cloudera Data Platform is stable.

What do I think about the scalability of the solution?

Cloudera Data Platform's scalability is very nice, as you can have multiple workloads and even have multiple clusters with different CDP runtimes. You just have to define the business requirement in the configuration, and based on usage, it automatically scales up and scales down.

How are customer service and support?

Customer support for Cloudera Data Platform is very good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We had been using Cloudera Distribution for Hadoop (CDH). The CDH product was available on-premises only, so we migrated from on-premises to the cloud to take advantage of cloud compute.

How was the initial setup?

The experience with pricing, setup cost, and licensing is very good. The cloud service provider has a built-in tool to analyze which zone and region to use. Since the services have costs associated with them, this lets us determine which region is the most suitable and the cheapest.

What was our ROI?

In terms of ROI, we definitely have seen a return on investment. Due to security, we cannot disclose the value, but we have definitely seen an ROI.

What's my experience with pricing, setup cost, and licensing?

The experience with pricing, setup cost, and licensing is very good.

Which other solutions did I evaluate?

I did not evaluate other options before choosing Cloudera Data Platform.

What other advice do I have?

I would rate Cloudera Data Platform an eight out of ten because it's excellent in terms of the product, its deliverability, its support, and its use cases. It might differ for different industries depending on what each industry wants, but overall, it has a good impression, and I'm happy with the work relationship with Cloudera technical support.

If someone is looking for a hybrid environment or a cloud environment, they can definitely consider reviewing Cloudera Data Platform. They can look at all the aspects, as the Cloudera Data Platform ecosystem provides Apache Hive, HBase, Kafka, NiFi, Solr, and Knox, which they can review based on their business use case.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    T Sarwar

Has enabled efficient big data processing and querying but remains complex to manage and configure

  • October 24, 2025
  • Review provided by PeerSpot

What is our primary use case?

We are using Cloudera Data Platform to migrate and run our ETL processes, transferring data from multiple RDBMS to a data lake for analysis purposes. The current organization I work for is a top bank with a data lake of more than one petabyte.

Cloudera Data Platform is a perfect tool to manage such vast amounts of big data, store it properly, query it, and move it from one end to another.

What is most valuable?

The most useful feature I currently use from Cloudera Data Platform is the Hue tool, which provides a web-based utility. Users don't need network access approval when using on-premises internal access. Additionally, Spark and Impala are the most useful tools that I have used from Cloudera Data Platform.

What needs improvement?

Cloudera Data Platform should use fewer tools and remove the complexity between them. It should make it easier for the end user to change the configuration and understand it better.

The UI tool for jobs in Cloudera Data Platform can be improved to provide a clear picture of ETL jobs and detailed, consolidated graphs to monitor Spark-based Hue jobs.

For how long have I used the solution?

I have been working on big data platforms for the last five years, starting in 2020. Initially, I worked on the Hortonworks platform for two to three years. Since Cloudera and Hortonworks merged into a single platform, Cloudera Data Platform, I have been working on CDP for the last two years.

What do I think about the stability of the solution?

We face downtime and reliability issues many times a week with Cloudera Data Platform because it is a very complex system and all configurations are managed by the end user. Sometimes the end user is not experienced or does not have all the expertise related to Cloudera specifically, making it very difficult to manage properly.

What do I think about the scalability of the solution?

For scalability, I would rate Cloudera Data Platform nine out of 10. We periodically have requirements to add resources or servers, and we find it very useful from a scalability perspective.

How are customer service and support?

The customer support from Cloudera is good when we receive support from non-Indian representatives. When support comes from Indian representatives, we receive level one support only.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Previously we used Hive as the query engine, but since installing Impala, we have seen huge performance improvements with Cloudera Data Platform. The majority of users are using Impala instead of Hive because Impala uses MPP (massively parallel processing) technology.

Which other solutions did I evaluate?

If the only requirement is to have an on-premises system without other options, then Cloudera Data Platform is the best option available. However, if cloud is an option, I would prefer cloud over an on-premises Cloudera system.

What other advice do I have?

We are currently using Cloudera Data Platform with specific tools: Ranger to manage access-related items, HDFS to store files, Hive and Impala to access them, Hue as a query editor, and Spark for ETL execution.

It is a very complex system compared to cloud technology, which is much simpler. Due to this complexity, I rate Cloudera Data Platform six out of 10.


    Mohammad_Ahmad

Has improved resource efficiency and lowered costs but still lacks full AI workload support

  • October 16, 2025
  • Review from a verified AWS customer

What is our primary use case?

My main use case for Cloudera Data Platform is for data analytics and AI workload.

We have different data sources where the data comes in tabular format or CSV, as semi-structured, structured, or unstructured data, and as Kafka streaming messages. We store it, then process and transform it, apply the business logic, and make the data ready for the consumer to consume.

What is most valuable?

Cloudera Data Platform offers excellent architectures in terms of decoupling the storage layer from the compute. It is flexible in terms of scaling to your storage account or compute. Additionally, we have different streaming services as part of the ecosystem, and they have added Ranger for security controls, which is a valuable feature.

Decoupling storage from compute has helped my team significantly. Before using Cloudera Data Platform, we were using Cloudera Distribution for Hadoop (CDH), where we had to have on-premises virtual machines or Linux boxes to add to the cluster, which required lots of effort. We had a defined maximum authorized storage per system; for example, one computer could have a maximum of 8 TB, and scaling up to add more compute to the cluster was very challenging. In the current Cloudera Data Platform, the backend storage is a data lake that auto-scales, so we don't have to add more storage. In terms of security, we used to use Sentry in traditional CDH, but in Cloudera Data Platform, Ranger provides a more granular level of security, allowing us to manage who can access data at different levels, such as the table level or the column level.

Streaming services are provided by NiFi, which is one of the best ecosystems for streaming and ETL support.

Cloudera Data Platform has positively impacted our organization by reducing overall manual intervention, requiring fewer efforts and resources to build a big data cluster compared to traditional methods. It is also cost-effective and more stable than the traditional ways of handling big data workload.

In terms of resources, we have reduced from ten resources to four or five resources, making it an effective reduction in manual effort. Regarding cost saving, since we are in the cloud, we are saving significant money compared to maintaining infrastructure on-premises.

What needs improvement?

Cloudera Data Platform could improve by innovating more in terms of full-fledged support for AI workloads, enriching machine learning or LLM, as there haven't been updates in that aspect over the last one and a half years.

For how long have I used the solution?

I have been using Cloudera Data Platform for almost four years.

What do I think about the scalability of the solution?

Cloudera Data Platform's scalability is very good.

How are customer service and support?

Customer support is good. However, having a common chat channel between firms and service providers would make communication faster and more efficient.

How would you rate customer service and support?

Neutral

What other advice do I have?

My advice to others looking into using Cloudera Data Platform is that if they are looking for big data workloads on the cloud where they can do analysis and achieve cost savings and resource reductions, it is definitely a good use case. It can vary based on business needs, but it is a good option for big data workloads.

I rated Cloudera Data Platform a six out of ten because I wish that it would keep up with market trends and release AI technology and AI-enabled workloads. Sometimes we struggle to get support, and having a common chat channel between firms and service providers would make communication and support more effective, especially in production.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Ciro Porzio

Manages large-scale data ingestion and transformation while improving job performance in hybrid environments

  • October 10, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use cases for Cloudera Data Platform are monitoring HDFS and SQL queries in Impala, troubleshooting errors in Spark-based YARN applications, and controlling the data flows between Informatica and Cloudera that transport data from Oracle and MongoDB databases into CDP (Impala and HDFS).

For monitoring HDFS, I use Cloudera Data Platform, specifically Cloudera Manager, to analyze small files in HDFS and reduce their number, which shortens the duration of the jobs that read those files and their date partitions.
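A small-file analysis of this kind can be sketched as a count of undersized files per directory. The sketch assumes a file listing of `(path, size_in_bytes)` pairs has already been exported by some means; the function name, the sample paths, and the choice of threshold are illustrative assumptions, not Cloudera Manager output. The threshold uses 128 MiB, the default HDFS block size in recent Hadoop releases.

```python
from collections import Counter
import posixpath

# Assumed threshold: files smaller than one default HDFS block (128 MiB).
SMALL_FILE_BYTES = 128 * 1024 * 1024


def small_files_per_dir(listing, threshold=SMALL_FILE_BYTES):
    """Count files below `threshold` bytes, grouped by parent directory."""
    counts = Counter()
    for path, size in listing:
        if size < threshold:
            counts[posixpath.dirname(path)] += 1
    return counts


# Hypothetical listing of (path, size in bytes) pairs.
listing = [
    ("/data/events/dt=2025-10-01/part-0000", 4_096),
    ("/data/events/dt=2025-10-01/part-0001", 8_192),
    ("/data/events/dt=2025-10-02/part-0000", 512 * 1024 * 1024),
]
for directory, count in small_files_per_dir(listing).most_common():
    print(directory, count)
```

Directories with the highest counts are the natural candidates for compaction, since every small file adds NameNode metadata and per-file overhead to the jobs that read the partition.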

I mainly use Cloudera Data Platform as part of a large-scale data processing and analytics pipeline in a hybrid cloud environment, primarily on Azure, which involves managing the YARN cluster, monitoring workloads, troubleshooting performance issues, and integrating data ingestion and transformation processes from various enterprise systems. We leverage CDP for its scalability, security, and strong integration with Looker, Informatica, Hive, and Spark.

How has it helped my organization?

Cloudera Data Platform (CDP) has helped our organization improve data management consistency and scalability across multiple environments. The unified control plane and centralized governance have reduced operational overhead and made it easier to manage workloads between on-premise and cloud environments.

We’ve also seen clear benefits in resource optimization — auto-scaling and workload isolation features have allowed better use of infrastructure, while tools like Cloudera Manager and Workload XM improved monitoring and troubleshooting efficiency.

That said, there’s still room for improvement in integration speed and UI responsiveness, especially when managing large clusters or hybrid deployments.

What is most valuable?

In my opinion, the best features of Cloudera Data Platform are its strong integration, scalability, and unified management capabilities. What stands out the most are Cloudera Manager and SDX, which provide centralized control for governance, security, and data lineage across multiple sources, simplifying operations significantly. Finally, the YARN and Spark resource management in CDP is robust and efficient, which is essential for handling heavy data transformation workloads at scale.

Cloudera Data Platform has positively impacted my organization by providing a single storage point in HDFS for a lot of data from various databases. With Hive or Impala, it is possible to read and integrate data across all the other platforms, making it a great platform for us to hold the data and create integrations.

What needs improvement?

I don't have any challenges or areas I think could use enhancement.

For how long have I used the solution?

I have been using Cloudera Data Platform for one year, and I have four years of experience with the previous version of Cloudera Data Platform.

What was our ROI?

A specific example of the positive impact of Cloudera Data Platform is the clearly saved time and improved performance, which is the main result of it. The costs are increasing at the start of the project, but after securing, they are reduced, and the most significant benefit is the availability of data from governance and management.

What other advice do I have?

For the centralized governance of Spark management, we use a dashboard on SAS or Power BI to integrate the data that is stored in HDFS.

My advice to others looking into using Cloudera Data Platform is that it's a great product to save time and reduce costs in the long term.

On a scale of one to ten, I rate Cloudera Data Platform a nine.


    reviewer2763942

Has improved data accessibility and control but still needs better innovation for AI and machine learning

  • October 08, 2025
  • Review provided by PeerSpot

What is our primary use case?

My main use case for Cloudera Data Platform is data analytics and AI.

For data analytics and AI in my day-to-day work, we have a multi-source system where the data keeps coming from different source systems: from RDBMS in tabular format, as semi-structured data, or as streaming data from Kafka. We process and store data in the backend ADLS, then apply business rule logic to create a golden table, which is published for business or end users who consume the data for analytics. Some AI engineers develop or run Python code or LLMs against that data to gain insights.

What is most valuable?

The feature I love most about Cloudera Data Platform is its integration with Ranger services. Ranger is more flexible than Sentry, the authorization component in Cloudera's previous distribution, which makes it more reliable and allows access control at a more granular level.

The Ranger integration makes it more flexible and reliable for me by allowing control over data access, specifying who can access at what level, such as table level, masking, or data layer level. This is crucial for managing all data inside the farm.
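
To make the idea of table-level access plus masking concrete, here is a minimal sketch in plain Python. It does not use the Ranger API; the `Policy` class, the `read_row` helper, and the sample table and column names are all hypothetical, chosen only to show how the two levels of control combine.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy: which tables a user may read and which columns are masked."""
    user: str
    tables: set = field(default_factory=set)
    masked_columns: set = field(default_factory=set)


def read_row(policy: Policy, table: str, row: dict) -> dict:
    """Return the row as the user may see it, or raise if the table is off limits."""
    if table not in policy.tables:  # table-level check
        raise PermissionError(f"{policy.user} may not read {table}")
    # Column masking: hide the value but keep the column visible.
    return {
        col: ("****" if col in policy.masked_columns else value)
        for col, value in row.items()
    }


analyst = Policy(user="analyst", tables={"patients"}, masked_columns={"ssn"})
visible = read_row(analyst, "patients", {"name": "Ann", "ssn": "123-45-6789"})
```

Here the analyst can read the `patients` table but sees `****` in place of the `ssn` value, and any attempt to read a table outside the policy raises `PermissionError`.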

In terms of integration, it is very easy with Cloudera Data Platform. We just hook it up since it comes with the package when we install the CDP runtime, allowing us to select the ecosystem we want in our farm depending on our use cases. It is not a standalone installation requirement; it is an easy job. Scalability and flexibility are very good.

What needs improvement?

From a holistic view in the market, I have not seen enough innovation in Cloudera Data Platform, particularly in support for machine learning. It supports it, but not to a robust extent compared to other tech providers, such as Databricks, which are more flexible and in tune with current trends in AI and machine learning. I wish Cloudera would innovate and keep pace with market demands.

Regarding the user interface of Cloudera Data Platform, I have not faced any challenges, though we definitely look forward to innovation to support varied data models and scalability.

For how long have I used the solution?

I have been using Cloudera Data Platform for almost four years.

What do I think about the stability of the solution?

Cloudera Data Platform is generally stable; however, we occasionally face minor network connectivity issues as confirmed by the vendor. Sometimes a node goes down, but it automatically returns to a healthy state.

What do I think about the scalability of the solution?

Cloudera Data Platform has positively impacted my organization by eliminating challenges we faced with CDH, which had not been supported for a cloud journey. When adding scalability, such as horizontal scalability to our existing cluster, the process was time-consuming and required upfront costs for procuring servers. In contrast, CDP allows for easy, mostly automated scalability where I can schedule job workflows, fine-tune system resource metrics, and add nodes with just a click.

How are customer service and support?

Customer support depends on the case severity, but from my experience, it is great. Cloudera support is timely and responsive, adhering to the SLAs they provide.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Previously, we used Cloudera's Hadoop distribution, known as CDH, which was on-premises and required more manual effort across multiple teams; setting up a cluster took almost a month. We switched primarily for cost-effectiveness, flexibility, and the reduced setup time.

What about the implementation team?

A solution architect from the vendor helps us resolve any ongoing issues such as bugs or vulnerabilities, and we appreciate the flexibility of the cloud journey.

What was our ROI?

In terms of return on investment, I see great changes in operational effectiveness measured by RTO when comparing on-premises solutions with cloud solutions. The difference is notable.

What's my experience with pricing, setup cost, and licensing?

I have not been involved in cost negotiation overall, but we find Cloudera Data Platform to be cost-effective. We work with Cloudera to secure one- or two-year licenses upfront for discounts.

Which other solutions did I evaluate?

We evaluated Databricks three years ago, but it was not up to market standards in feature support at that time, particularly lacking an account console, which was introduced afterward. We have seen clients migrating from Cloudera to Databricks since the rollout of that console.

What other advice do I have?

My advice for those considering Cloudera Data Platform is to evaluate their business use case and budget, as these two factors are crucial. If the organization does not need advanced features such as LLM or machine learning, Cloudera Data Platform may be suitable. However, based on the current market, if ranking Databricks and Cloudera, I would place Databricks first and Cloudera second.

There are lots of challenges I face while using Cloudera Data Platform. Sometimes, vulnerabilities depend on which version of CDP runtime I am using, so we work with the Cloudera vendor side to remediate any vulnerabilities based on that version. Along with that, we use it for data audit purposes, gathering all inflow data such as how data is being used, who has access, and how many times.

In terms of cost savings with Cloudera Data Platform, moving from on-premises to the cloud is very cost-effective. We can use bare metal servers or spot instances, which makes it economical. Performance is better than in previous versions because the CDP runtime includes multiple Spark versions, which improves data distribution, handling, and fault tolerance without requiring code fine-tuning.

I rate Cloudera Data Platform six out of ten.


    Dhananjay Koyani

Processes large volumes of heterogeneous data efficiently but faces challenges in cloud adoption and future readiness

  • September 30, 2025
  • Review provided by PeerSpot

What is our primary use case?

Handling and processing big volumes of data is my main use case for Cloudera Data Platform.

We receive instrument data from various providers, reconcile it, and use Cloudera Data Platform to process and ingest it in a structured form that our downstream consumers then use.
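
As a rough illustration of what a reconciliation step like this might look like, the sketch below compares prices for the same instrument across providers and separates agreed values from breaks that need review. The provider names, instrument IDs, the `reconcile` helper, and the tolerance are all assumptions made for the example; the reviewer does not describe the actual logic.

```python
def reconcile(feeds: dict[str, dict[str, float]], tolerance: float = 0.01):
    """Split instruments into agreed prices and breaks that need review."""
    agreed, breaks = {}, {}
    instruments = set().union(*(feed.keys() for feed in feeds.values()))
    for instrument in sorted(instruments):
        # Collect every provider's quote for this instrument.
        quotes = {
            name: feed[instrument]
            for name, feed in feeds.items()
            if instrument in feed
        }
        values = list(quotes.values())
        # Agreed only if every provider quoted it and the quotes match within tolerance.
        if len(quotes) == len(feeds) and max(values) - min(values) <= tolerance:
            agreed[instrument] = values[0]
        else:
            breaks[instrument] = quotes
    return agreed, breaks


# Hypothetical feeds from two providers.
feeds = {
    "provider_a": {"XS001": 101.25, "XS002": 99.10},
    "provider_b": {"XS001": 101.25, "XS002": 99.80, "XS003": 50.00},
}
agreed, breaks = reconcile(feeds)
```

Only instruments that every provider reports consistently reach `agreed`; mismatched or missing quotes land in `breaks` with the per-provider values kept for investigation.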

One unique aspect about my main use case with Cloudera Data Platform involves multiple application teams building their workflows on the platform. I don't have all the insights into other aspects.

What is most valuable?

The best features Cloudera Data Platform offers are the processing power with Spark and the distributed data storage, HDFS, which helps us handle massive volumes of data.

Cloudera Data Platform has positively impacted my organization by making it easier to handle data at this massive scale alongside our existing data warehouse systems and by allowing us to store data from heterogeneous sources.

What needs improvement?

Cloudera Data Platform can be improved by addressing the feasibility of using it in the cloud; there are some complexities around the components used in cloud by Cloudera Data Platform that are not really convenient. If those can be resolved, it could be widely adopted, similar to Databricks.

For how long have I used the solution?

I have been using Cloudera Data Platform for around 10 years.

What do I think about the stability of the solution?

Cloudera Data Platform is stable functionality-wise, but it needs some bug fixes for security, which we are expecting Cloudera to provide.

What do I think about the scalability of the solution?

The scalability of Cloudera Data Platform could be enhanced.

How are customer service and support?

The customer support for Cloudera Data Platform is good.

How would you rate customer service and support?

Neutral

What other advice do I have?

I don't have any specific advice for others looking into using Cloudera Data Platform, as no negatives come to mind.

On a scale of one to ten, I rate Cloudera Data Platform a seven out of ten.


    Shan Hasan

ETL processes benefit from cost-effective offloading and could see improved deployment capabilities

  • May 05, 2025
  • Review provided by PeerSpot

What is our primary use case?

The primary usage of Cloudera Data Platform is to offload ETL processes because it's cheaper compared to data warehouse solutions like Teradata or Oracle. Furthermore, basic reporting can be done, and some real-time processes can be managed.

What is most valuable?

The foremost benefit is offloading data from the warehouse to Cloudera Data Platform, which allows for cheaper storage. We use it to push transformations and run ETL processes, leveraging tools like Spark. Cloudera also supports various functionalities, including AI and Gen AI tools. Basic reporting and some real-time functions are manageable on the platform.

What needs improvement?

Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.

For how long have I used the solution?

I have been using Cloudera Data Platform for more than five years.

What was my experience with deployment of the solution?

The installation of Cloudera Data Platform had some challenges, but this is common with many products. An improved deployment process would help deliver solutions more quickly.

What do I think about the stability of the solution?

I would rate the stability of Cloudera Data Platform as eight out of ten.

What do I think about the scalability of the solution?

Integration with other tools works well for us and we successfully scaled the solution after two to three years without any issues. I would rate the scalability as eight out of ten.

How are customer service and support?

I have communicated with technical support, and they are responsive and helpful. I would rate their support as seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

Initially, the decision for Cloudera was driven by pricing and the support they provided.

How was the initial setup?

The initial setup may take several hours or days, depending on the challenges faced during installation. It's not always a smooth process due to potential complexities.

What about the implementation team?

The implementation involved multiple teams, including Cloudera support, with three to four people from our client's side involved.

What other advice do I have?

I recommend Cloudera Data Platform. Overall, I would rate it seven out of ten, despite the complexities in deployment.