Processes large volumes of heterogeneous data efficiently but faces challenges in cloud adoption and future readiness
What is our primary use case?
Handling and processing big volumes of data is my main use case for Cloudera Data Platform.
We receive instrument data from various providers, reconcile it, and use Cloudera Data Platform to process and ingest it in a structured form that our downstream consumers then use.
One distinctive aspect of my use case is that multiple application teams build their workflows on the platform. I don't have insight into all the other aspects.
What is most valuable?
The best features Cloudera Data Platform offers are the processing power with Spark and the distributed data storage, HDFS, which helps us handle massive volumes of data.
Cloudera Data Platform has positively impacted my organization by making it easier to handle data at such a massive scale with our existing data warehouse systems and by allowing us to store data from heterogeneous sources.
What needs improvement?
Cloudera Data Platform can be improved by making it more feasible to use in the cloud; the components it relies on in the cloud bring complexities that are not really convenient. If those can be resolved, it could be widely adopted, similar to Databricks.
Cloudera Data Platform is stable functionality-wise, but it needs some bug fixes for security, which we are expecting Cloudera to provide.
The scalability of Cloudera Data Platform could be enhanced.
For how long have I used the solution?
I have been using Cloudera Data Platform for around 10 years.
What do I think about the stability of the solution?
Cloudera Data Platform is stable functionality-wise, but it needs some bug fixes for security, which we are expecting Cloudera to provide.
What do I think about the scalability of the solution?
The scalability of Cloudera Data Platform could be enhanced.
How are customer service and support?
The customer support for Cloudera Data Platform is good.
What other advice do I have?
I don't have any specific advice for others looking into using Cloudera Data Platform, as no negatives come to mind.
On a scale of one to ten, I rate Cloudera Data Platform a seven out of ten.
Reliable Platform for Managing Large-Scale Data Pipelines
What do you like best about the product?
Cloudera Data Engineering provides a solid environment for building and managing data pipelines at scale.
I like the way it integrates with Apache Spark and Airflow, making batch processing and scheduling efficient.
What do you dislike about the product?
Initial setup and configuration can be complex, especially in hybrid cloud environments.
What problems is the product solving and how is that benefiting you?
We had challenges with slow and unreliable data processing in our ETL pipelines. With Cloudera Data Engineering, we were able to automate our workflows, schedule tasks reliably, and scale up when needed. This significantly improved our data delivery times and overall team productivity.
ETL processes benefit from cost-effective offloading and could see improved deployment capabilities
What is our primary use case?
The primary usage of Cloudera Data Platform is to offload ETL processes because it's cheaper compared to data warehouse solutions like Teradata or Oracle. Furthermore, basic reporting can be done, and some real-time processes can be managed.
What is most valuable?
The foremost benefit is offloading data from the warehouse to Cloudera Data Platform, which allows for cheaper storage. We use it to push transformations and run ETL processes, leveraging tools like Spark. Cloudera also supports various functionalities, including AI and Gen AI tools. Basic reporting and some real-time functions are manageable on the platform.
What needs improvement?
Cloudera Data Platform should include additional capabilities and features similar to those offered by other data management solutions like Azure and Databricks.
For how long have I used the solution?
I have been using Cloudera Data Platform for more than five years.
What was my experience with deployment of the solution?
The installation of Cloudera Data Platform had some challenges, but this is common with many products. An improved deployment process would help deliver solutions more quickly.
What do I think about the stability of the solution?
I would rate the stability of Cloudera Data Platform as eight out of ten.
What do I think about the scalability of the solution?
Integration with other tools works well for us and we successfully scaled the solution after two to three years without any issues. I would rate the scalability as eight out of ten.
How are customer service and support?
I have communicated with technical support, and they are responsive and helpful. I would rate their support as seven out of ten.
Which solution did I use previously and why did I switch?
Initially, the decision for Cloudera was driven by pricing and the support they provided.
How was the initial setup?
The initial setup may take several hours or days, depending on the challenges faced during installation. It's not always a smooth process due to potential complexities.
What about the implementation team?
The implementation involved multiple teams, including Cloudera support, with three to four people from our client's side involved.
What other advice do I have?
I recommend Cloudera Data Platform. Overall, I would rate it a seven out of ten despite the complexities in deployment.
Distributed computing improves data processing while upgrade complexity needs addressing
What is our primary use case?
We heavily use Cloudera Data Platform for data science activities. Various departments in the company utilize it as a sandbox for data discovery. We have multiple data pipelines running on a daily and hourly basis, along with some real-time data pipelines.
What is most valuable?
Cloudera Data Platform has significantly improved our data management. Distributed computing with Spark has enabled many processing types that were not possible before. By using the Hadoop File System for distributed storage, we have 1.5 petabytes of physical storage with 500 terabytes of effective storage due to a replication factor of three.
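The storage figures above follow from simple arithmetic: with a replication factor of three, every block is stored three times, so usable capacity is roughly raw capacity divided by three. A minimal sketch of that calculation is shown here; the function name is hypothetical, decimal terabytes are assumed (1.5 PB = 1500 TB), and real clusters lose some additional space to non-HDFS overhead that this ignores.

```python
def effective_capacity_tb(raw_tb: float, replication_factor: int) -> float:
    """Approximate usable HDFS capacity when each block is stored
    `replication_factor` times. Hypothetical helper for illustration;
    ignores reserved space and other overhead."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication_factor


# 1.5 PB of physical storage, expressed in decimal terabytes (assumption).
raw_tb = 1500.0
print(effective_capacity_tb(raw_tb, 3))  # 500.0
```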
What needs improvement?
Upgrading or updating services like Spark, Impala, and Hive is challenging in on-premise, bare-metal deployments. We aim to address these issues with a Kubernetes-based platform that will simplify the task of upgrading services. We also wish to implement lakehouse capabilities with Iceberg or Delta Lake frameworks.
For how long have I used the solution?
I have been using Cloudera Data Platform since 2021. We began with a project a year prior, but it has been in production since then.
What do I think about the stability of the solution?
I would rate the stability of Cloudera Data Platform as seven out of ten.
What do I think about the scalability of the solution?
For scalability, I rate Cloudera Data Platform at an eight out of ten as it is an on-premise solution.
How are customer service and support?
I would rate the technical support from Cloudera as seven out of ten. Their support is helpful.
Which solution did I use previously and why did I switch?
Before Cloudera, we did not work with other big data platforms. This is our first big data platform, and we also have a classical data warehouse.
What about the implementation team?
We employed local vendors for the implementation, and from our company's side, around ten to twenty people were involved, including engineers, data scientists, and business personnel.
What's my experience with pricing, setup cost, and licensing?
The pricing model for Cloudera Data Platform is complex and has increased significantly compared to CDH. Initially, CDH had a straightforward pricing model based on nodes, but CDP includes factors like processors, cores, terabytes, and drives, making it difficult to calculate costs.
What other advice do I have?
For on-premise use, I would not recommend Cloudera Data Platform as it is expensive and complicated to upgrade. However, for cloud usage, I am uncertain as I do not use it on the cloud. Currently, around thirty to forty people use Cloudera Data Platform in our organization. My final rating for Cloudera Data Platform is seven out of ten.
Storage product by Cloudera
What do you like best about the product?
Usability and security are the core features I like most about Cloudera DB. It is highly scalable, and implementation is not a big deal.
What do you dislike about the product?
It is expensive and does not support working on an intranet.
What problems is the product solving and how is that benefiting you?
It solves the crucial problem of storing data easily with high security. As a researcher, I have gone through many companies' database systems, and without doubt this is one of the best and easiest to use.
Review of Cloudera data
What do you like best about the product?
Cloudera Data Platform is like a super-smart organizer for data, helping companies handle lots of information easily and securely. It works well with big data and lets businesses analyze and use their data smartly, making decisions based on facts.
What do you dislike about the product?
Some folks find Cloudera Data Platform a bit tricky to set up and costly to run, like a high-maintenance gadget. It might feel a bit complicated for beginners, and the buttons aren't as easy to figure out as some other tools.
What problems is the product solving and how is that benefiting you?
Cloudera Data Platform helps companies organize and make sense of their big data, making it easier to find valuable insights and make smart decisions.
Big data technology leader
What do you like best about the product?
It bundles useful big data technologies into one product and is easy to install and use.
What do you dislike about the product?
CDP is too costly compared to the open-source software available.
What problems is the product solving and how is that benefiting you?
Solving problems related to big data.
This is my short review for Cloudera platform.
What do you like best about the product?
Cloudera stands out as a robust platform, especially for aspiring data engineers like me. The user-friendly interface coupled with robust security features makes it a valuable asset for me.
What do you dislike about the product?
While Cloudera offers powerful tools, navigating through its interface can be a bit overwhelming initially.
What problems is the product solving and how is that benefiting you?
Currently I am running a business and am just learning new things and best approaches.
Cloudera Platform Review
What do you like best about the product?
Cloudera is known for its comprehensive suite of tools for big data management and analytics. Many users appreciate its scalability, robustness, and the ease with which it integrates various big data technologies.
What do you dislike about the product?
However, some users have mentioned challenges with the complexity of setting up and maintaining the Cloudera Data Platform, as well as concerns about licensing costs.
What problems is the product solving and how is that benefiting you?
Unified Data Management
Scalability
Advanced analytics
Very powerful data platform
What do you like best about the product?
For one of our projects, we used the Cloudera Data Platform to create a robust solution addressing challenges in the extraction, ingestion, and utilization of CSV data, providing a seamless experience for end users. We used it to automate the data extraction process and to ingest the processed data into an external table on the Cloudera Data Platform, which was then used for front-end reporting with Plotly Dash.
What do you dislike about the product?
We didn't particularly dislike anything about the tool. Overall it was a very effective and powerful tool.
What problems is the product solving and how is that benefiting you?
It helped us solve the problem of data extraction. It also helped us create a process for cleansing data and migrating the cleansed data into an external table.
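To give a rough idea of what a cleansing step like this might look like before the data lands in an external table, here is a minimal sketch using only the Python standard library. The review does not describe the actual rules, so everything here is an assumption for illustration: the function name `cleanse_csv` is hypothetical, and the chosen rules (trim whitespace, drop blank or incomplete rows, drop exact duplicates) simply stand in for whatever the project really did.

```python
import csv
import io


def cleanse_csv(raw_text: str) -> str:
    """Hypothetical cleansing pass: trim whitespace, drop blank or
    incomplete rows, and remove exact duplicate rows."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [col.strip() for col in next(reader)]
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        cleaned = tuple(cell.strip() for cell in row)
        # Skip rows with the wrong column count or any empty cell.
        if len(cleaned) != len(header) or any(cell == "" for cell in cleaned):
            continue
        # Skip rows already written once.
        if cleaned in seen:
            continue
        seen.add(cleaned)
        writer.writerow(cleaned)
    return out.getvalue()


sample = "id, name\n1, alpha\n1, alpha\n2,\n\n3, gamma\n"
print(cleanse_csv(sample))
```

The cleansed text could then be written to the storage location that an external table points at; that step depends on the cluster setup and is not shown.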