    Cloudera on AWS

    Sold by: Cloudera 
    Deployed on AWS
    An enterprise data cloud that manages, secures and connects the data lifecycle in AWS. Cloudera delivers powerful self-service analytics across hybrid and multi-cloud environments, along with sophisticated and granular security and governance policies that IT and data leaders demand.

    Overview

    Cloudera on AWS is an enterprise data platform that is easy to deploy, manage, and use. By simplifying operations, Cloudera reduces the time to onboard new use cases. Cloudera manages data in any environment, including multiple public clouds, private cloud, and hybrid cloud. With Cloudera's Shared Data Experience (SDX), IT can confidently deliver secure and governed analytics running against data anywhere. Cloudera is a new approach to enterprise data, running anywhere from the Edge to AI.

    Cloudera on AWS delivers easy-to-use analytics that support the most complex, demanding use cases:

    Complete: All functions needed to ingest, transform, query, optimize, and make predictions from data are available, eliminating the need for point products.

    Integrated: Unified analytic functions work together, eliminating data silos and copies of data.

    Cloudera SDX technologies ensure an enterprise data cloud is secure by design:

    Consistency: Security and governance policies are set once and applied across all data and workloads

    Portability: Policies stay with the data even as it moves across all supported infrastructures

    Pricing: Use of Cloudera on AWS requires a prepay commitment (in dollars) of cloud credits. For more information on usage rates and instance types, see cloudera.com/products/pricing.html.

    You may use the platform until your prepaid commitment is consumed. Any additional usage beyond the prepaid commitment requires negotiation with Cloudera for the purchase of additional prepaid credits.

    Highlights

    • Provides elasticity, agility, and ease of use for hybrid and public cloud by intelligently autoscaling workloads up and down for more cost-effective use of cloud infrastructure. Consistent user experience makes it faster and easier to analyze data.
    • Optimizes the data lifecycle with multi-function analytics that solve demanding business use cases. Cloudera on AWS is composed of three primary services with a standardized user experience: Data Warehouse, Machine Learning, and Data Hub for custom analytics.
    • Ensures all workloads on the platform share common security, governance, and metadata. Users can efficiently find, curate, and share data, enabling self-service access to trusted data and analytics.

    Details

    Delivery method

    Deployed on AWS

    Features and programs

    Buyer guide

    Gain valuable insights from real users who purchased this product, powered by PeerSpot.

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Cloudera on AWS

    Pricing is based on the duration and terms of your contract with the vendor, and additional usage. You pay upfront or in installments according to your contract terms with the vendor. This entitles you to a specified quantity of use for the contract duration. Usage-based pricing is in effect for overages or additional usage not covered in the contract. These charges are applied on top of the contract price. If you choose not to renew or replace your contract before the contract end date, access to your entitlements will expire.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    12-month contract (1)

    Dimension    Description                     Cost/12 months
    Cloudera     Subscription Cloudera on AWS    $50,000.00

    Additional usage costs (1)

    The following dimensions are not included in the contract terms and will be charged based on your usage.

    Dimension                                       Cost/unit
    Consumption by Customer based on Cloud usage    $0.01
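
    As a rough illustration of the pricing model described above, the sketch below adds the 12-month contract price to usage-based overage charges. The contract price and per-unit rate are taken from this listing; the function name, the entitled quantity, and the usage figures are hypothetical, since the listing does not state how many units the contract covers.

```python
from decimal import Decimal

CONTRACT_PRICE = Decimal("50000.00")  # 12-month contract price from the listing
OVERAGE_RATE = Decimal("0.01")        # cost per unit of additional usage from the listing


def estimate_total(units_used: int, units_entitled: int) -> Decimal:
    """Return the contract price plus charges for usage beyond the entitlement."""
    # Only usage above the entitled quantity is billed at the per-unit rate.
    overage_units = max(0, units_used - units_entitled)
    return CONTRACT_PRICE + OVERAGE_RATE * overage_units


# Hypothetical figures: the entitled quantity is an assumption, not a listed value.
print(estimate_total(units_used=5_500_000, units_entitled=5_000_000))
```

    Under these assumed figures, 500,000 units of overage at $0.01 each add $5,000.00 to the contract price. Actual charges depend on your contract terms with the vendor.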

    Vendor refund policy

    No refunds available

    Custom pricing options

    Request a private offer to receive a custom quote.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    Software as a Service (SaaS)

    SaaS delivers cloud-based software applications directly to customers over the internet. You can access these applications through a subscription model. You will pay recurring monthly usage fees through your AWS bill, while AWS handles deployment and infrastructure management, ensuring scalability, reliability, and seamless integration with other AWS services.

    Resources

    Support

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 10 in Data Analysis
    Top 10 in Databases & Analytics Platforms, ML Solutions, Data Analytics
    Top 10 in Data Warehouses

    Customer reviews

    Sentiment is AI generated from actual customer reviews on AWS and G2

    Overview

    AI generated from product descriptions
    Data Platform Architecture: Enterprise data platform supporting multi-cloud, hybrid cloud, and on-premises data management environments
    Security and Governance Framework: Shared Data Experience (SDX) technology providing consistent security and governance policies across data workloads and infrastructures
    Multi-Function Analytics: Integrated analytics platform supporting data ingestion, transformation, querying, optimization, and predictive modeling without requiring separate point products
    Workload Optimization: Intelligent autoscaling capabilities for dynamically adjusting cloud infrastructure resources based on computational requirements
    Data Lifecycle Management: Comprehensive platform supporting data processing across multiple services including Data Warehouse, Machine Learning, and custom analytics environments

    Data Platform Architecture: Unified platform integrating data engineering, analytics, business intelligence, data science, and machine learning on a single architecture
    Open Source Foundation: Built on open source data projects with support for open standards and data formats
    Lakehouse Infrastructure: Provides a common data management approach using a lakehouse architecture running on Amazon S3
    Data Intelligence Engine: Advanced engine capable of interpreting organizational data context and enabling broad data access across teams
    Collaborative Workflow: Native collaboration capabilities enabling cross-functional data and AI workflow integration

    Data Lake Query Performance: Provides sub-second query response times using SQL query service on data lake platforms
    Open Standards Support: Utilizes community-driven standards like Apache Iceberg and Apache Arrow for processing engines
    Multi-Source Data Integration: Enables joining data from data lakes and external databases without data movement
    Compute Engine Management: Automatically handles compute engine lifecycle including provisioning, scaling, pausing, and decommissioning
    VPC-Based Data Processing: Deploys compute engines within customer's Amazon Virtual Private Cloud for secure data processing

    Security credentials

    Validated by AWS Marketplace
    FedRAMP
    GDPR
    HIPAA
    ISO/IEC 27001
    PCI DSS
    SOC 2 Type 2
    No security profile
    No security profile
    -
    -
    -
    -

    Contract

    Standard contract
    No
    No
    No

    Customer reviews

    Ratings and reviews

    2
    2 ratings
    5 star: 0%
    4 star: 0%
    3 star: 50%
    2 star: 0%
    1 star: 50%
    2 AWS reviews | 44 external reviews
    Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.
    Sajid Mehmood

    Have managed data services efficiently while ensuring fast performance and reliability

    Reviewed on Oct 30, 2025
    Review provided by PeerSpot

    What is our primary use case?

    My main use case for Cloudera Data Platform is that I am a certified administrator. I use Cloudera Data Platform in my daily work by managing it as a whole in a telco company. I regularly handle tasks by managing Cloudera Data Platform and being responsible for its services, which are currently up and running, and managing daily administrative tasks.

    What is most valuable?

    In my experience, the best features Cloudera Data Platform offers are that all the services provided are excellent.

    A particular service that stands out to me in Cloudera Data Platform is the performance, which runs very fast. I also find very good features in data security, data reliability, and data lineage.

    Cloudera Data Platform's Manager UI and other UIs are very useful and helpful for managing operations.

    Cloudera Data Platform has positively impacted my organization as it comes in very handy while performing on big data and handling large files.

    What needs improvement?

    I think Cloudera Data Platform is good enough to run now, and I do not see areas for significant improvement.

    I wish Cloudera Data Platform would add a service, apart from Ozone, to handle small files with faster performance, so as not to use Ozone or add extra hard disk capacity to the cluster.

    For how long have I used the solution?

    I have been using Cloudera Data Platform for approximately five years.

    What do I think about the stability of the solution?

    Cloudera Data Platform is very stable in my experience.

    What do I think about the scalability of the solution?

    Cloudera Data Platform scales very well in public cloud. However, it is not as scalable in an on-premises private cloud, which adds considerable cost.

    How are customer service and support?

    I have interacted with the customer support team extensively, and they are very useful and helpful in resolving issues. I would rate the customer support of Cloudera Data Platform ten out of ten.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    Before choosing Cloudera Data Platform, my organization was using Teradata, and we did not evaluate other options.

    Which deployment model are you using for this solution?

    Private Cloud and On-premises

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Other
    Mohammadfaizan Faizan

    Has supported multi-source data integration and enabled real-time analytics across hybrid environments

    Reviewed on Oct 27, 2025
    Review from a verified AWS customer

    What is our primary use case?

    The main use case for Cloudera Data Platform is to support a multi-source system with a multi-data structure. We have streaming services, Kafka services, RDBMS systems, and semi-structured data in the form of CSV and JSON files where we used to have everything in place and centralized.

    Cloudera Data Platform also supports a hybrid data warehouse, which is similar to a relational database management system where business users can do query analytics, similar to a select star. Cloudera Data Platform also supports PySpark, where a user can create a data frame and then do a transformation load to perform and get insights.

    What is most valuable?

    The best features of Cloudera Data Platform are that it supports hybrid types of environments, real-time streaming analytics, secure data and governance, machine learning and AI workloads, data warehousing and BI, and edge-to-edge AI use cases.

    In the hybrid environment, we can have a private cloud as well as a public cloud, which helps us enable both types of workloads. We have data that keeps coming through a pipeline, and then we just ingest our data. The data engineer transforms and loads it to a data lake, which is Amazon S3. Once the data is ready, it's on the downstream, and it's available for the consumer end to consume the data.

    The most important feature of Cloudera Data Platform is Ranger, which provides a granular level of security, allowing you to apply column-level security and decide which columns to expose to the consumer, not just access at the table level.

    Cloudera Data Platform has a great impact on my organization as it supports the business demand and business requirements, making me happy with the business use case. It depends on what the business demands and the business use case, which allows for an evaluation of what the business wants. Based on that, they can make a decision on where to go and where to migrate a workload.

    What needs improvement?

    I would definitely want to see more on the invention part of Cloudera Data Platform to provide a full-fledged AI and ML workload, as AI is supported currently, but I'm interested in having ML and LLM also supported in a full-fledged manner.

    For how long have I used the solution?

    I have been working in the current field for almost six to eight years.

    What do I think about the stability of the solution?

    Cloudera Data Platform is stable.

    What do I think about the scalability of the solution?

    Cloudera Data Platform's scalability is very nice, as you can have multiple workloads and even have multiple clusters with different CDP runtimes. You just have to define the business requirement in the configuration, and based on usage, it automatically scales up and scales down.

    How are customer service and support?

    Customer support for Cloudera Data Platform is very good.

    How would you rate customer service and support?

    Positive

    Which solution did I use previously and why did I switch?

    We had been using Cloudera Distribution for Hadoop (CDH). The CDH product was available on-premises only, so we migrated from on-premises to the cloud to opt for cloud compute.

    How was the initial setup?

    The experience with pricing, setup cost, and licensing is very good. The cloud service provider has an inbuilt tool to analyze what zone and what region to use, as the services have costs associated with that, allowing us to manipulate which region is best suitable and cheaper.

    What was our ROI?

    In terms of ROI, we definitely have seen a return on investment. Due to security, we cannot disclose the value, but we have definitely seen an ROI.

    What's my experience with pricing, setup cost, and licensing?

    The experience with pricing, setup cost, and licensing is very good.

    Which other solutions did I evaluate?

    I did not evaluate other options before choosing Cloudera Data Platform.

    What other advice do I have?

    I would rate Cloudera Data Platform an eight out of ten because it's excellent in terms of the product, its deliverability, its support, and its use cases. It might differ for different industries depending on what each industry wants, but overall, it has a good impression, and I'm happy with the work relationship with Cloudera technical support.

    If someone is looking for a hybrid environment or a cloud environment, they can definitely consider reviewing Cloudera Data Platform. They can look at all the aspects, as the Cloudera Data Platform ecosystem provides Apache Hive, HBase, Kafka, NiFi, Solr, and Knox, which they can review based on their business use case.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Amazon Web Services (AWS)
    T Sarwar

    Has enabled efficient big data processing and querying but remains complex to manage and configure

    Reviewed on Oct 24, 2025
    Review provided by PeerSpot

    What is our primary use case?

    We are using Cloudera Data Platform to migrate and run our ETL processes, transferring data from multiple RDBMS to a data lake for analysis purposes. The current organization I work for is a top bank with a data lake of more than one petabyte.

    Cloudera Data Platform is a perfect tool to manage such vast amounts of big data, store it properly, query it, and move it from one end to another.

    What is most valuable?

    The most useful feature I currently use from Cloudera Data Platform is the Hue tool, which provides a web-based utility. Users don't need network access approval when using on-premises internal access. Additionally, Spark and Impala are the most useful tools that I have used from Cloudera Data Platform.

    The current organization I work for is a top bank with a data lake of more than one petabyte. For this specific purpose, Cloudera Data Platform is a perfect tool to manage such vast amounts of big data, store it properly, query it, and move it from one end to another.

    What needs improvement?

    Cloudera Data Platform should use fewer tools and remove the complexity between them. It should make it easier for the end user to change the configuration and understand it better.

    The UI tool for jobs in Cloudera Data Platform can be improved to provide a proper image of ETL jobs and detailed consolidated graphs to monitor Spark-based Hue jobs.

    For how long have I used the solution?

    I have been working on a big data platform for the last five years, starting from 2020. Initially, I worked on Hortonworks platform for the last two to three years. Since Cloudera and Hortonworks merged into a single platform which is Cloudera Data Platform, I have been working on the CDP platform for the last two years.

    What do I think about the stability of the solution?

    We face downtime and reliability issues many times a week with Cloudera Data Platform because it is a very complex system and all configurations are managed by the end user. Sometimes the end user is not experienced or does not have all the expertise related to Cloudera specifically, making it very difficult to manage properly.

    What do I think about the scalability of the solution?

    For scalability, I would rate Cloudera Data Platform nine out of 10. We periodically have requirements to add resources or servers, and we find it very useful from a scalability perspective.

    How are customer service and support?

    The customer support from Cloudera is good when we receive support from non-Indian representatives. When support comes from Indian representatives, we receive level one support only.

    How would you rate customer service and support?

    Neutral

    Which solution did I use previously and why did I switch?

    Previously we were using Hive as the query engine, but since installing Impala, we have seen huge performance improvements with Cloudera Data Platform. The majority of users are using Impala instead of Hive because Impala uses massively parallel processing (MPP) technology.

    Which other solutions did I evaluate?

    If the only requirement is to have an on-premises system without other options, then Cloudera Data Platform is the best option available. However, if cloud is an option, I would prefer cloud more than on-premises Cloudera system.

    What other advice do I have?

    We are currently using Cloudera Data Platform with specific tools: Ranger to manage access-related items, HDFS to store files, Hive and Impala to access them, Hue as a query editor, and Spark for ETL execution.

    It is a very complex system compared to cloud technology, which is much simpler. Due to this complexity, I rate Cloudera Data Platform six out of 10.

    Which deployment model are you using for this solution?

    On-premises
    Mohammad_Ahmad

    Has improved resource efficiency and lowered costs but still lacks full AI workload support

    Reviewed on Oct 16, 2025
    Review from a verified AWS customer

    What is our primary use case?

    My main use case for Cloudera Data Platform is for data analytics and AI workloads.

    We have different data sources where the data comes in tabular format or CSV, structured, semi-structured, or unstructured, along with Kafka streaming messages. We use it to store the data, then process and transform it, apply the business logic, and make the data ready for the consumer to consume.

    What is most valuable?

    Cloudera Data Platform offers excellent architectures in terms of decoupling the storage layer from the compute. It is flexible in terms of scaling to your storage account or compute. Additionally, we have different streaming services as part of the ecosystem, and they have added Ranger for security controls, which is a valuable feature.

    Decoupling storage from compute has helped my team significantly. Before using Cloudera Data Platform, we were using Cloudera Distribution for Hadoop (CDH), where we had to have on-premises virtual machines or Linux boxes to add to the cluster, which required lots of effort. We had defined authorized maximum storage per system; for example, one computer can have a maximum of 8 TB, and scaling up to add more compute to the cluster was very challenging. In the current Cloudera Data Platform, the backend storage is a data lake that auto-scales, so we don't have to add more storage. In terms of security, we used to use Sentry in traditional CDH, but in Cloudera Data Platform, Ranger provides a more granular level of security, allowing us to manage who can access data at different levels, maybe at a tabular level or column level.

    Streaming services are provided by NiFi, which is one of the best ecosystems for streaming and ETL support.

    Cloudera Data Platform has positively impacted our organization by reducing overall manual intervention, requiring fewer efforts and resources to build a big data cluster compared to traditional methods. It is also cost-effective and more stable than the traditional ways of handling big data workload.

    In terms of resources, we have reduced from ten resources to four or five resources, making it an effective reduction in manual effort. Regarding cost saving, since we are in the cloud, we are saving significant money compared to maintaining infrastructure on-premises.

    What needs improvement?

    Cloudera Data Platform could improve by innovating more in terms of full-fledged support for AI workloads, enriching machine learning or LLM, as there haven't been updates in that aspect over the last one and a half years.

    For how long have I used the solution?

    I have been using Cloudera Data Platform for almost four years.

    What do I think about the scalability of the solution?

    Cloudera Data Platform's scalability is very good.

    How are customer service and support?

    Customer support is good. However, having a common chat channel between firms and service providers would make communication faster and more efficient.

    How would you rate customer service and support?

    Neutral

    What other advice do I have?

    My advice to others looking into using Cloudera Data Platform is that if they are looking for big data workloads on the cloud where they can do analysis and achieve cost savings and resource reductions, it is definitely a good use case. It can vary based on business needs, but it is a good option for big data workloads.

    I rated Cloudera Data Platform a six out of ten because I wish that it would keep up with market trends and release AI technology and AI-enabled workloads. Sometimes we struggle to get support, and having a common chat channel between firms and service providers would make communication and support more effective, especially in production.

    Which deployment model are you using for this solution?

    Public Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Ciro Porzio

    Manages large-scale data ingestion and transformation while improving job performance in hybrid environments

    Reviewed on Oct 10, 2025
    Review provided by PeerSpot

    What is our primary use case?

    My main use case for Cloudera Data Platform is measuring HDFS and the SQL queries in Impala to troubleshoot errors in YARN applications based on Spark, and controlling the reporting data between Informatica and Cloudera, which is used to transport data from Oracle and MongoDB databases to Impala and HDFS in CDP.

    For measuring HDFS, I use Cloudera Data Platform, specifically Cloudera Manager, to analyze small files in HDFS to reduce our number for the duration of jobs that read this file and the partition date.

    I mainly use Cloudera Data Platform as part of a large-scale data processing and analytics pipeline in a hybrid cloud environment, primarily on Azure, which involves managing the YARN cluster, monitoring workloads, troubleshooting performance issues, and integrating data ingestion and transformation processes from various enterprise systems. We leverage CDP for its scalability, security, and strong integration with Looker, Informatica, Hive, and Spark.

    How has it helped my organization?

    Cloudera Data Platform (CDP) has helped our organization improve data management consistency and scalability across multiple environments. The unified control plane and centralized governance have reduced operational overhead and made it easier to manage workloads between on-premise and cloud environments.

    We’ve also seen clear benefits in resource optimization — auto-scaling and workload isolation features have allowed better use of infrastructure, while tools like Cloudera Manager and Workload XM improved monitoring and troubleshooting efficiency.

    That said, there’s still room for improvement in integration speed and UI responsiveness, especially when managing large clusters or hybrid deployments.

    What is most valuable?

    In my opinion, the best features of Cloudera Data Platform are its strong integration, scalability, and unified management capabilities, while what stands out the most in Cloudera Manager is SDX, which provides centralized control for governance, security, and data lineage across multiple sources, simplifying operations significantly. Finally, the YARN and Spark resource management in CDP is robust and efficient, which is essential for handling heavy data transformation workloads at scale.

    Cloudera Data Platform has positively impacted my organization by providing a unique storage point for a lot of data from various databases in HDFS. With Hive or Impala, it is possible to read and integrate data among all the other platforms, making it a great platform for us to have the data and create integrations.

    What needs improvement?

    I don't have any challenges or areas I think could use enhancement.

    For how long have I used the solution?

    I have been using Cloudera Data Platform for one year, and I have experience with the last version of Cloudera Data Platform for four years.

    What was our ROI?

    A specific example of the positive impact of Cloudera Data Platform is the clearly saved time and improved performance, which is the main result of it. The costs are increasing at the start of the project, but after securing, they are reduced, and the most significant benefit is the availability of data from governance and management.

    What other advice do I have?

    For the centralized governance of Spark management, we use a dashboard on SAS or Power BI to integrate the data that is stored in HDFS.

    My advice to others looking into using Cloudera Data Platform is that it's a great product to save time and reduce costs in the long term.

    On a scale of one to ten, I rate Cloudera Data Platform a nine.

    Which deployment model are you using for this solution?

    Hybrid Cloud

    If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

    Microsoft Azure