
Reviews from AWS customer

9 AWS reviews

External reviews

13 reviews

External reviews are not included in the AWS star rating for the product.


3-star reviews

    reviewer2785122

Columnar analytics have boosted on‑prem insights but installation and documentation still need work

  • December 06, 2025
  • Review provided by PeerSpot

What is our primary use case?

I am Joachim from Lasersoft Technologies. I mostly work with data, so my role is similar to a software engineer. I'm currently working on a migration project that does not use ClickHouse. Before that, I worked on an ETL pipeline handling sales-related data, and that is where I used the ClickHouse database, mostly running on a server.

I mostly use ClickHouse for data warehousing: we fetch data and load it into ClickHouse for analysis. We use it for GROUP BY aggregations following data warehousing concepts, not for workflows.

We fetch data from an API, around 10,000 to 20,000 records per day, and load it into ClickHouse. Power BI then reads it through the JDBC driver so we can analyze the data for the client. ClickHouse is very useful for analytical and GROUP BY operations, and it is much faster than transactional databases for this, which is why we use it.

What is most valuable?

The best features ClickHouse offers are its aggregation and GROUP BY functions, which is why we use it: it is oriented towards analyzing data. When we use a GROUP BY or a window function, it's much faster than a transactional database. The main advantage for us is that we can install it on-premises, which is a big plus for a data warehouse built on on-premises servers.

I've seen that the speed and performance of these features have helped my work significantly, especially compared to other databases I've used, such as SQL Server and Cosmos DB. With ClickHouse, since data is stored in a columnar way, we get aggregation functions that are much faster than transactional databases, such as SQL Server. Cosmos DB is more NoSQL, so you can't query as much as you can in SQL.
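As a rough illustration of the columnar point above, the following sketch shows how a grouped sum only needs to touch the two columns involved when data is laid out by column. It is a toy model in plain Python, not ClickHouse code; the record fields (`order_id`, `region`, `amount`, `note`) and the helper `sum_by_group` are hypothetical names made up for this example.

```python
from collections import defaultdict

# Hypothetical sales records, as a row-oriented store would hold them.
rows = [
    {"order_id": 1, "region": "north", "amount": 120.0, "note": "gift"},
    {"order_id": 2, "region": "south", "amount": 80.0, "note": ""},
    {"order_id": 3, "region": "north", "amount": 50.0, "note": "rush"},
]

# The same data laid out by column: one list per field.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_by_group(keys, values):
    """Sum values per key, reading only the two columns passed in."""
    totals = defaultdict(float)
    for key, value in zip(keys, values):
        totals[key] += value
    return dict(totals)


# Only "region" and "amount" are read; "order_id" and "note" are never touched.
totals = sum_by_group(columns["region"], columns["amount"])
print(totals)
```

A real columnar engine adds compression and vectorized execution on top of this layout, so the sketch only conveys the access pattern, not the actual performance difference.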

ClickHouse has positively impacted my organization, and I've seen a lot of improvement on the analytical side. Before, we used Cosmos DB for Power BI analytics, but since we started using ClickHouse, analytical queries are much faster than they were on Cosmos DB. The cost is also much lower than with Cosmos DB. Since we run it on-premises, most of that cost is cut, which is very useful for us.

What needs improvement?

In terms of how ClickHouse can be improved, I don't think there are any improvements needed. However, in terms of challenges, the installation and setup of the database need attention. We need to install it on-premises, and it's more difficult. A single-click installation that automatically gets what we need would be helpful. That's more of a suggestion than an improvement or a challenge.

In terms of needed improvements, the documentation needs some enhancements. A year ago, when I was using it, ClickHouse tables did not support surrogate keys. I'm not sure whether that is still the case today. We also need more documentation on the website and more tutorial videos on the ClickHouse YouTube channel. That would be useful for most people.

Some features are still not supported in ClickHouse, such as surrogate keys. I'm not sure if it's supported now, but some features need to be added, and some tutorial videos need to be added to the YouTube channels, or the documentation needs to be better. It would be useful for people who are using ClickHouse for an on-premises database. We need more documentation.

For how long have I used the solution?

I've been working with ClickHouse for around one to two years.

What do I think about the stability of the solution?

ClickHouse is stable in my experience, but it needed some improvement when I used it. I'm not aware of the current state, but it's mostly stable.

What do I think about the scalability of the solution?

Scalability-wise, ClickHouse is good.

How are customer service and support?

I haven't needed customer support that much, as we haven't had to communicate with customer support.

How would you rate customer service and support?

Negative

Which solution did I use previously and why did I switch?

We used Cosmos DB and needed to switch to a columnar database. We needed an on-premises database, so we switched to ClickHouse for that.

How was the initial setup?

We use an on-premises ClickHouse database and don't use it in a cloud-based way, so we don't have practical ideas for the cloud-based version.

What about the implementation team?

We deploy ClickHouse in our organization on on-premises servers, so it's not cloud-based and is used on-premises for analytical purposes. It's more oriented towards reducing the cost, which is why it's used on-premises in our organization.

What was our ROI?

I've seen a return on investment: we save around 50% to 75% in costs by using ClickHouse. The return on investment is around 25% to 30%. We also save around 30% to 40% of time compared to transactional databases. The setup is also a time-saver for ClickHouse.

What's my experience with pricing, setup cost, and licensing?

Pricing-wise, we are using ClickHouse on-premises, so we just need to maintain the on-premises servers. The setup is very easy compared to other servers, but it's more oriented toward using ClickHouse on Linux servers only. If it were supported on Windows, it might be better for those who use Windows-based products. The licensing is also good compared to others.

Which other solutions did I evaluate?

Before choosing ClickHouse, we did evaluate other options such as BigQuery or Azure warehousing concepts, which are more oriented towards spending money, but we needed a cost-reduced, on-premises solution. That's why we were ready to use ClickHouse for our analytics.

What other advice do I have?

The advice I would give to others looking into using ClickHouse is to research the documentation before using it to see what use cases are needed. If they need an on-premises solution, as of today, ClickHouse is the best option. They need to think about on-premises versus cloud-based. We mostly use it on-premises. I have provided a review rating of 7 for ClickHouse.


    Yush Mittal

Data observability has enabled real‑time analytics and cost savings but needs smoother inserts and cleanup

  • December 05, 2025
  • Review from a verified AWS customer

What is our primary use case?

ClickHouse has been used for more than a year as the primary solution for an observability data platform hosted on AWS EC2 instances. Data flows from S3 through Kinesis Data Streams and is processed in Databricks using a medallion architecture, which includes bronze, silver, and gold layers. Once the gold layer is finalized, the data is sent to the ClickHouse cluster hosted on EC2 nodes.

ClickHouse serves as the data sink in the workflow, processing JSON data from over 9,000 CVS Health retail stores to make it meaningful for analytical purposes. This data is stored in ClickHouse in multiple tables, such as logs, events, metrics, and traces, and is used for reporting in Grafana and for ML model training.

ClickHouse also serves as the database sink for the front-end application. A chatbot has been developed on top of the hosted data, with a model trained on it. When a user sends a text prompt, the model generates a more general or more granular query to return the desired results quickly, so the user does not have to write the query. The user only needs to describe the kind of query they are looking for, and the chatbot provides it.

What is most valuable?

ClickHouse offers several standout features, including the S3 table engine and the ReplicatedMergeTree table engine. Instead of storing data in EBS volumes, data is stored mainly in S3 Express, which provides computation speed comparable to an EBS volume. The ability to configure ClickHouse using XML configs is also highly appreciated. With an 18-node cluster, multi-zone replication can be configured across three availability zones in us-east-1. The configs allow data to be stored in multiple locations separately, and with ReplicatedMergeTree, data needs to be sent to only six nodes and is then replicated automatically to all 18 nodes. These features are beneficial, and since ClickHouse is an OLAP database, it provides fast analytical queries, which suits this use case well.

For the S3 engine, a fault-tolerant data storage pattern was sought, as the data is highly sensitive and contains PII. The concern was that data should not be lost or left in remote setups, so that even if a failure occurs, data is neither lost nor accessed by unauthorized individuals. The S3 engine was considered a well-encrypted, fault-tolerant solution, and it has proven very reliable for storing data in ClickHouse. Regarding ReplicatedMergeTree, the application is a real-time analytical one with high write and read rates, so it was important to avoid relying on all 18 nodes for reads and writes simultaneously. ReplicatedMergeTree helps isolate writes to specific nodes while allowing all nodes to serve read queries. That is how both the S3 engine and ReplicatedMergeTree have helped.
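One way to read the figures above (18 nodes, three availability zones, writes sent to six nodes) is as six shards with three replicas each, one replica per zone. That split is an assumption for illustration and is not stated in the review. The zone labels and the `accepts_writes` flag below are also hypothetical: ReplicatedMergeTree itself lets any replica accept inserts, so restricting writes to one replica per shard would be a routing choice made by the team.

```python
# Figures from the review: 18 nodes, 3 availability zones, writes sent to 6 nodes.
TOTAL_NODES = 18
ZONES = ["az-a", "az-b", "az-c"]  # hypothetical zone labels
WRITE_NODES = 6

# Assumption: each write node heads one shard, and every shard keeps
# one replica in each availability zone.
replicas_per_shard = TOTAL_NODES // WRITE_NODES

layout = {
    f"shard-{shard + 1}": [
        {"zone": zone, "accepts_writes": index == 0}
        for index, zone in enumerate(ZONES[:replicas_per_shard])
    ]
    for shard in range(WRITE_NODES)
}

# Count all replicas, and the subset that the application writes to.
node_count = sum(len(replicas) for replicas in layout.values())
writer_count = sum(
    replica["accepts_writes"]
    for replicas in layout.values()
    for replica in replicas
)
print(node_count, writer_count)
```

Under this reading, losing a whole zone still leaves two replicas of every shard available for reads.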

ClickHouse provides great query speeds because it is an OLAP database, so naturally, it provides higher speeds. For cost optimization, after deploying the cluster on-premises and using S3 Express, approximately 5x cost savings were achieved on data storage. Initially, a budget of $15,000 monthly was anticipated for data storage with EBS volumes, but upon switching to S3 Express, storage costs dropped to $3,000. In terms of scalability, the observability data that was sitting idle was expanded to multiple terabytes and incorporated security data from AWS, Tanium, Azure, and CrowdStrike, scaling up to multiple petabytes. This solution led to high performance, prompting the ClickHouse team to invite this organization to share its performance at their annual conference.
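The storage figures quoted above can be checked with a few lines of arithmetic. The two monthly amounts come from the review; the variable names are made up for this sketch.

```python
# Monthly storage budgets quoted in the review, in US dollars.
ebs_monthly_usd = 15_000
s3_express_monthly_usd = 3_000

# Ratio of the old budget to the new cost (the "5x" in the review).
savings_factor = ebs_monthly_usd / s3_express_monthly_usd

# Absolute and relative monthly savings.
monthly_savings_usd = ebs_monthly_usd - s3_express_monthly_usd
reduction_pct = 100 * monthly_savings_usd / ebs_monthly_usd

print(savings_factor, monthly_savings_usd, reduction_pct)
```

In other words, the 5x figure corresponds to an 80% reduction, or $12,000 saved per month on storage.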

What needs improvement?

ClickHouse could be improved in data insertion, especially given the high volume of data handled. ClickHouse continually works to optimize itself, but with merges and inserts, only a single insert query can be performed, which limits input to about 100,000 rows per second. It would be beneficial to insert more data and to have configurations that require less manual tuning. Ideally, ClickHouse would optimize itself to handle these processes automatically, reducing the need to contact the ClickHouse support team for infrastructure optimization.

Additionally, deleting databases with corrupt data takes too much time and has caused major outages, which required contacting multiple teams across continents to resolve. The community around ClickHouse also seems limited, which forces reliance on documentation, and there is a scarcity of developers working with ClickHouse, which hinders growth. If ClickHouse were more user-friendly and technically approachable, it would likely see wider adoption.

For how long have I used the solution?

More than four years have been spent working in the current field.

What do I think about the stability of the solution?

ClickHouse has been stable and reliable for most workloads.

What do I think about the scalability of the solution?

ClickHouse has handled growth and changes in data volume remarkably well.

How are customer service and support?

Overall, customer support has been positive, and the representatives are knowledgeable. However, during major issues, such as the three to four-day outage experienced, the support team was less available, possibly due to tight schedules. If more timely support could be provided during critical issues, situations could have been resolved much more quickly, saving considerable time.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

ClickHouse was the first solution used. The product was built from scratch, and ClickHouse was part of it from the first stage. No other solution was used before it.

How was the initial setup?

ClickHouse is deployed in a private cloud setup. An AWS VPC is configured with multiple AWS nodes across two environments: dev with a 12-node cluster and prod with an 18-node cluster. Each node has 128 GB of RAM and 128 GB of SSD, backed by S3 for data storage. Both environments include a four-node CHProxy layer to handle read requests. Write requests go directly to the nodes by IP address, and AWS Route 53 and other DNS services are used for communication within the VPC.

Which other solutions did I evaluate?

Before choosing ClickHouse, Elastic was evaluated for observability data. However, the organization had previously used Elastic and was scrapping it because of the higher costs of the cloud version, so ClickHouse was chosen instead.

What other advice do I have?

This interview could be improved by not asking a single question multiple times. The same question was asked in different ways, and it is recommended to avoid stressing the interviewer or interviewee by not repeating questions. If the answer provider has already given an answer, there is no need to ask for more details again.

My advice to others considering ClickHouse is to opt for the cloud version instead of the on-premises version if budget permits. This decision saves considerable time on managing infrastructure and allows more focus on application and product development. This review has been rated with a score of six out of ten.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)


    reviewer2403399

Query engine is super fast but improvement needed in integration to third-party applications or the cloud

  • May 21, 2024
  • Review provided by PeerSpot

What is our primary use case?

Our use cases are data analytics, both real-time and batch, and logging clickstream data.

We use it in our organization. We have it in our production environment.

What is most valuable?

The query engine is super fast. We deploy ClickHouse on our Kubernetes cluster, not as a cloud subscription, so it's easy to scale with the deployment.

What needs improvement?

Some features, like connecting to third-party applications or the cloud, could be better.

For how long have I used the solution?

I have been using it for one year.

What do I think about the stability of the solution?

One issue is that you need persistent volumes. Otherwise, if one system goes down, you lose data in that cluster.

Another issue is performance. You have to make sure you have the right configurations; otherwise, it will lead to queuing where all your jobs get queued.

What do I think about the scalability of the solution?

It is a scalable product.

How are customer service and support?

You only get technical support when you take the cloud subscription. If you have it in-house, you won't get any support. If you have a cloud subscription, then the support is pretty good. You can raise a ticket from the UI, and they will respond within 24 hours.

So, the support team is pretty good but there is a little room for improvement.

How would you rate customer service and support?

Neutral

How was the initial setup?

The initial setup is pretty difficult since we deployed it in-house. We didn't use the cloud subscription, so we have to handle the deployment very carefully.

The challenge was deploying it and having the replication concept working. Another challenging feature is persistent volumes. You have to make sure the data is available on all clusters; otherwise, if one cluster goes down, you'll lose all your data. It's better to have it replicated.

We first used the cloud subscription, but we saw a possibility to reduce costs, so we tried deploying the open-source ClickHouse on-premises. That saved us money, but we didn't get all the features that come with the subscription.

What about the implementation team?

We did it in-house.

What's my experience with pricing, setup cost, and licensing?

Pricing for the cloud version is alright, not very costly or cheap.

But if you have an in-house deployment on Kubernetes or something, it's going to be very cheap since you'll be managing everything.

What other advice do I have?

I would tell other users to do a POC because it depends upon the business use case and the data. They can explore first. There's another open-source option called Apache Druid, which is a little better than ClickHouse. If that doesn't fit the use case, then they could go for ClickHouse.

Overall, I would rate the solution a seven out of ten.

If your use case is real-time, you should take a look at ClickHouse because it is a columnar database with vectorized query execution, and the querying is super fast compared to traditional databases. So, if your use case is real-time analytics, logging, or real-time dashboarding, then ClickHouse is a tool to consider. Otherwise, if it's batch processing and you can tolerate some latency, then you should go for other databases.

