Provides cost efficiency and flexible control over clusters
What is our primary use case?
We are serving the Ads business for one of the biggest startups in India. Our use case involves capturing events or impressions generated on our app and performing aggregations of the users. If some enrichment is required, we take care of the enrichment.
We use this data for dashboarding purposes. The same dataset is also used for advanced analytics by our data science team. We also feed this data into heuristic algorithms based on business rules, which decide things like which banner to serve to a customer based on user relevance and monetization.
How has it helped my organization?
Our use cases are very much in line with what Apache Druid is all about, which is handling and querying time series data. It makes queries more performant by going the denormalized way, and all our predicate clauses are on time-bound windows. The data model offers rollups, Theta Sketches, and other features that help with our use case and query performance. We have seen more than a five-times improvement in QPS.
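To make the rollup idea concrete, here is a minimal sketch in plain Python. It is not Druid code and not our production pipeline. The event fields (`ts`, `banner_id`, `user_id`) and the hourly granularity are assumptions chosen for illustration, and an exact Python set stands in for a Theta Sketch, which would approximate the distinct count in bounded memory.

```python
from collections import defaultdict
from datetime import datetime, timezone


def floor_to_hour(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its hour (the assumed rollup granularity)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def rollup(events):
    """Collapse raw events into one row per (hour, banner_id).

    Each output row keeps an impression count and a distinct-user count.
    The exact set used here is only a stand-in for a Theta Sketch.
    """
    rows = defaultdict(lambda: {"impressions": 0, "users": set()})
    for event in events:
        key = (floor_to_hour(event["ts"]), event["banner_id"])
        rows[key]["impressions"] += 1
        rows[key]["users"].add(event["user_id"])
    return {
        key: {"impressions": row["impressions"], "unique_users": len(row["users"])}
        for key, row in rows.items()
    }


# Hypothetical sample events, invented for this example.
UTC = timezone.utc
events = [
    {"ts": datetime(2024, 1, 1, 10, 5, tzinfo=UTC), "banner_id": "b1", "user_id": "u1"},
    {"ts": datetime(2024, 1, 1, 10, 40, tzinfo=UTC), "banner_id": "b1", "user_id": "u1"},
    {"ts": datetime(2024, 1, 1, 10, 55, tzinfo=UTC), "banner_id": "b1", "user_id": "u2"},
    {"ts": datetime(2024, 1, 1, 11, 2, tzinfo=UTC), "banner_id": "b1", "user_id": "u2"},
]
rolled = rollup(events)
for key, row in sorted(rolled.items()):
    print(key, row)
```

Four raw events become two stored rows. Queries with time-bound predicates then scan the pre-aggregated rows instead of the raw events, which is the general reason rollup helps query performance.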
We were initially using BigQuery. Cost-wise, we have been able to reduce our ETL cost because of a change in our data model style. We have had around 30% cost savings.
What is most valuable?
One of the best parts of the solution is the Hybrid model. In our company, another team has already been using the Imply license. They also have an event-driven architecture but at an even larger scale. The Hybrid model gives us the flexibility to keep control over the clusters: our data, query nodes, and other things are kept in our own AWS account.
The managed part of Imply allows control through a control panel where infrastructure can be monitored and controlled. It provides the option to turn off non-essential clusters. For example, if you do not want to run the staging cluster at all times because you are not doing development all the time, you can just turn it off. It saves costs and allows us to upscale or downscale data and query nodes in the cluster.
What needs improvement?
The managed offering has two models: Polaris and Hybrid. We explored both during the PoC phase. The Hybrid model gives you the flexibility to keep your data safe on your own site but still have a managed service to control your infrastructure. The Polaris model, on the other hand, does not give you an insight into what kind of AWS box you are using. Based on your capacity planning, you can just choose the correct size of the box. It also gives you a dashboard.
I would like Imply to include more flexible billing models with added options for superior infrastructure control, flexibility in scaling, and cost-effectiveness, such as choosing the number of CPUs required. We should have more flexibility and control over the infrastructure in terms of upscaling and downscaling. Currently, there are only certain tightly bound options. With more flexible options, more customers will adopt the solution.
For how long have I used the solution?
We have been using Imply in production for close to six months.
What do I think about the stability of the solution?
We have not faced any downtime. It has been a stable solution so far. I would rate its stability as a nine out of ten.
What do I think about the scalability of the solution?
Imply is very scalable. The Hybrid model allows easy scaling of data nodes through the control panel. However, meticulous planning is needed while buying the license because the license will be there for three years or so. You need to keep some buffer from the growth perspective. It is easily scalable if you have done capacity management properly and chosen the right amount at the start of the agreement. In the Hybrid model, there is also flexibility to choose the box that you prefer. If I want to build my data nodes on Graviton, I have the flexibility to do that.
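The capacity-planning point can be illustrated with simple arithmetic. This is only a sketch under assumed numbers: the growth rate, the buffer, and the notion of a generic "capacity unit" are hypothetical and are not taken from Imply's licensing terms.

```python
import math


def required_capacity(current_units: float, annual_growth: float,
                      years: int, buffer: float) -> int:
    """Estimate the capacity to license up front.

    current_units: capacity needed today (hypothetical unit, e.g. data nodes)
    annual_growth: expected yearly growth as a fraction (0.25 means 25%)
    years:         length of the license term
    buffer:        extra headroom as a fraction on top of the projection
    """
    projected = current_units * (1 + annual_growth) ** years
    return math.ceil(projected * (1 + buffer))


# Assumed inputs: 10 units today, 25% yearly growth, 3-year term, 20% buffer.
estimate = required_capacity(10, 0.25, 3, 0.20)
print(estimate)
```

The compounding matters: a growth rate that looks modest per year can roughly double the requirement over a three-year term, so the buffer should be applied to the projected figure rather than to today's usage.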
It is being used by multiple departments but at only one location. I work for an organization based out of India. Because of compliance, we are keeping all our systems in the Mumbai location.
Overall, at the organization scale, there are more than 100 users. About 20 of them are developers, and the rest of them are business users who also use the Clarity View to fire queries on their dataset, select a data source, etc.
How are customer service and support?
Imply provides very good support. We communicate with them through Slack, where solutions architects are always available to assist us. If we are doing any feature development or facing any issues, someone is always there to support us.
They also offer biweekly office hour sessions. I would rate Imply's technical support a nine out of ten.
Which solution did I use previously and why did I switch?
We were using BigQuery for the OLAP solution which we have now shifted to Druid. The main reason for switching was cost efficiency. Another reason was internal to our organization. Paytm is a heavy user of the AWS cloud. We are using only limited services from GCP. We were not getting many credits or good offers from BigQuery. The solution did not have any problem, but it was not cost-effective.
How was the initial setup?
The initial setup was medium in complexity. We planned meticulously, which enabled us to shift our production workflows smoothly. We did everything in a very planned and structured way. While doing the PoC and before that, we had many design sessions within our team. That made us take a call on data modeling and make changes. Because the changes were already discussed and planned, we were able to move our entire production workflow, with 120 workflows, over a period of one and a half months with just two people.
It does not require much maintenance, but there are upgrades. We recently had some box updates. We moved to Graviton systems, so some work was required. There was also some security patching through the Imply software, which was easily doable. Other than that, not much is required. The control panel gives you the ease to do everything from the UI itself without touching the machine.
What's my experience with pricing, setup cost, and licensing?
Imply pricing is in the middle range. Understanding the data model can help reduce overall system costs. If you understand the data model clearly and go in-depth while doing your capacity planning, your data modeling can reduce your overall cost. You might be paying 100 dollars to Imply, but your EMR compute cost will also decrease, so there is a net decrease in the system cost. Overall, for both batch and real-time, it is a cost-effective solution.
What other advice do I have?
I would recommend Imply to other users, but it also depends on the use case. You should do a proof of concept and check for QPS. For our use case, it performed well on P99, P98, and P95 latency. It is suitable for batch OLAP use cases that have time-bound predicate clauses in the dataset.
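For anyone running a similar proof of concept, the latency check can be sketched as follows. This is an illustrative helper and not a tool we used: it computes nearest-rank percentiles over measured query latencies and compares them to targets. The sample latencies and the target values are made up.

```python
import math


def percentile(samples, p: int):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def check_latency_targets(samples, targets):
    """Return {percentile: (observed, within_target)} for each target in milliseconds."""
    return {
        p: (percentile(samples, p), percentile(samples, p) <= limit)
        for p, limit in targets.items()
    }


# Hypothetical measurements: 100 queries with latencies of 1..100 ms.
latencies_ms = list(range(1, 101))
# Hypothetical targets in milliseconds for P95, P98, and P99.
report = check_latency_targets(latencies_ms, {95: 100, 98: 100, 99: 98})
print(report)
```

Looking at several high percentiles rather than the average is what surfaces the slow tail of queries, which is usually what dashboard users notice.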
Overall, I would rate Imply an eight out of ten.
Imply Druid helped us build a fast, performant, event-driven architecture
What do you like best about the product?
We are using Hybrid, Imply's partially managed offering, where we have the freedom to control and monitor our clusters on our own AWS infrastructure, with a control panel to effectively monitor and upscale or downscale the clusters. Apart from that, we are doing computations on time series data, and the data model and features like rollup have allowed us to bring down our overall cost of building a batch DWH by 30%. We have dedicated solutions architects who helped us during migration and are available throughout our engagement for any technical help.
What do you dislike about the product?
There is nothing to dislike about the offering and services. However, I would suggest that they come up with different billing models so that more customers can adopt and use Imply Druid.
What problems is the product solving and how is that benefiting you?
Imply is helping us build our ETL framework for batch workflows, which captures events from Kafka and then performs certain aggregations and enrichments on top of them. This computed data is then used for dashboarding and for advanced analytics by the data science team, and is also fed into certain heuristic algorithms for driving customer relevance and serving ads on the Paytm platform.
Great product!!!
What do you like best about the product?
Real-time analytics, ease of use, and integration.
What do you dislike about the product?
There is currently nothing specific that I dislike.
What problems is the product solving and how is that benefiting you?
Creating dashboards quickly and analyzing the data quickly.
Simplified version of Apache Druid for real-time analytics
What do you like best about the product?
It's a polished version of Apache Druid with rich features for real-time metrics.
What do you dislike about the product?
The setup and infrastructure are not easy to onboard and set up.
What problems is the product solving and how is that benefiting you?
As the business grows, it's important to get real-time metrics to identify issues or usage trends. Imply does a fabulous job.
Great Cloud Deployment tool
What do you like best about the product?
- Useful in managing service for AWS.
- Helps in visualizing large-scale activity streams.
- Helps in deploying nodes in a Virtual Private Cloud.
- Allows using SQL queries to build your own applications.
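As a rough idea of what building an application on SQL queries can look like, the sketch below prepares a request for the Apache Druid SQL HTTP endpoint (`POST /druid/v2/sql` with a JSON body containing a `query` field). The host and port, the `ad_impressions` datasource, and the absence of authentication are assumptions for illustration; a real Imply deployment will have its own URL and credentials. The request is only built here and is not sent.

```python
import json
import urllib.request


def build_sql_request(base_url: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Druid SQL endpoint."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical datasource name; hourly event counts over the last day.
SQL = """
SELECT TIME_FLOOR(__time, 'PT1H') AS event_hour, COUNT(*) AS events
FROM "ad_impressions"
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' DAY
GROUP BY 1
ORDER BY 1
"""

# Assumed local address; replace with your own cluster URL.
request = build_sql_request("http://localhost:8888", SQL)
print(request.get_method(), request.full_url)

# To run the query against a reachable cluster, one could then do:
#   with urllib.request.urlopen(request) as response:
#       rows = json.load(response)
```

Keeping the request construction separate from the network call makes it easy to test the query text and payload without a running cluster.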
What do you dislike about the product?
I have been using this software for more than a year and I have not found any issues.
One thing is that parallel processing can be optimized.
What problems is the product solving and how is that benefiting you?
- Using SQL queries to build your own applications.
- Managing service for AWS.