Databricks Data Intelligence Platform
The Databricks Data Intelligence Platform unlocks the power of data and AI for your entire organization. Enjoy up to $400 in usage credits during your 14-day free trial. Cancel anytime. After your trial ends, you will automatically be enrolled into a Databricks pay-as-you-go plan.
Reviews (824)
Siddharth V.
Seamless Data Visualization and Storage with Databricks
Reviewed on Jun 07, 2026
Review provided by G2
What do you like best about the product?
I really love that Databricks has a UI that is essentially very simple to understand, and the categorizations of data make it easy to find and manage repositories. It's also very easy to set up jobs right on the fly without writing extensive scripts, which is a really good functionality. The native visualizations on Databricks allow me to uncover a lot of insights and make business-driven decisions. Additionally, the role-based access is very seamless, and the functionality provided by Databricks makes it very valuable. The native notebooks feature is also very, very valuable. Overall, with the amount of functionality it has, using Databricks is a buy.
What do you dislike about the product?
Multi-select cursor functionality, which I previously had in open-source Redash, would be very useful for productivity. It's a minor feature, though. Other than that, Databricks is really useful, and there are not many changes I would recommend.
What problems is the product solving and how is that benefiting you?
I use Databricks for usage analytics, understanding data storage, uncovering insights, visualizing data, and making business-driven decisions.
Gunther C.
Databricks Makes Large-Scale Data Transformations Easy to Run
Reviewed on Jun 05, 2026
Review provided by G2
What do you like best about the product?
Databricks simplifies the process of running data transformation operations on massive datasets. Although it can be a bit of a paradigm shift from classic asynchronous processing architectures, it is extremely easy to get started with. Simply put, the thing I like best about it is its ability to do work at scale.
What do you dislike about the product?
The inability to run a copy of Databricks locally to test changes before deploying them to production is a significant hindrance. Creating per-developer staging environments might be a close solution, but it could be a lot of work to manage.
What problems is the product solving and how is that benefiting you?
Databricks is making it possible to process tremendous amounts of data efficiently without requiring a large amount of engineering effort to design the system itself (engineers can focus on solving data problems rather than scaling problems).
Krupa P.
Very powerful tool with Spark and big data.
Reviewed on Jun 04, 2026
Review provided by G2
What do you like best about the product?
In fact, the most valuable thing about Databricks is that you do not need to worry about looking after the Spark infrastructure. Previously, it took us a lot of time to configure clusters manually; here, you can spin up a cluster in a few clicks.
The collaborative notebooks are also very helpful. My teammates and I can collaborate in the same notebook, write Python, Scala, or SQL in one place, and share the output quickly. The connection to AWS and Git is also very smooth, so pushing code to production no longer takes much effort.
What do you dislike about the product?
My most significant issue is the cluster start time. There are cases where I simply need to make a minor change in the code, and the cluster can take about 5-7 minutes to spin up from a cold start. This really disrupts development.
What problems is the product solving and how is that benefiting you?
We are deploying Databricks to create our ETL pipelines and process large volumes of customer data every day.
Before we used Databricks, our local machines would crash due to memory problems with large datasets. Now we simply push it all to the Databricks cloud. It has completely addressed our scaling problems. Job scheduling is also extremely simple here; simple daily pipelines do not require an orchestrator such as Airflow. It saves us a lot of time in maintenance.
Aruna P.
This is very powerful for big data and machine learning but watch the cluster costs!
Reviewed on Jun 04, 2026
Review provided by G2
What do you like best about the product?
The best thing is that we no longer have any infrastructure to manage. Before, my team was spending too much time setting up Apache Spark clusters, managing YARN, and dealing with memory crashes on-premises. With Databricks, we can spin up a cluster within 2-3 clicks. The collaborative notebooks are very nice: data engineers and data scientists can work on the same notebook at the same time. We can have Python, Scala, and SQL together in one place without changing environments. Another super solid feature is Delta Lake, which gives us transactions over raw data; this has saved us from a lot of the data corruption issues we had in the past.
What do you dislike about the product?
Frankly, its cost is quite high. They charge for DBUs (Databricks Units), and then there is the cloud provider's charge on top (we're using AWS). Unless you are monitoring it, the bill will shoot up like a rocket. At times, my developers forget to shut down the clusters and, if auto-termination is not configured correctly, they run all night and we get a big wake-up call in the billing portal the next morning.
What problems is the product solving and how is that benefiting you?
We had our data in a lot of different places prior to Databricks. The marketing data was in one place and the transactional database in another, and we had a lot of problems with silo thinking. We are deploying Databricks to implement our Lakehouse architecture. Now, all raw data will be transferred to S3, then cleaned and processed, with BI reporting done on Databricks. It went a long way toward resolving our speed issue. Our daily ETLs used to run for 6-8 hours. Today, the same pipelines complete within 45 minutes or less, thanks to Spark optimization in Databricks. My business team's reports are delivered on time in the morning, so decision-making is very fast.
Sachin G.
Eliminates the fragmentation tax for ML teams, but Unity Catalog migration takes patience
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
My main use is managing end-to-end machine learning pipelines, specifically training and deploying multi-agent models and recommendation engines. What I appreciate most about Databricks is how it completely eliminates the coordination overhead (the fragmentation tax) between our data engineering and data science teams. Before Databricks, we were losing hours every day moving data between unmanaged data lakes, proprietary data warehouses, and our isolated machine learning compute clusters. Having MLflow natively managed inside the Databricks workspace is a massive advantage for my day-to-day workflow. I no longer have to worry about setting up tracking servers or maintaining infrastructure just to log my training metrics, because Databricks handles the automatic updates and maintenance seamlessly. Every experiment is automatically tracked, and the model registry seamlessly handles version control, making the handoff from experimentation to production deployment incredibly smooth. Additionally, the recent updates to MLflow for evaluating GenAI agents, specifically the ability to use trace-derived baselines to generate runnable evaluation scripts, have saved me countless hours of manual assembly.
What do you dislike about the product?
The transition to Unity Catalog has been a significant hurdle for our team. Upgrading our legacy workspace to support Unity Catalog's centralized access control and lineage tracking involved a steep learning curve, especially when dealing with privilege inheritance and ensuring the correct schema privileges were granted across the board. Furthermore, while the platform beautifully abstracts away a lot of DevOps work, it can obscure underlying infrastructure costs. It is far too easy for an engineer to spin up an oversized compute cluster for a simple exploratory data analysis task, leading to sudden and severe spikes in our monthly cloud bill. You have to be extremely disciplined with setting strict auto-termination policies and cluster management rules to keep costs in check. The user interface can also feel a bit tedious at times, requiring you to click through multiple layers in the Catalog Explorer just to view the model details page and trace table-to-model lineage.
What problems is the product solving and how is that benefiting you?
The primary problem Databricks solved for us was the massive bottleneck in deploying machine learning models to production. We used to struggle with the classic issue where a model worked perfectly in a local notebook but failed in production due to environment mismatches and a lack of proper version control. By standardizing on Databricks and the managed MLflow environment, we established a strict, documented approval chain that satisfies both our engineering standards and our strict compliance requirements. A real-life example of this was when we recently deployed a multi-agent system for customer churn prevention. We were able to run the inference, monitor the agent's safety and relevance metrics using MLflow's built-in judges, and continuously track the outputs all in one unified platform. This consolidated architecture cut our deployment timelines drastically and significantly reduced the time we spent debugging production errors.
Jatin P.
Unified AI and Data Engineering Platform with Smooth Cost Control
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
I appreciate how Databricks brings together data engineering, analytics, and machine learning processes in a single, governed workspace. The data reliability features like automatic versioning, transaction support, and quality controls are great for maintaining consistency and audit readiness without extra manual effort. For AI-related work, I find the experiment tracking, model deployment, and governance capabilities helpful for scaling efforts securely while meeting compliance standards. I also like the cost monitoring and cluster data management tools, which provide better visibility and help control expenses as usage grows across departments. The detailed breakdowns by job, cluster, user, and workload type, along with budget and alerts, are particularly useful.
What do you dislike about the product?
There is a learning curve when first adopting Databricks, especially for teams transitioning from traditional setups. The initial setup was a little difficult for these teams.
What problems is the product solving and how is that benefiting you?
Databricks unifies data engineering, analytics, and machine learning in one workspace, boosting data reliability. It helps scale AI efforts securely, while cost monitoring tools provide visibility and control as usage expands.
Anita P.
Unified Scalable Data Processing and Machine Learning Platform
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
As a Data Scientist working for a mid-size company, my main use case for Databricks is as the central engine for all of our data processing and predictive modeling pipelines. I use it every day to pull raw, dirty data from our cloud storage, explore it with complicated SQL queries, and then create and train machine learning models with PySpark and Python. Basically, it gives our data engineering and data science teams a common place to work on the same huge data sets at the same time without having to endlessly exchange files or credentials.
From a day-to-day workflow perspective, I love the fluidity of the collaborative notebook environment. The ability to work with different languages in the same workspace is a great advantage. I can run an optimized SQL query to pull in a hefty data set in one cell, process it in the next using PySpark, and visualize it with Python libraries straight after. This fully removes the need to constantly bounce between different tools or IDEs. Another big win for my daily work is the out-of-the-box connection with MLflow. It makes it very easy to roll back to a previous version, automatically tracks hyperparameter tuning, compares several model runs, and manages the full lifecycle of a model. I really enjoy how Databricks takes away the effort of managing Spark clusters: you can spin up a distributed cluster with a few clicks and focus on writing algorithms rather than playing DevOps.
What do you dislike about the product?
Despite all its potential, working with Databricks does come with certain daily difficulties. The most important one for a mid-sized company like ours is the aggressive pricing model for compute costs. The monthly bill can get out of control very rapidly if you're not compulsively watching your cluster configurations and auto-termination settings, especially if a high-memory cluster is unintentionally left running over the weekend. Another major pain point is the built-in Git integration. Databricks Repos has been helpful; however, handling complicated merge conflicts or branch management still feels unexpectedly clumsy compared to a regular local IDE like VS Code. Lastly, the learning curve is rather steep for new employees. The user interface can be complicated, and debugging distributed computing failures can be a major bottleneck for young data scientists getting up to speed.
What problems is the product solving and how is that benefiting you?
The biggest fundamental problem that Databricks tackled for our business was breaking down the silos between our data engineers and our data science team. We saw this effect in the real world recently when we were working on a project to build a fraud detection algorithm. In the prior approach, I would have to submit a ticket to data engineering, wait days for them to extract and clean the data, and then try to train the model locally. The data would be out of date by the time I got it, and my machine would crash all the time owing to memory constraints. With Databricks, I could immediately connect to our Delta Lake, use PySpark to process the huge data volume without any memory issues, and train the model on a scalable cluster, all in the same ecosystem. This one-stop shop decreased our model deployment time from about a month to a couple of days, dramatically improving how fast we deliver actionable business value.
Keshav R.
Excellent for big data team but very tricky to manage costs and access
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
The main benefit from the IT side is that Databricks removes the infrastructure headache. Earlier, our data engineers were always asking us to set up Spark clusters, manage libraries, and handle VM failures. Databricks does all this automatically. The auto-scaling is quite smooth; it adds nodes when the workload is high and removes them later, so infrastructure utilization is very efficient.
Also, the integration with AWS and Azure IAM roles is very solid. We can easily connect it with our Active Directory for single sign-on (SSO), which makes user onboarding very fast. My teams also like the notebook sharing feature because they can collaborate without sharing code files over email or Slack.
What do you dislike about the product?
The biggest pain point for IT Operations is cost control. Databricks billing uses DBUs (Databricks Units), and it is very difficult to predict the monthly budget. Another issue is the cluster startup time. It takes around 4 to 7 minutes to spin up a new cluster.
What problems is the product solving and how is that benefiting you?
We are using Databricks to centralize our entire data processing and machine learning pipelines. Before this, data was scattered in different silos, and maintaining different environments for data engineers and data scientists was an operational nightmare.
Now, Databricks gives a single platform. From an operations perspective, it reduces my team's support ticket load by at least 40% because users can self-serve their clusters within the limits we set. It saves a lot of engineering hours that we used to spend on maintaining open-source Apache Spark infrastructure.
Ranjit P.
Managed Spark Clusters and Collaborative Notebooks That Just Work
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
The best thing about Databricks is the managed Spark clusters. Earlier, setting up Apache Spark manually on AWS or Azure was a big headache. Now, with Databricks, I can spin up a cluster with just a few clicks. The auto-scaling feature works very well: when processing heavy data workloads, it automatically adds nodes and removes them when done, which saves some cloud costs.
Also, the collaborative notebooks are amazing. My team members and I can work on the same Python or SQL code at the same time, just like Google Docs. The integration with Delta Lake is also a big plus because it gives ACID transactions directly on cloud storage, so data corruption issues are very rare now.
What do you dislike about the product?
The biggest issue is the pricing. Databricks DBUs (Databricks Units) are quite expensive, and if you are not careful with cluster configurations or leave a cluster running by mistake, the cloud bill will jump very high very quickly. The cost management tools inside the platform could be much better.
What problems is the product solving and how is that benefiting you?
We are solving the big problem of data silos and slow ETL (Extract, Transform, Load) pipelines. Before Databricks, our data science team and data engineering team were working in different environments, and moving data between them was painful.
Now, Databricks acts as a single Unified Analytics Platform. We ingest raw data into Azure/AWS, clean it using Spark SQL, and the machine learning guys use the same platform to train models. It has reduced our data processing time from hours to minutes, which helps us deliver client projects much faster.
ibrahim d.
Databricks: Unified, Efficient at Scale with Seamless Cloud Integration
Reviewed on Jun 03, 2026
Review provided by G2
What do you like best about the product?
Databricks provides a unified platform and is very efficient when working with large-scale, terabyte-level data. I also like the integration with various cloud services, which is seamless and very helpful. The built-in Apache Spark and the very efficient AI/ML workflow orchestration also stand out from other platforms. And Databricks support has been outstanding whenever we have had issues.
What do you dislike about the product?
With features comes cost, and using Databricks at the scale we use it (terabytes of data, multi-customer, multi-environment) becomes a cost challenge. Also, the learning curve can be a bit steep for beginners.
What problems is the product solving and how is that benefiting you?
Our primary challenge was managing large volumes of data for multiple customers and across different regions. Databricks resolved that challenge very efficiently with its unified platform and very good cloud integration. Our data pipelines are much faster and better orchestrated than ever.