Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
792 reviews
External reviews are not included in the AWS star rating for the product.
Databricks Unifies Data, Analytics, and ML for Scalable Lakehouse Workflows
What do you like best about the product?
Databricks is especially helpful because it brings data engineering, analytics, and machine learning together in a single unified platform, which reduces the need to manage multiple separate tools. Built on Apache Spark, it can process massive datasets quickly and scale smoothly as workloads grow, making it a strong fit for big data use cases. It also supports collaborative notebooks where teams can work together in languages like Python and SQL, which makes it easier for data scientists and engineers to collaborate effectively.
With its lakehouse architecture powered by Delta Lake, Databricks combines the flexibility of data lakes with the reliability of data warehouses, helping ensure better data consistency and performance. In addition, it integrates with tools like MLflow to streamline the machine learning lifecycle end to end, from experimentation through deployment. Overall, Databricks simplifies complex data workflows, improves performance, and helps organizations build scalable data and AI solutions more efficiently.
What do you dislike about the product?
Databricks does have some limitations, although many of them feel more like trade-offs than outright negatives. A frequently cited drawback is cost: while the platform is flexible and scalable, expenses can rise quickly if clusters aren’t managed carefully. At the same time, that cost often reflects its ability to handle very large workloads efficiently when it’s properly optimized.
Another consideration is the learning curve, especially for beginners who aren’t familiar with Apache Spark or distributed systems. That complexity can be challenging at first, but it also comes with the benefit of powerful capabilities once you get comfortable with it. Some users also find that debugging and performance tuning are less straightforward than with simpler tools; however, Databricks offers detailed monitoring and optimization features that can make these tasks easier over time.
Finally, because it’s a managed platform, there can be a sense of reduced control compared with fully self-managed systems. In return, it removes much of the operational burden that comes with infrastructure management. Overall, while these areas may be seen as the “least helpful” aspects, they’re often balanced by the platform’s scalability, integration, and productivity gains.
What problems is the product solving and how is that benefiting you?
Databricks helps solve the challenge of fragmented data and disconnected workflows across multiple business verticals by providing a unified lakehouse platform. In my role as a data engineer, this allows me to consolidate data from different sources into a single, reliable system using Apache Spark for scalable processing and Delta Lake for ensuring data quality and consistency. This significantly reduces pipeline complexity, improves reliability, and enables faster delivery of clean, governed data to downstream teams. As a result, I’m able to support analytics and machine learning use cases more efficiently while minimizing operational overhead and improving overall productivity across the organization.
Great Infrastructure for Reliable Data Management
What do you like best about the product?
They have really great, intuitive infrastructure that gives us everything we need for our data management.
What do you dislike about the product?
For all the technical issues we have had, we consulted their support team and got a solution.
What problems is the product solving and how is that benefiting you?
Overall, the reliability and features that Databricks provides are the most helpful and valuable things for us; they make our work easier and hassle-free.
Unified ML Platform That Removes Infrastructure Friction
What do you like best about the product?
The unified platform experience is genuinely hard to beat — having MLflow for experiment tracking, Unity Catalog for governance, vector search, and serverless endpoints all in one place removes so much infrastructure friction. Feature engineering pipelines and model deployment feel cohesive rather than stitched together. The SQL warehouse + notebook hybrid workflow also makes it easy to hand off between data engineering and ML work without context switching tools.
What do you dislike about the product?
Serverless endpoints have some sharp edges — Spark context initialization behaves differently than in interactive clusters, which can cause silent failures if you're not careful about where you initialize things. Cold start latency on serverless is also noticeable for low-traffic production endpoints. Documentation around some of the newer features (like vector search index configs) tends to lag behind the actual product behavior, so you end up doing a lot of trial and error.
What problems is the product solving and how is that benefiting you?
We use Databricks to consolidate ML model development, feature engineering, and deployment for a cards and payments platform — work that previously required juggling separate tools for data processing, training, and serving. The unified environment means our ML engineers can go from raw transaction data to a deployed churn prediction model without leaving the platform. MLflow tracking keeps experiments reproducible, and Unity Catalog gives us the data governance story our banking client needs. It's cut down a significant amount of the coordination overhead that comes with multi-tool ML pipelines.
Unifies Data Processing with Delta Lake's Reliability
What do you like best about the product?
I use Databricks in my enterprise environment and projects to ingest data from multiple sources, transform and clean it at scale, and prepare reliable datasets for analytics and reporting. It allows me to build and manage data pipelines efficiently using Spark, SQL, and notebooks. I love having data ingestion, large-scale processing, analytics, and collaboration all in one place, making my workflow much more streamlined and efficient. I really value the reliability and confidence I get from features like Delta Lake, which make data versioning, recovery, and handling changes much safer, cheaper, and easier in my projects. Delta Lake is one of the main reasons Databricks is so valuable to me because it directly addresses reliability and trust, which are constant challenges in real data projects. The ability to rollback to a previous version if something goes wrong makes me much more confident when developing, testing, or deploying changes to production pipelines. Additionally, the initial setup was relatively straightforward because Databricks integrates well with our existing cloud infrastructure.
What do you dislike about the product?
The learning curve can be quite steep at the beginning, especially for users who are new to Spark or large-scale data processing concepts. Debugging complex pipelines or job failures can sometimes be time-consuming when error messages are not very intuitive. As workflows and environments grow, governance and environment management can require extra effort to keep everything well-organized and consistent. Cost management is another challenge, as resource usage can increase quickly if clusters and jobs are not configured or monitored carefully.
What problems is the product solving and how is that benefiting you?
I use Databricks to solve fragmentation and inefficiency in my data flow, handling ingestion, transformation, analytics, and collaboration on one platform. It reduces operational overhead, ensures data quality, and offers scalability, improving large data processing without infrastructure worries.
Databricks Genie and AgentBricks Make “Talk to Data” Easy
What do you like best about the product?
In Databricks, I like Genie and AgentBricks, which help me solve business processes by letting me talk to the data.
What do you dislike about the product?
I think all the functionality works as expected for me.
What problems is the product solving and how is that benefiting you?
It’s mainly about giving my business users more flexibility to talk directly to the data and run their own analysis without needing to know SQL.
Databricks Saves Time with Smooth, High-Performance Data Pipelines
What do you like best about the product?
Databricks saves time by automating data pipelines, improving performance, and reducing infrastructure management.
Overall, it provides a smooth experience for building, analyzing, and deploying data solutions.
What do you dislike about the product?
Databricks provides strong capabilities for large‑scale data processing and collaboration, but there are areas for improvement.
What problems is the product solving and how is that benefiting you?
We use Databricks for building and managing large‑scale data pipelines and analytics workloads.
It helps us process high‑volume data faster by using scalable Spark clusters and automated workflows.
Databricks’ Unified Platform: Fast SQL, Streamlined Pipelines, and Context-Aware AI
What do you like best about the product?
The unified platform experience is what keeps me on Databricks. Having notebooks, pipelines, SQL warehouses, ML, and governance all in one place under Unity Catalog means I’m not constantly stitching together five different tools just to get work done.
Lakeflow Pipelines (formerly DLT) makes it straightforward to build medallion-architecture pipelines, and the Photon engine delivers real performance gains on SQL workloads without requiring any code changes. Recent additions like Genie Code and background agents also show they’re serious about agentic AI—it doesn’t feel like a bolt-on copilot, because it can actually understand your data context through Unity Catalog. Serverless compute has been another big quality-of-life improvement as well, since I no longer have to wait for cluster spin-up when I just want to run quick, ad hoc queries.
What do you dislike about the product?
Cost management can be tricky—DBUs add up quickly if you’re not careful with cluster sizing and warehouse auto-scaling. The pricing model also isn’t always transparent, especially when you’re mixing serverless and classic compute.
Unity Catalog is powerful, but the initial setup and the migration from legacy HMS can be painful, particularly for large orgs with years of existing Hive metastore objects. The documentation is generally good, yet it sometimes lags behind new feature releases. On top of that, the workspace UI can feel sluggish at times, especially when you’re working with a large number of assets.
What problems is the product solving and how is that benefiting you?
Before Databricks, our data stack was fragmented — separate tools for ETL, analytics, ML, and governance. That meant constant context-switching, duplicated data, and governance gaps. Databricks consolidates all of that into one lakehouse platform. Delta Lake gives us reliable ACID transactions on the data lake, Unity Catalog handles lineage and access control across the board, and SQL warehouses let our analysts self-serve without needing a separate data warehouse product. It's cut our pipeline development time significantly and made data governance something we can actually enforce consistently instead of hoping for the best.
Databricks AI/BI Genie and Genie Code: Amazing Features on My Favorite Platform
What do you like best about the product?
As a Databricks MVP, I like almost all of the features, and Databricks is always my favourite platform. If I had to pick one thing, it would be Databricks AI/BI Genie and Genie Code. Genie... Genie... a really amazing name and an amazing feature.
What do you dislike about the product?
The Databricks team is not hiring me. I am one of the biggest fans of Databricks.
What problems is the product solving and how is that benefiting you?
It gives us more visibility and keeps making things easier; data access is no longer a challenge for anyone.
Databricks Unifies Data Engineering, Science, and Analytics Exceptionally Well
What do you like best about the product?
The ability to converge data engineering, data science, and analytics on a single platform without compromising on governance, performance, or flexibility is still rare in the industry. Databricks executes this exceptionally well.
What do you dislike about the product?
The spin-up time of all-purpose clusters and job clusters should be reduced. It would be more useful and helpful if they started as quickly as serverless.
What problems is the product solving and how is that benefiting you?
In enterprise banking, where regulatory compliance, data accuracy, and operational resilience are non-negotiable, Databricks is solving some of our most critical challenges. As a Lead Data Engineer managing end-to-end ETL pipelines, dashboard delivery, monitoring alerts, and data governance for a major banking client, the platform has become the backbone of our modern data architecture. Databricks unifies our fragmented data landscape through Delta Lake and Unity Catalog, giving us ACID-compliant transactions for reliable ETL, automated lineage for audit-ready governance, and fine-grained access controls to protect sensitive PII and financial data—all while enabling seamless schema evolution to handle the constant changes in source systems. This directly translates to faster, more trustworthy reporting: our dashboards in Power BI and Tableau now pull from a single source of truth, eliminating metric disputes between Risk, Finance, and Compliance teams. On the operational side, native alerting integrated with Slack and PagerDuty, combined with Databricks System Tables for observability, lets us proactively catch data quality issues or SLA breaches before they impact business decisions—reducing incident resolution time by over 60%. Ultimately, Databricks isn't just improving our engineering efficiency; it's enabling us to innovate responsibly in a highly regulated environment, delivering trusted insights at scale while keeping auditors confident and stakeholders aligned.
Versatile Platform, But Needs Faster Analysis
What do you like best about the product?
I like Databricks because it allows me to perform multiple tasks on a single platform, which isn't possible with some other cloud service platforms. This functionality is particularly useful for managing database tasks efficiently and is a capability I can't find in other platforms.
What do you dislike about the product?
Many times, when something goes wrong in Databricks, it takes some time to analyze. It would be better if they gave faster and more accurate answers.
What problems is the product solving and how is that benefiting you?
I use Databricks for migration projects. It allows me to perform multiple tasks on a single platform, which I can't do on other cloud platforms.