Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
768 reviews
External reviews are not included in the AWS star rating for the product.
Unified Databricks Workspace That Streamlines Collaboration and Complex Data Workflows
What do you like best about the product?
What I like best about Databricks is how it brings data engineering, analytics, and machine learning into one unified workspace. I find collaboration much easier with shared notebooks, and the seamless integration with big data tools saves me time. It simplifies complex workflows while still offering powerful capabilities when I need them.
What do you dislike about the product?
One thing I dislike about Databricks is that it can feel expensive, especially for smaller projects or teams. I also find cluster configuration and cost management a bit complex at times. The interface, while powerful, can be overwhelming for beginners, and debugging distributed jobs isn’t always as straightforward as I’d like.
What problems is the product solving and how is that benefiting you?
Databricks solves the challenge of handling large-scale data processing, analytics, and machine learning in one place. For me, it removes the hassle of managing separate tools and infrastructure. I benefit by working more efficiently, collaborating easily with my team, and turning complex data into useful insights faster, with less operational overhead overall.
Efficient Unified Platform for Scalable Data Processing
What do you like best about the product?
I like how Databricks simplifies big data processing with a unified platform for data engineering, analytics, and machine learning. Its seamless integration with Spark and its scalability make handling large datasets much more efficient.
What do you dislike about the product?
The cost can become quite high with heavy usage, especially if clusters aren’t optimized. Also, debugging and monitoring jobs can sometimes feel less intuitive compared to traditional tools.
What problems is the product solving and how is that benefiting you?
Databricks solves the challenge of processing and managing large-scale data efficiently by providing a unified platform for ETL, analytics, and machine learning. It benefits me by simplifying pipeline development, improving performance with Spark, and reducing the need to manage multiple tools.
Powerful Lakehouse Platform with Strong Collaboration
What do you like best about the product?
Databricks is a powerful data lakehouse platform that brings data engineering, AI/ML, and SQL analytics together in one collaborative workspace.
What do you dislike about the product?
The downside of Databricks is that it can be costly, especially with frequent cluster usage and poorly optimized workloads.
What problems is the product solving and how is that benefiting you?
Databricks helps solve the challenge of working with large volumes of data by bringing data engineering, analytics, and AI/ML into one unified platform.
Streamlines Data Engineering with Ease
What do you like best about the product?
I really appreciate Databricks for its manageability. The cluster management, unified workspace, optimization, and versioning are all aspects I find incredibly valuable. The console has all the tools readily available, which is super convenient for our large scale data engineering projects. Also, the initial setup was super easy, making it a smooth transition into using the platform.
What do you dislike about the product?
Nothing much.
What problems is the product solving and how is that benefiting you?
I use Databricks for large scale data analysis, processing, and machine learning. It makes cluster management, workspace unification, optimization, and versioning easy with all tools handy in the console.
Databricks as a Hands On Data Engineer: Solving Real World ETL, Governance, and Lakehouse Challenges
What do you like best about the product?
I believe the most attractive thing about Databricks is its all-in-one nature, which makes data management easier. Previously, when I used several tools for data-related activities, the experience was not great, but here everything seems interconnected and straightforward.
The ability to use notebooks, especially when working with PySpark, is another advantage of Databricks that I really like. It lets me make and run changes quickly without excessive preparation. It also helps collaboration within my team, whose members can work on their projects simultaneously and monitor overall progress. However, version control can sometimes seem a bit unclear, in my view.
In terms of performance, Databricks seems efficient to me at handling big data and runs smoothly without delays. Cluster scaling occurs automatically, which saves me and my team time at the infrastructure level. It is easy because no additional planning or adjustments are required.
There are minor issues with the UI, which sometimes works slowly. Overall, though, its other strengths, such as easy ways of implementing and integrating things, encourage me to use Databricks frequently.
What do you dislike about the product?
One aspect of Databricks that I dislike is its UI. The longer you spend in the tool, the more annoying it can become to move between notebooks and clusters.
The other problem is cost, which can add up quickly when we are not careful. Unnecessary clusters may keep running longer than required without my or my team's knowledge, which drives up the costs of our projects.
Debugging errors is also complex and sometimes difficult, because it takes extra effort to find out where things went wrong, mainly when dealing with complex pipelines.
At times, there are also inconsistencies in customer service, which can send us in directions we do not need to go.
What problems is the product solving and how is that benefiting you?
The most important issue that Databricks resolves is working with large volumes of data while maintaining consistency. Previously, data engineering, analytics, and machine learning operations were separate processes that required separate tools, which made them difficult for me to handle. Now they are all in one place.
Another critical issue Databricks solves is processing large data volumes. Using Spark and distributed computing, it handles tasks that were extremely slow on the legacy systems I worked with. This has helped speed up my pipeline, although delays sometimes occur.
Collaboration is another problem that Databricks addresses. Multiple users can work on the same notebook or datasets. Collaboration used to be confusing; now it is easy to understand, mainly because sharing notebooks and assets is simple.
Scalability is another issue Databricks resolves: there is no need to pay attention to infrastructure management. Cluster scaling follows user requirements, which saves time. Previously, I had to pay much more attention to infrastructure configuration.
Reliable data platform with powerful pipeline support
What do you like best about the product?
What I like best about Databricks is how it brings data engineering, analytics, and machine learning together in one clean workspace. It saves time, makes collaboration easier, and helps teams move faster with large data.
What do you dislike about the product?
What I dislike about Databricks is that Auto Loader can become frustrating when source data changes frequently, especially if column names or datatypes shift without warning.
For example, a field like customer_id may suddenly come in as cust_id, or a column that was previously a string may start arriving as an integer, which can cause schema drift and break downstream processing.
I also find it inconvenient when schema inference is not fully accurate, such as when nested JSON or semi-structured data is read incorrectly, because it then requires extra manual fixes and maintenance to keep pipelines running smoothly.
What problems is the product solving and how is that benefiting you?
Databricks is solving the problem of building and managing data pipelines at scale without so much manual effort. It helps with reliable ingestion, schema evolution, and orchestration, so teams can process data faster and keep pipelines more stable even when source files change.
For me, that means less time spent fixing broken jobs and more time focusing on transforming and using the data. It also benefits me by making batch and streaming workflows easier to manage in one platform, which is especially useful when data keeps changing.
Databricks: Unified Platform for Data Processing and Analytics
What do you like best about the product?
I like that Databricks brings everything into one place, making it unnecessary to use different tools for data processing, analytics, and pipeline work. It handles large data well, and we don't have to worry about managing clusters manually. Additionally, Databricks handles collaboration and experimentation well, making it easy to try out new things.
What do you dislike about the product?
From my point of view, the one area that can be improved is cost management. If clusters aren't monitored carefully, costs can increase faster than expected. Better visibility into costs at a more detailed level would help, as would more built-in alerts or recommendations when costs start increasing unexpectedly.
What problems is the product solving and how is that benefiting you?
Databricks helps us handle large datasets and build data pipelines. It simplifies data processing, transformation, and analysis using Spark and SQL, all in one place. It solves the problem of slow data processing spread across systems by managing infrastructure automatically and facilitating collaboration and experimentation.
Transforms Table Data into Trustworthy Visuals with Helpful Debugging
What do you like best about the product?
I like the concept of transforming data into visuals for each table. Genie Code also helps with debugging and validating the data, which makes it easier to trust what I’m working with.
What do you dislike about the product?
As a proprietary platform built on open-source foundations, it can still introduce vendor lock-in risks, particularly through components such as Unity Catalog and its custom APIs.
What problems is the product solving and how is that benefiting you?
Databricks primarily solves the longstanding challenges of fragmented data architectures by introducing the Lakehouse paradigm. It combines the low-cost, scalable storage of data lakes with the reliability, ACID transactions, and performance of traditional data warehouses. This eliminates data silos, reduces costly ETL duplication, and provides a single unified platform for structured, semi-structured, and unstructured data.
End-to-End Data Management with Databricks
What do you like best about the product?
I like the fact that Databricks helps me manage data end to end, from ingestion to analytics to reporting and even governance. Within the platform, I'm able to build my pipelines to integrate and adjust data. I can also build dashboards, create reports, share them with my stakeholders, and ensure that the right people have access to the correct datasets and reports. The initial setup was pretty easy, and taking some training on the Databricks Academy was really helpful.
What do you dislike about the product?
The layout of the view of the portal could be nicer if it was a bit more colorful.
What problems is the product solving and how is that benefiting you?
Databricks solves a lot of problems by helping me build data pipelines, create a central source of truth, and maintain data security.
All-in-One Powerhouse with Room for Pricing Clarity
What do you like best about the product?
I like that Databricks is an all-in-one powerhouse where I can do multiple kinds of work in one place. It's powerful to manage data from multiple sources and have it in a single UC to manage permissions with row-level security. I also appreciate that I can create experiments, run multiple models, and select the best one from logs, which was difficult on other platforms. Once I learned the setup, it's been easy and comfy to work with.
What do you dislike about the product?
I find it difficult to use the calculator to determine CPU serving endpoint prices because the documentation doesn't explicitly explain this. It only mentions 1 concurrency equals 1 DBU on the Azure page, which isn't clear. The pricing calculator has a single option for serving endpoints, labeled as medium with four DBU, but lacks separate options for GPU or CPU and their concurrency, making it hard to understand how it works properly. Initially, I also felt it was very tough to learn Databricks and manage deployments of workspaces, although it became easier over time.
What problems is the product solving and how is that benefiting you?
Databricks consolidates multiple tools into one platform, making it powerful and convenient. I can manage permissions with row-level security and easily run experiments to select the best models, all in one place.