Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
761 reviews
External reviews are not included in the AWS star rating for the product.
Streamlined Data Processing with Unmatched Speed
What do you like best about the product?
I use Databricks for real-time data ingestion and processing as well as batch processing. I find it easy to use with PySpark, and I appreciate that it serves as a single platform for both real-time and batch processing. The in-memory processing drastically reduces processing time, and working with dataframes makes handling structured data straightforward. I like the fast execution and the ability to clean, massage, and manipulate data all on the same platform. It's also easy to deploy, and I enjoy the smooth CI pipeline with just one click. The initial setup was quite easy, and the product support made it a cakewalk.
What do you dislike about the product?
Databricks should offer an integrated agentic framework, making it a one-stop shop for data and AI.
What problems is the product solving and how is that benefiting you?
Databricks offers an easy-to-use platform for both realtime and batch processing. It integrates easily with PySpark and supports in-memory processing, significantly reducing processing time. Dataframes make handling structured data simpler.
Great UI and a Straightforward, Linear Learning Curve
What do you like best about the product?
The UI is great compared to other providers. It’s easy to work with, and the learning curve feels linear and straightforward.
What do you dislike about the product?
Consumption-based costs are on the higher side, and it can be difficult for users who aren’t proficient in Python or Spark.
What problems is the product solving and how is that benefiting you?
A centralised data warehouse, with notebooks running on top of it for further analysis and ML use cases.
Unified Platform with Scalability and ML Power for Big Data
What do you like best about the product?
I like Databricks for its unified platform, which brings data engineering, analytics, and machine learning together. It simplifies workflow scaling and is easy for handling big data. The collaboration across the team is much smoother, which I really appreciate.
What do you dislike about the product?
I would say cost transparency, perhaps. User-based pricing can be hard to predict. The initial setup and cluster configuration can also feel complex. Better documentation for that would help, and the UI could be more intuitive in some areas.
What problems is the product solving and how is that benefiting you?
I use Databricks to sort ETL pipelines, handle large-scale data efficiently, reduce data processing time, and eliminate data silos. The unified platform improves collaboration between data engineers and scientists, simplifying workflows and making big data management smoother.
Seamless Integration and Scalable Performance with Room for UI Improvement
What do you like best about the product?
I use Databricks to build ETL pipelines and process large-scale data with Spark. I like Databricks most for its seamless integration with Apache Spark, collaborative notebooks, and its ability to handle large-scale data processing efficiently in a unified platform. The seamless Apache Spark integration lets me process huge datasets quickly without worrying about cluster setup, while collaborative notebooks make it easy to work with my team in real-time. The scalable architecture ensures reliable performance even with heavy data workloads. The initial setup of Databricks was fairly straightforward, especially with cloud integration.
What do you dislike about the product?
The UI can feel a bit cluttered at times, cluster startup times can be slow, and the pricing can get expensive for smaller projects or prolonged usage.
What problems is the product solving and how is that benefiting you?
I use Databricks to efficiently process large-scale data, simplify ETL workflows, and collaborate with my team in a unified environment, gaining faster data-driven insights.
Databricks Unifies Engineering and Analytics for Scalable Spark Pipelines
What do you like best about the product?
What I like best about Databricks is that it brings data engineering, processing, and analytics into one platform.
From my perspective, it makes it much easier to build and manage scalable pipelines with Spark without worrying too much about infrastructure.
What do you dislike about the product?
What I dislike about Databricks is that cost control can get tricky if clusters are not managed properly.
Also, debugging distributed jobs is not always straightforward, and sometimes the UI feels a bit heavy when you just want quick insights.
What problems is the product solving and how is that benefiting you?
Databricks solves the problem of handling large scale data processing and fragmented tools.
For me, it brings ETL, streaming, and analytics into one place, which reduces pipeline complexity and speeds up development and troubleshooting.
Powerful Unified Analytics with Seamless Governance and Effortless Scaling
What do you like best about the product?
What I like best about Databricks is its powerful and unified analytics ecosystem. Features like Unity Catalog and Metastore make data governance and access control seamless, while the Lakehouse architecture combines the best of data lakes and warehouses. PySpark support, dbutils, and collaborative workspaces make development efficient, and serverless compute simplifies scaling without infrastructure overhead.
What do you dislike about the product?
What I dislike about Databricks is the slow startup time of all-purpose clusters, which can interrupt workflow and reduce productivity. Additionally, Git integration can feel a bit sluggish at times, especially during commits or syncing, making version control less seamless than expected.
What problems is the product solving and how is that benefiting you?
Databricks solves the challenge of managing end-to-end data workflows by providing a unified platform for data engineering, data science, and analytics. It allows seamless data processing, transformation, and model development within a single environment.
This benefits me by simplifying my workflow as both a data engineer and data scientist, reducing the need to switch between tools. Additionally, its integration with Azure Data Factory enables smooth job orchestration and triggering for higher environments, making deployments more efficient and reliable.
Unified Data Platform, Minor Cost and Complexity Challenges
What do you like best about the product?
I like that Databricks provides a unified platform for data engineering and data science, eliminating friction across teams and enhancing the ability to accelerate development and deployments. It works especially well for end-to-end CICD pipelines.
What do you dislike about the product?
In terms of what can be improved, I think perhaps cost management. If this can be looked into to make the platform more cost efficient for users, it will go a long way. In addition, the operational complexity can make the platform hard for new users to navigate. If this can be addressed, it should be a lot easier for engineers to work with.
What problems is the product solving and how is that benefiting you?
I use Databricks for scalable workflows across multi-cloud environments, solving data silo unification and minimizing bottlenecks in complex data processing. It optimizes cost and governance while providing a collaborative workspace, real time data ingestion, and enhanced system reliability and performance.
Unified Data Workflows with Databricks
What do you like best about the product?
I really like Databricks for its collaborative lakehouse environment, which has been key in unifying our data engineering and machine learning workflows. It bridges the gap between our engineering and analytics teams, allowing us to run BI and AI on a single platform. Additionally, the initial setup was surprisingly fast from a workspace perspective, especially with the native integration in Azure.
What do you dislike about the product?
The learning curve is quite steep for non-engineers. We've also had to be very diligent with cost monitoring as auto scaling clusters quickly lead to unexpected expenses if not managed strictly.
What problems is the product solving and how is that benefiting you?
Databricks solved our data stack fragmentation by unifying storage lakes and warehouses. It bridged the gap between engineering and analytics, letting us run BI and AI on a single platform.
Databricks: Unified Lakehouse Platform with Powerful Spark Performance
What do you like best about the product?
I work as a data management specialist and use Databricks regularly for data pipelines, large-scale data processing, and governance tasks. What I like most is that Databricks provides a single unified platform for data engineering, analytics, and AI instead of multiple tools; everything is available in one place. The lakehouse architecture is very useful because it combines data warehouse and data lake capabilities, so we can manage both structured and unstructured data efficiently. Performance is very strong, especially with Apache Spark: it can process very large datasets quickly. I also like the collaborative notebooks, where teams can work together using SQL, Python, or Scala.
What do you dislike about the product?
One issue is the steep learning curve, especially for new users who are not familiar with Spark or distributed systems. Cost management can also be challenging: if clusters are not optimized properly, it can become expensive. The platform can also feel complex at times, with many features and configuration options that are difficult for smaller teams to manage. It is a powerful platform, but complexity and cost control are the main challenges in daily use.
What problems is the product solving and how is that benefiting you?
Databricks solves the problem of managing large-scale data processing and multiple data tools in a single platform. Before using Databricks, our data was spread across different systems, and we had to use separate tools for ETL, storage, and analytics, which made workflows complex and difficult to manage. Databricks brings everything together in one place, so we can build data pipelines, process large datasets, and run analytics without switching tools. It also handles big data efficiently using distributed processing, which reduces processing time and improves performance. For me, it has made data workflows more organized, reduced manual effort, and improved data reliability. It helps with faster data processing, better collaboration, and more efficient data management.
Seamless, Collaborative Platform That Scales for Data Engineering and ML
What do you like best about the product?
Databricks' ability to seamlessly integrate everything is what I find most appealing. When working on actual projects, it really makes a big difference that you don't have to switch between several tools for data engineering, analysis, and machine learning.
The collaborative element is very noteworthy. Teams may easily collaborate without things becoming messy thanks to the notebooks' fluid and dynamic feel. For significant data work, it resembles Google Docs almost exactly.
I also really like how efficiently it manages large amounts of data without making it seem difficult. Even when working with large datasets, the platform feels user-friendly and can be scaled up when necessary.
Additionally, it makes perfect sense from an AI/ML standpoint. You are able to construct,
What do you dislike about the product?
Databricks can initially feel a little overwhelming, which is something I don't like. Clusters, notebooks, jobs, workflows—there's a lot going on, and if you're new, it takes some time to truly grasp how everything works together.
Cost control is another drawback. It is undoubtedly strong, but expenses might quickly increase if you are careless with cluster usage or auto-scaling settings. To keep everything under control, you need to exercise some self-control and keep an eye on things.
What problems is the product solving and how is that benefiting you?
The fragmentation issue in the data and AI workflow is primarily resolved by Databricks. In the past, data storage, processing, analysis, and machine learning were usually done using different tools, and getting them all to cooperate was frequently difficult and time-consuming. Databricks eliminates a lot of the friction by combining all of it into a single platform.
That makes the development process much more seamless for me. I don't have to worry about compatibility problems or waste time switching between environments. I can perform transformations, clean data, and create models all in one location, which reduces setup time and keeps things organized.
It also addresses the difficulty of handling massive amounts of data.
I can rely on its distributed computing capabilities to manage demanding workloads rather than worrying about infrastructure or performance optimization from scratch. This allows me to concentrate less on resource management and more on finding a solution to the real issue.
Collaboration is another major issue it resolves. Sharing code, findings, and experiments can get disorganized in team environments. Because everything is consolidated with Databricks, it's simpler to work together, monitor changes, and maintain alignment.
All things considered, it helps me by cutting down on complexity, saving time, and allowing me to concentrate more on developing solutions—whether they be analytics, machine learning models, or data pipelines—instead of handling the overhead of maintaining numerous tools and platforms.