
Reviews from AWS customer

10 AWS reviews

External reviews

768 reviews

External reviews are not included in the AWS star rating for the product.


    Sivabalan A.

Unified Data Engineering, Science, and Analytics in One Collaborative Platform

  • April 02, 2026
  • Review provided by G2

What do you like best about the product?
What I appreciate most about Databricks is its ability to unify data engineering, data science, and analytics on a single platform. The collaborative environment—especially the notebooks and integrated workflows—makes it much easier for teams with different skill levels to work together without constant context-switching.

Another highlight is the integration with popular tools and cloud services that are widely used in the market today, which makes it easier to move data between them. The performance monitoring and job scheduling features help maintain visibility over pipelines, and the Delta Lake support for reliable data management has also been very useful.
What do you dislike about the product?
Cost management is one area that could be improved. While Databricks offers autoscaling and flexible cluster options, it’s easy for resource usage to escalate unexpectedly, especially with large datasets and long-running jobs. Keeping costs predictable often requires careful oversight and a solid understanding of the platform’s pricing model.

Additionally, some of the more advanced features—such as fine-grained access controls and more complex job orchestration—can feel less intuitive. The documentation is extensive, but it occasionally leaves gaps that end up requiring trial and error.
What problems is the product solving and how is that benefiting you?
Databricks addresses several key challenges in modern data workflows, particularly around scalability, data reliability, and collaborative analytics. One major problem it solves is managing and processing large-scale datasets efficiently. By leveraging Apache Spark’s distributed computing framework, Databricks enables parallelized ETL pipelines and large-scale data transformations that would be impractical on traditional infrastructure.

Another challenge is ensuring data consistency and reliability across pipelines. With Delta Lake, Databricks provides ACID-compliant storage, versioned tables, and schema enforcement, which reduces data errors and simplifies data governance. This is especially beneficial when multiple teams are working on different stages of data pipelines at the same time.

Databricks also helps solve the problem of fragmented workflows for data scientists and engineers. Its unified environment supports multiple languages (Python, SQL, R, Scala) and includes integrated machine learning with MLflow, making it easier to collaborate and move from data preparation to analytics and ML in one place.


    Yuvi M.

Databricks Streamlined Our ETL Migration with Delta Lake and Unified Analytics

  • April 02, 2026
  • Review provided by G2

What do you like best about the product?
Databricks transformed my day-to-day workflow, taking me from constant SQL Server/ADF headaches to scalable, unified analytics. Migrating stored procedures into Spark SQL notebooks was surprisingly smooth, and using Delta Lake MERGE instead of complicated UPDATE logic saved me weeks of rewriting.

The most helpful features for me have been Delta Lake’s ACID transactions and schema evolution, which handle my sparse shipment loads really well. Unity Catalog has also been a big win because it eliminates the back-and-forth of RDS access tickets by enabling governed table sharing. On top of that, Genie turns natural-language requests into production-ready Spark SQL almost instantly.

On the upside, autoscaling clusters have cut costs by about 70% compared with ADF’s always-on pipelines. I also like being able to combine PySpark and SQL in a single notebook, which makes complex joins and subqueries much easier to manage. And I don’t miss the old NOLOCK hint debates—built-in optimizations take care of that.

If you’re migrating ETL pipelines, Databricks removes a lot of the SQL-to-cloud friction while still scaling to enterprise volumes without breaking the bank.
What do you dislike about the product?
The cluster reconnects fairly often, which can be disruptive during active work sessions. Also, when I run complex or heavy queries, I notice clear lag in response times, and that slowdown can hurt productivity.
What problems is the product solving and how is that benefiting you?
Databricks has helped us centralize our data engineering and analytics workflows into a single, unified platform. It addresses the challenge of managing large-scale data pipelines by enabling our team to process and transform massive datasets efficiently with Spark. The collaborative notebook environment has also boosted productivity, making it easier for data engineers and analysts to work together. Overall, it has significantly reduced the time we spend on data preparation and has allowed us to focus more on deriving insights.


    Janani D.

Scalable Power with Manageable Trade-offs

  • April 02, 2026
  • Review provided by G2

What do you like best about the product?
The collaborative notebooks are hands-down my favorite part of Databricks. I love being able to jump into a notebook with my team, tweak Spark SQL queries live on those massive shipment datasets, and watch everything sync instantly, without any version-control headaches.

It beats emailing notebooks back and forth or wrestling with merge conflicts; it feels like pair programming, but for data pipelines. And when you pair that with Delta Lake’s reliability for keeping my ETL jobs rock-solid on intermodal lane data, it ends up being a huge workflow saver.

Top notebook perks for me are the real-time editing and sharing that keeps everyone aligned during debugging, the built-in version history that lets me roll back mistakes quickly, and the seamless Spark integration so I’m not constantly context-switching when doing big data transforms.
What do you dislike about the product?
One key drawback is the cost management—charges can accumulate rapidly if clusters are left running, requiring careful monitoring of DBU usage and auto-termination settings.

Debugging intricate Spark job failures in notebooks often involves sifting through extensive log output, which extends resolution time considerably. Additionally, the UI experiences occasional performance delays under high workloads, impacting efficiency when responsiveness is essential.
What problems is the product solving and how is that benefiting you?
Databricks addresses core challenges in managing large-scale data processing, such as scalability limitations in traditional databases and the complexity of integrating disparate tools for ETL workflows. It enables distributed Spark processing across clusters to handle massive datasets efficiently, while Delta Lake provides ACID-compliant storage to ensure data integrity amid evolving schemas or concurrent updates.
This benefits me by streamlining pipelines that feed BI tools, reducing processing times from days to hours and minimizing manual infrastructure oversight. Collaborative notebooks further enhance team productivity through real-time editing, eliminating version control issues and accelerating development cycles.


    Information Technology and Services

Databricks Unifies Data and AI for Effortless ML at Scale

  • April 02, 2026
  • Review provided by G2

What do you like best about the product?
What I like most about Databricks is how it brings data and AI into one place, so you’re not jumping between tools.
It makes building and scaling ML models feel much more straightforward, especially with built-in experiment tracking.
The integration with Apache Spark helps handle large datasets without extra setup.
Overall, it just reduces the friction between raw data and actually getting useful AI outcomes.
What do you dislike about the product?
One thing I find challenging with Databricks is cost visibility: it can scale quickly, and predicting spend isn’t always straightforward.
There’s also a bit of a learning curve, especially when working across notebooks, jobs, and cluster configs.
And for simpler use cases, it can feel like overkill compared to lighter-weight solutions.
What problems is the product solving and how is that benefiting you?
Databricks solves the problem of fragmented data and AI workflows by bringing everything (data engineering, analytics, and ML) into one platform.
It eliminates the need to move data across multiple systems, which reduces latency and pipeline complexity.
For me, that means faster experimentation and smoother deployment of AI models without worrying about infrastructure.
Overall, it helps focus more on solving business problems rather than managing tools.


    Pathan I.

Databricks: A Powerful Unified Platform with Room for Cost and Configuration Optimization

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
What I like best about Databricks is its ability to unify data engineering, analytics, and machine learning on a single collaborative platform.
What do you dislike about the product?
What I dislike about Databricks is that it can become expensive if clusters are not properly managed, especially when left running idle.
What problems is the product solving and how is that benefiting you?
Databricks solves the problem of managing and processing large‑scale data by unifying data engineering, analytics, and machine learning on a single platform.


    Dinesh Kumar D.

Efficient ETL and AI-Driven Data Validation

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
I like the AI-supported environment in Databricks, which I use extensively for ETL tasks and experimental AI/BI dashboards. It's really helpful for fixing code issues and handling logic implementation efficiently. The DLT feature is also a great addition for supporting streaming data. I find Delta Lake very useful for reliable data handling with its ACID transactions, schema enforcement, and reliable versioned data. Notebooks make it easy to develop, test, and debug data logic interactively. I also appreciate the workflows for automating and scheduling pipelines, which improve reliability and reduce manual effort. Databricks is cost-effective compared to other platforms like Synapse and Snowflake, and it's easy to track versions and handle failures. The initial setup was straightforward, with workspace creation and cluster setup being fairly easy for my team.
What do you dislike about the product?
In my experience with DLT, once a table was set up in one DLT flow, I could not create another flow that used the same table. From a business perspective, it's normal to use one table as a base for different reports that require different refresh timings.
What problems is the product solving and how is that benefiting you?
I perform ETL tasks and reporting with Databricks. It helps set up streaming data using DLT, and features like Delta Lake enhance data quality. Notebooks support interactive logic development, while workflows automate pipeline scheduling, reducing manual effort.


    Dharun T.

Streamlined, Collaborative Data Workflows with Powerful Performance

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
What I like most about Databricks is how it streamlines the entire data workflow by bringing processing, analysis, and machine learning into one platform. The collaborative notebook environment makes it easy to share code, context, and reasoning with teammates, which helps everyone stay aligned. It also performs strongly on large datasets while abstracting away most of the cluster management, so I can focus on solving the problem rather than dealing with infrastructure. On top of that, centralized access control and clear visibility into data usage support responsible data governance, offering a solid balance between power and ease of use.
What do you dislike about the product?
Databricks has a few downsides, although many of them feel more like trade-offs than outright negatives. My biggest concern is cost: if clusters aren’t managed carefully, expenses can climb quickly, even though the platform can scale very efficiently when it’s tuned properly. There’s also a real learning curve with Spark and distributed computing concepts, and debugging or performance tuning can be more involved than with simpler tools. Lastly, because it’s a managed service, you give up some low-level control compared with self-hosted systems, but the upside is that it takes a lot of the operational and infrastructure work off your plate.
What problems is the product solving and how is that benefiting you?
Because my client needs secure, reusable code, Databricks helps us write Python efficiently while applying OOP principles and design patterns. It also makes it straightforward to extend functionality over time and build custom code that interacts with APIs and databases.


    Magesh kumar N.

Effortless Setup, Minimal Configuration Required

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
I use Databricks to create pipelines and data models, and I really like its minimal need for configuration. It helps me reduce the time spent on configuring accounts and processes. Databricks manages these tasks well, making my work easier. The initial setup was straightforward too, thanks to the guidance provided through the playground feature.
What do you dislike about the product?
My suggestion is for Genie to be updated to include validations and table mapping.
What problems is the product solving and how is that benefiting you?
I find Databricks makes my work easy by minimizing the need for configuration and automating workflows, saving me time.


    Akanksh M.

All-in-One Platform for Data Engineering, ML, AI, and Data Management

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
It brings all the tech stacks together in one platform—data engineering, machine learning, AI, and data management—so everything is in one place. It also includes advanced features that make the platform feel complete and capable.
What do you dislike about the product?
We need more open-source, direct connectors to both legacy and current-generation platforms to enable better data extraction. These connectors should support real-time extraction as well as real-time data rendering.
What problems is the product solving and how is that benefiting you?
It brings all types of data into one place, which makes data and access management easier. I can build data warehouses and then send the data downstream to AI/BI dashboards and ML models, which is very useful. Special features like the feature store, serving endpoints, AI/BI dashboards, and Genie help me understand the data, work with it more effectively, and ultimately reach my goals.


    Vijayaramuprawin V.

All-in-One Platform That Helps Us Iterate Fast and Deploy with Confidence

  • April 01, 2026
  • Review provided by G2

What do you like best about the product?
We use Databricks daily as our core data platform for building and running pipelines across a medallion architecture, from extracting data out of SAP and Arkieva all the way to reporting-ready datasets. The notebook experience is intuitive, the feature set is massive, and Asset Bundles have made our CI/CD story with Azure DevOps really solid. Integration with cloud services was smooth, and once things are set up they just work. The learning curve can be steep for newer team members, especially around things like Unity Catalog and DABs, and costs can creep up if you're not staying on top of cluster configurations. Support is decent and the docs are strong enough that we rarely need to open a ticket. Overall, it's a powerful platform that does a lot under one roof, and it's hard to imagine our data engineering workflow without it.
What do you dislike about the product?
The cost can creep up fast if you're not careful with cluster sizing and job configurations, so it takes some effort to keep things optimized. Also, the learning curve for newer team members can be steep, especially around things like Asset Bundles, Unity Catalog, and getting the CI/CD pieces wired up properly.
What problems is the product solving and how is that benefiting you?
Databricks is solving the problem of having fragmented data spread across multiple systems like SAP and Arkieva by giving us one unified platform to extract, transform, and serve it all. That means our business teams get clean, reliable, reporting-ready data without us having to juggle a bunch of separate tools, and we can deploy and manage everything consistently across environments with confidence.