Reviews from AWS customer

10 AWS reviews

External reviews

741 reviews

External reviews are not included in the AWS star rating for the product.


5-star reviews

    Gopi S.

Versatile Platform, But Needs Faster Analysis

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
I like Databricks because it allows me to perform multiple tasks on a single platform, which isn't possible with some other cloud service platforms. This functionality is particularly useful for managing database tasks efficiently and is a capability I can't find in other platforms.
What do you dislike about the product?
I have often run into issues in Databricks that take some time to analyze. It would be better if they gave faster and more accurate answers.
What problems is the product solving and how is that benefiting you?
I use Databricks for migration projects. It allows me to perform multiple tasks on a single platform, which I can't do on other cloud platforms.


    Syed F.

Unified Data Engineering, Analytics, and ML on a Scalable Databricks Platform

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
What I like most about Databricks is how it brings data engineering, analytics, and machine learning together in one platform. It streamlines the entire data pipeline—from ingestion and transformation through to serving—so I don’t have to rely on multiple separate tools to get end-to-end workflows done.

Its integration with Spark and Delta Lake is another big plus, making it both scalable and dependable when working with large datasets.
What do you dislike about the product?
One challenge with Databricks is cost management and visibility. Since compute is abstracted through clusters and jobs, it can sometimes be difficult to track and optimize costs without additional monitoring or governance in place.
What problems is the product solving and how is that benefiting you?
Solves the problem of fragmented data ecosystems, where data engineering, analytics, and machine learning are handled in separate tools.


    Janakiraman K.

Databricks Brings Spark, Delta, and ML Together with Effortless Auto-Scaling

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
Databricks is hands down my favorite platform for data engineering because it brings everything together in one place: Spark processing, Delta Lake, and ML tools all play nicely together without the usual headaches. The auto-scaling clusters save tons of time on big ETL jobs, like the SAP integrations I've done, letting me focus on logic instead of babysitting resources. Unity Catalog has been a game changer for governance in our lakehouse setups too.
What do you dislike about the product?
Costs can sneak up fast if you're not watching usage closely, especially with premium features on large pipelines. The notebooks are great for prototyping but get messy in production without strict discipline. Setup for advanced stuff like custom Unity Catalog policies sometimes feels overly complex for what it delivers.
What problems is the product solving and how is that benefiting you?
Databricks tackles key data engineering headaches like scaling massive Spark jobs, data quality issues, and siloed teams by providing a unified lakehouse platform with Delta Lake for ACID transactions and reliable pipelines. When I have a large number of files or tables to process, as in supply chain ETL from SAP systems, it shines with optimized Delta processing, serverless compute, and the Photon engine, slashing run times from days to hours while cutting costs through auto-scaling. This benefits me directly by speeding up project delivery, reducing debugging time on failures, and enabling seamless collaboration with analysts on notebooks without tool switches.


    Sabareeswara S.

All-in-One Databricks Platform with Strong Governance, Fast Spark Performance, and Genie

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
The all-in-one platform eliminates tool sprawl. Unity Catalog gives you governance, lineage, and discoverability without bolting on a separate catalog. The notebook UI is clean and makes iterating on PySpark fast. Genie is the standout AI feature: it turns curated tables into natural language interfaces for business users, and the SDK lets you configure it programmatically so it stays maintainable. DLT handles pipeline orchestration well. Performance on Spark workloads is solid, especially with Photon. Integrations with Airflow, S3, and the broader ecosystem are straightforward. As for ROI, consolidating what used to require multiple tools into one platform pays for itself in reduced complexity.
What do you dislike about the product?
Pricing can be hard to predict. Compute costs scale quickly if you're not careful with cluster sizing and SKU selection, and it's not always obvious which workload tier you actually need until you see the bill. The notebook IDE, while functional, still lags behind a real editor for refactoring, multi-file navigation, and code review workflows.
What problems is the product solving and how is that benefiting you?
Tool consolidation is the biggest one. Before, you'd need separate systems for ingestion, transformation, warehousing, governance, and serving, each with its own learning curve, maintenance overhead, and integration headaches. Databricks collapses that into a single platform. Unity Catalog solves the data governance problem by giving you lineage, access control, and discoverability in one place instead of managing permissions across disconnected systems.


    Jananisree T.

Databricks: A Unified, Scalable Platform for Faster Collaboration and Innovation

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
Databricks stands out because it provides a unified platform that seamlessly combines data engineering, machine learning, and analytics, making collaboration across teams much easier. I especially appreciate how it simplifies working with big data by integrating with popular tools like Apache Spark, offering scalability, and enabling faster experimentation. The collaborative notebooks, strong support for multiple programming languages, and built-in security features make it both powerful and user-friendly. Overall, it helps accelerate innovation by reducing complexity and improving productivity across the entire data lifecycle.
What do you dislike about the product?
One drawback of Databricks is that it can feel overwhelming for new users because of its complexity and steep learning curve. The platform offers a wide range of powerful features, but navigating them effectively often requires significant technical expertise. Additionally, costs can escalate quickly if clusters are not managed carefully, and performance tuning sometimes demands deep knowledge of Spark internals. Integration with certain external tools can also be less seamless compared to other platforms.
What problems is the product solving and how is that benefiting you?
Databricks is solving the challenge of managing and analyzing massive amounts of data by providing a unified platform for data engineering, machine learning, and analytics. It eliminates the need to juggle multiple tools, making workflows more streamlined and collaborative. For me, this means faster access to insights, easier experimentation with models, and reduced complexity in handling big data. The benefit is clear: improved productivity, better collaboration across teams, and quicker decision-making powered by reliable data.


    Sivabalan A.

Databricks: Feature-Rich, User-Friendly, and Keeps Everything in One Platform

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
Among the various platforms I’ve worked with, Databricks stands out as a genuinely cohesive environment. It feels less like a bundle of disconnected features and more like a unified workspace—one that can evolve alongside the teams using it. The interface is intuitive enough to lower the barrier to entry, while still delivering the depth and power needed for heavy-duty engineering.

One of its biggest strengths is how it consolidates the data lifecycle. By bringing engineering, data science, and SQL analytics under one roof, it helps dissolve the silos that often lead to “data drift” and miscommunication between departments. In practice, it also simplifies the underlying infrastructure, replacing a dozen specialized (and sometimes conflicting) tools with a single, clearer source of truth.

Beyond simply “keeping things clean,” the platform also shines when it comes to collaborative transparency. With notebooks and experiments shared in real time, the gap between an initial data idea and a production-ready model can be dramatically shortened. On top of that, its commitment to open standards like Delta Lake means you’re not boxed into a proprietary black box—you’re building on a foundation that aligns with the broader data community’s direction. Overall, it strikes a rare balance: a polished, user-friendly wrapper around some of the most powerful distributed computing engines available today.
What do you dislike about the product?
The “Big Task” Breakdown

When Genie processes a large volume of data, it often ends up sending a huge amount of JSON back to the browser so it can render those tables and visualizations.

Memory overload: Browsers (and especially Chrome) can be real memory hogs. If a Genie response includes a very large result set or a massive execution plan, RAM usage can spike quickly, which can lead to that familiar “Not Responding” hang.

The “DOM” lag: Every row in a table and every line of code becomes an element the browser has to keep track of. As you scroll or type, the browser has to repaint thousands of these elements. When the task is too large, the browser’s main thread can get tied up rendering, and your typing starts to feel like it’s trailing behind by a few seconds.
What problems is the product solving and how is that benefiting you?
You’ve nailed the core reason Databricks is winning over so many data teams: they’re reducing the “integration tax.” In most companies, you can easily lose around 30% of your time just moving data between the “storage” tool, the “processing” tool, and the “BI” tool.

The AI/BI Dashboard is a great example of this broader shift—from a “collection of tools” to a more unified platform.

What began as a basic visualization layer has evolved into a “Compound AI” system. Here’s how it has become so useful:

The “Ask Genie” integration: You’re no longer limited to staring at a static chart. As of 2026, every published dashboard includes an “Ask Genie” button by default. If a stakeholder notices a spike in a line chart, they don’t have to call you; they can right-click the chart and ask, “Genie, why did this drop on Tuesday?” and it will use Agent mode to track down the driver.

Direct-to-warehouse speed: Because it lives inside Databricks, there’s no need to “extract” data to a separate BI server. It queries the data where it already lives (Unity Catalog), which means the dashboard stays as fresh as your last ETL run.

AI-assisted authoring: You can build entire widgets just by describing what you want. Instead of dragging fields around, you can type, “Show me a funnel chart of our sales conversion by region,” and it generates the SQL and the visualization for you.

Deep governance: Since it’s built in, your security policies (row-level security, tags) follow the data automatically. You don’t have to recreate permissions in a separate tool like Tableau or Power BI.


    Hospital & Health Care

Intuitive, Limitless Analytics for End-to-End Data Pipelines

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
It’s very intuitive, and the breadth of data and analytics you can do with it is limitless.

You can create a medallion architecture, create data pipelines, create jobs, dashboards, data governance, etc.
What do you dislike about the product?
I feel like some of the features in newer releases can be a bit buggy at times, but after a while those things usually get better.
What problems is the product solving and how is that benefiting you?
We have a data and analytics platform and we use Databricks as our key vendor. Our relationship with them has been great and they’ve been super helpful the whole way.


    Amit D.

Databricks: A True Unified Analytics & AI Platform That Boosts Speed and Reliability

  • March 26, 2026
  • Review provided by G2

What do you like best about the product?
What I like best about Databricks is how it finally delivered what every data engineer/data professional has been wishing for — a true unified analytics and AI platform.
I remember working across five different tools just to get a single pipeline from ingestion to reporting. Databricks collapsed all of that into one environment, and that changed everything for me.
Delta Lake was the first breakthrough. When it arrived around 2020, ACID transactions and time‑travel immediately eliminated the operational pain we used to consider “normal.” If a job corrupted a table, I could roll back to a previous version in seconds instead of spending hours restoring backups. That reliability alone saved multiple downstream failures.
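The rollback described above can be pictured with a toy version log. This is a minimal sketch in plain Python, not Delta Lake's implementation: `VersionedTable`, `commit`, `read`, and `restore` are hypothetical names, and the `engine_id` / `temp_c` rows are made-up sample data. In Delta Lake itself, the comparable operations are the `VERSION AS OF` query clause and the `RESTORE TABLE` command.

```python
from copy import deepcopy


class VersionedTable:
    """Toy table that keeps every committed snapshot (hypothetical, for illustration)."""

    def __init__(self):
        self._versions = [[]]  # version 0 is the empty table

    @property
    def current_version(self):
        return len(self._versions) - 1

    def commit(self, rows):
        """Store a new snapshot and return its version number."""
        self._versions.append(deepcopy(rows))
        return self.current_version

    def read(self, version=None):
        """Read the latest snapshot, or an older one if a version is given."""
        if version is None:
            version = self.current_version
        return deepcopy(self._versions[version])

    def restore(self, version):
        """Roll back by committing an old snapshot as the newest version.

        Earlier versions, including the bad one, stay in the history.
        """
        return self.commit(self._versions[version])


table = VersionedTable()
good = table.commit([{"engine_id": 1, "temp_c": 612.0}])
bad = table.commit([{"engine_id": 1, "temp_c": None}])  # a faulty job writes bad data
restored = table.restore(good)  # the latest version now matches the good snapshot
```

Because a restore is recorded as one more commit rather than a deletion, the bad version remains readable afterwards, which helps when investigating what went wrong.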
Before Delta existed, our pipelines relied heavily on overwrite patterns because there was no reliable way to apply updates or handle late‑arriving data safely. Overwrites were slow, expensive, and risky — especially for large tables. A single failure during overwrite could leave the table in a half-written, inconsistent state. Processing took longer, compute costs shot up, and recovery often meant manually rebuilding partitions from scratch.
The ROI became obvious as soon as we used Databricks end‑to‑end. Because one platform handles ingestion → transformation → ML → BI → governance, we retired entire categories of legacy tools and reduced operational overhead dramatically.
Then Genie arrived — and it genuinely transformed my day‑to‑day work.
I once needed a PySpark module for data quality checks. Genie generated the full logic — null checks, schema validation, aggregations — in seconds. Instead of spending 30 minutes writing boilerplate, I spent 3 minutes refining the logic. It shifted my focus from syntax to decisions.
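As a rough illustration of the kind of checks mentioned above (null checks, schema validation, aggregations), here is a minimal sketch in plain Python rather than PySpark, so it runs without a cluster. The function names, the `EXPECTED_SCHEMA` mapping, and the sample rows are all assumptions made for this example, not the reviewer's module or Genie output.

```python
# Hypothetical expected schema: column name -> Python type.
EXPECTED_SCHEMA = {"engine_id": int, "temp_c": float}


def null_counts(rows, columns):
    """Count missing (None or absent) values per column."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if row.get(col) is None:
                counts[col] += 1
    return counts


def schema_violations(rows, schema):
    """Return (row_index, column, reason) for missing columns or wrong types."""
    problems = []
    for i, row in enumerate(rows):
        for col, expected in schema.items():
            if col not in row:
                problems.append((i, col, "missing column"))
            elif row[col] is not None and not isinstance(row[col], expected):
                problems.append((i, col, f"expected {expected.__name__}"))
    return problems


def row_count_by(rows, key):
    """Simple aggregation: number of rows per value of `key`."""
    counts = {}
    for row in rows:
        counts[row.get(key)] = counts.get(row.get(key), 0) + 1
    return counts


# Made-up sample data with one null and one wrongly typed value.
rows = [
    {"engine_id": 1, "temp_c": 612.0},
    {"engine_id": 1, "temp_c": None},
    {"engine_id": 2, "temp_c": "hot"},
]
nulls = null_counts(rows, EXPECTED_SCHEMA)
violations = schema_violations(rows, EXPECTED_SCHEMA)
per_engine = row_count_by(rows, "engine_id")
```

Nulls are reported separately from type violations here, so a missing reading and a malformed one can be handled differently downstream.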
Integrations are another strength. Connecting Databricks to S3, SQL Server, and especially Power BI has been seamless. Publishing Delta tables directly to BI models removed the need for brittle extracts and sped up refreshes. Unity Catalog made everything even cleaner with consistent permissions and lineage.
Performance is consistently strong when it matters — heavy joins, window functions, multi‑stage pipelines, or streaming workloads. Serverless compute starts instantly, and workloads scale predictably even under pressure.
Finally, onboarding surprised me. Features like serverless compute, natural‑language queries, AI‑generated code suggestions, and automatic comments make Databricks intuitive even for engineers new to Spark. It feels like the platform actively helps you learn.
In short: Databricks lets me work faster, recover instantly, integrate seamlessly, and scale confidently — all in one place. It’s the rare platform that improves both speed and reliability at the same time.
What do you dislike about the product?
What I dislike most about Databricks is the cost visibility and predictability.
Even as an experienced engineer, it can be difficult to get a straight, real‑time view of what a workflow will cost before running it. Photon vs. standard runtime, autoscaling behaviour, shuffle-heavy operations, DBUs—these can stack up quickly, and cost surprises happen unless you actively monitor and tune everything. A simple pipeline misconfiguration can quietly double your spend.
Another challenge is the rapid pace of new features and changes.
Databricks innovates incredibly fast, which is great, but it also means features may land before documentation, best practices, or governance patterns are fully mature. Sometimes functionality behaves differently across runtimes or cloud providers, and staying on top of everything requires continuous learning and refactoring. This can create team friction and technical debt.

In short: Databricks is exceptional, but the cost model isn’t always transparent, and the rapid feature rollout can introduce operational complexity that teams must actively manage.
What problems is the product solving and how is that benefiting you?
Business: Before adopting Databricks, our aerospace analytics environment, particularly around customer engine health monitoring, suffered from the same challenges many traditional engineering organisations face.
We had multiple disconnected systems handling telemetry ingestion, fault-code processing, fleet analytics, and maintenance prediction. Data from engine sensors (FADEC, vibration, thermals, oil systems) arrived in different formats and needed heavy manual work just to normalise. Pipelines relied on full overwrites because our legacy setup didn’t support updates or late-arriving data, which made processing slow and expensive.
We struggled with slow ingestion of engine telemetry, inconsistent datasets across engineering teams, and long turnaround times for anomaly detection models.

Architecture challenge: Before using Databricks, we were operating in a fragmented data landscape.
We had multiple systems, disconnected storage layers, and a heavy reliance on overwrite‑based ETL jobs because our old data platform couldn’t support updates, late‑arriving data, or ACID guarantees. This meant pipelines were slow, error‑prone, and expensive. Rolling back bad data could take hours, and data inconsistencies across teams were common.
We struggled with siloed systems, slow pipelines, unreliable data, and high operational cost.

We struggled with manual overwrites and inconsistent data — but now we can use Delta Lake with ACID and time‑travel,
which has resulted in:

Instant rollback from data corruption scenarios
Reliable incremental processing instead of full overwrites
Consistent data consumed across engineering, BI, and ML teams

This reduced our telemetry pipeline processing window from hours to under 30 minutes for a fleet‑wide daily batch.

We struggled with multiple tools and duplicated architectures — but now we have one unified Lakehouse,
which has resulted in:

A single platform for ingestion → transformation → ML → BI → governance
Removal of 3–5 legacy tools (ETL schedulers, BI extracts, legacy ML infra)
Lower maintenance and licensing overhead

We struggled with slow development cycles — but now we can leverage Genie for AI‑assisted engineering,
which has resulted in:

70–80% faster creation of PySpark modules
Automatic generation of schema checks, null checks, and DQ logic
More time spent on decisions, less on boilerplate code

For example, a data quality module that used to take 30 minutes now takes 2–3 minutes to scaffold.

We struggled with inconsistent governance — but now Unity Catalog gives us end‑to‑end visibility,
which has resulted in:

Faster onboarding (reduced from days to minutes)
Centralised permissions, lineage, and audit trails
Stronger compliance alignment

We struggled to scale pipelines and ML workloads — but now we use distributed compute + Photon,
which has resulted in:

Large joins and window operations executing up to 10× faster
Stable handling of terabyte‑scale datasets
Predictable performance even under heavy workloads


    Joseph F.

Databricks Notebooks Make Collaboration Seamless Across Python, SQL, and Scala

  • March 26, 2026
  • Review provided by G2

What do you like best about the product?
Databricks collaborative notebooks are really useful and let me work in whatever language I need to meet my requirements effectively. The ability to mix Python, SQL and even Scala within a dashboard makes collaboration and teamwork much smoother. I also appreciate how easily it integrates with other tools and cloud platforms, so it fits into my existing workflows with very little friction.
What do you dislike about the product?
I like their customer support, and the frequent updates are a big reason this has become my favorite for data management. I also appreciate how well it integrates with external tools like Power BI for reporting; it's really good.
What problems is the product solving and how is that benefiting you?
It simplifies cross-team collaboration and helps us work through large datasets without having to worry too much about infrastructure or analytics overhead. Calculations and reporting are fast, which has improved our development cycles and reduced the back-and-forth between the engineering and analytics teams.


    Ankit K.

Fast, Scalable Spark Processing with a Powerful Unified Analytics Workspace

  • March 12, 2026
  • Review provided by G2

What do you like best about the product?
fast distributed processing with Spark, collaborative notebooks for teams, strong integration with cloud data platforms, scalable data pipelines, unified workspace for data engineering and analytics, handles large datasets efficiently
What do you dislike about the product?
cluster startup time can be slow, costs can increase quickly with heavy workloads, UI can feel complex for new users, debugging distributed jobs is not always straightforward, notebook version control can be tricky
What problems is the product solving and how is that benefiting you?
large-scale data processing, building and managing data pipelines, unified environment for engineering and analytics, faster data transformations, improved scalability for big data workloads