
Reviews from AWS customers

10 AWS reviews

External reviews

692 reviews

External reviews are not included in the AWS star rating for the product.


5-star reviews

    Deeraj R.

Databricks’ Unified Platform: Fast SQL, Streamlined Pipelines, and Context-Aware AI

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
The unified platform experience is what keeps me on Databricks. Having notebooks, pipelines, SQL warehouses, ML, and governance all in one place under Unity Catalog means I’m not constantly stitching together five different tools just to get work done.

Lakeflow Pipelines (formerly DLT) makes it straightforward to build medallion-architecture pipelines, and the Photon engine delivers real performance gains on SQL workloads without requiring any code changes. Recent additions like Genie Code and background agents also show they’re serious about agentic AI—it doesn’t feel like a bolt-on copilot, because it can actually understand your data context through Unity Catalog. Serverless compute has been another big quality-of-life improvement, since I no longer have to wait for cluster spin-up when I just want to run quick, ad hoc queries.
What do you dislike about the product?
Cost management can be tricky—DBUs add up quickly if you’re not careful with cluster sizing and warehouse auto-scaling. The pricing model also isn’t always transparent, especially when you’re mixing serverless and classic compute.

Unity Catalog is powerful, but the initial setup and the migration from legacy HMS can be painful, particularly for large orgs with years of existing Hive metastore objects. The documentation is generally good, yet it sometimes lags behind new feature releases. On top of that, the workspace UI can feel sluggish at times, especially when you’re working with a large number of assets.
What problems is the product solving and how is that benefiting you?
Before Databricks, our data stack was fragmented — separate tools for ETL, analytics, ML, and governance. That meant constant context-switching, duplicated data, and governance gaps. Databricks consolidates all of that into one lakehouse platform. Delta Lake gives us reliable ACID transactions on the data lake, Unity Catalog handles lineage and access control across the board, and SQL warehouses let our analysts self-serve without needing a separate data warehouse product. It's cut our pipeline development time significantly and made data governance something we can actually enforce consistently instead of hoping for the best.


    Naveena P.

Databricks Unifies Data Engineering, Science, and Analytics Exceptionally Well

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
The ability to converge data engineering, data science, and analytics on a single platform without compromising on governance, performance, or flexibility is still rare in the industry. Databricks executes this exceptionally well.
What do you dislike about the product?
Reducing the spin-up time of all-purpose clusters and job clusters would help. It would be more useful if they started as quickly as serverless compute does.
What problems is the product solving and how is that benefiting you?
In enterprise banking, where regulatory compliance, data accuracy, and operational resilience are non-negotiable, Databricks is solving some of our most critical challenges. As a Lead Data Engineer managing end-to-end ETL pipelines, dashboard delivery, monitoring alerts, and data governance for a major banking client, the platform has become the backbone of our modern data architecture. Databricks unifies our fragmented data landscape through Delta Lake and Unity Catalog, giving us ACID-compliant transactions for reliable ETL, automated lineage for audit-ready governance, and fine-grained access controls to protect sensitive PII and financial data—all while enabling seamless schema evolution to handle the constant changes in source systems. This directly translates to faster, more trustworthy reporting: our dashboards in Power BI and Tableau now pull from a single source of truth, eliminating metric disputes between Risk, Finance, and Compliance teams. On the operational side, native alerting integrated with Slack and PagerDuty, combined with Databricks System Tables for observability, lets us proactively catch data quality issues or SLA breaches before they impact business decisions—reducing incident resolution time by over 60%. Ultimately, Databricks isn't just improving our engineering efficiency; it's enabling us to innovate responsibly in a highly regulated environment, delivering trusted insights at scale while keeping auditors confident and stakeholders aligned.


    Syed F.

Unified Data Engineering, Analytics, and ML on a Scalable Databricks Platform

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
What I like most about Databricks is how it brings data engineering, analytics, and machine learning together in one platform. It streamlines the entire data pipeline—from ingestion and transformation through to serving—so I don’t have to rely on multiple separate tools to get end-to-end workflows done.

Its integration with Spark and Delta Lake is another big plus, making it both scalable and dependable when working with large datasets.
What do you dislike about the product?
One challenge with Databricks is cost management and visibility. Since compute is abstracted through clusters and jobs, it can sometimes be difficult to track and optimize costs without additional monitoring or governance in place.
What problems is the product solving and how is that benefiting you?
Solves the problem of fragmented data ecosystems, where data engineering, analytics, and machine learning are handled in separate tools.


    Janakiraman K.

Databricks Brings Spark, Delta, and ML Together with Effortless Auto-Scaling

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
Databricks is hands down my favorite platform for data engineering because it brings everything together in one place: Spark processing, Delta Lake, and ML tools all play nicely together without the usual headaches. The auto-scaling clusters save tons of time on big ETL jobs, like the SAP integrations I've done, letting me focus on logic instead of babysitting resources. Unity Catalog has been a game changer for governance in our lakehouse setups too.
What do you dislike about the product?
Costs can sneak up fast if you're not watching usage closely, especially with premium features on large pipelines. The notebooks are great for prototyping but get messy in production without strict discipline. Setup for advanced stuff like custom Unity Catalog policies sometimes feels overly complex for what it delivers.
What problems is the product solving and how is that benefiting you?
Databricks tackles key data engineering headaches like scaling massive Spark jobs, data quality issues, and siloed teams by providing a unified lakehouse platform with Delta Lake for ACID transactions and reliable pipelines. When I have a large number of files or tables to process, like in supply chain ETL from SAP systems, it shines with optimized Delta processing, serverless compute, and the Photon engine, slashing run times from days to hours while cutting costs through auto-scaling. This benefits me directly by speeding up project delivery, reducing debugging time on failures, and enabling seamless collaboration with analysts on notebooks without tool switches.


    Sabareeswara S.

All-in-One Databricks Platform with Strong Governance, Fast Spark Performance, and Genie

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
The all-in-one platform eliminates tool sprawl. Unity Catalog gives you governance, lineage, and discoverability without bolting on a separate catalog. The notebook UI is clean and makes iterating on PySpark fast. Genie is the standout AI feature: it turns curated tables into natural-language interfaces for business users, and the SDK lets you configure it programmatically so it stays maintainable. DLT handles pipeline orchestration well. Performance on Spark workloads is solid, especially with Photon. Integrations with Airflow, S3, and the broader ecosystem are straightforward. As for ROI, consolidating what used to require multiple tools into one platform pays for itself in reduced complexity.
What do you dislike about the product?
Pricing can be hard to predict. Compute costs scale quickly if you're not careful with cluster sizing and SKU selection, and it's not always obvious which workload tier you actually need until you see the bill. The notebook IDE, while functional, still lags behind a real editor for refactoring, multi-file navigation, and code review workflows.
What problems is the product solving and how is that benefiting you?
Tool consolidation is the biggest one. Before, you'd need separate systems for ingestion, transformation, warehousing, governance, and serving, each with its own learning curve, maintenance overhead, and integration headaches. Databricks collapses that into a single platform. Unity Catalog solves the data governance problem by giving you lineage, access control, and discoverability in one place instead of managing permissions across disconnected systems.


    Jananisree T.

Databricks: A Unified, Scalable Platform for Faster Collaboration and Innovation

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
Databricks stands out because it provides a unified platform that seamlessly combines data engineering, machine learning, and analytics, making collaboration across teams much easier. I especially appreciate how it simplifies working with big data by integrating with popular tools like Apache Spark, offering scalability, and enabling faster experimentation. The collaborative notebooks, strong support for multiple programming languages, and built-in security features make it both powerful and user-friendly. Overall, it helps accelerate innovation by reducing complexity and improving productivity across the entire data lifecycle.
What do you dislike about the product?
One drawback of Databricks is that it can feel overwhelming for new users because of its complexity and steep learning curve. The platform offers a wide range of powerful features, but navigating them effectively often requires significant technical expertise. Additionally, costs can escalate quickly if clusters are not managed carefully, and performance tuning sometimes demands deep knowledge of Spark internals. Integration with certain external tools can also be less seamless compared to other platforms.
What problems is the product solving and how is that benefiting you?
Databricks is solving the challenge of managing and analyzing massive amounts of data by providing a unified platform for data engineering, machine learning, and analytics. It eliminates the need to juggle multiple tools, making workflows more streamlined and collaborative. For me, this means faster access to insights, easier experimentation with models, and reduced complexity in handling big data. The benefit is clear: improved productivity, better collaboration across teams, and quicker decision-making powered by reliable data.


    Hospital & Health Care

Intuitive, Limitless Analytics for End-to-End Data Pipelines

  • March 27, 2026
  • Review provided by G2

What do you like best about the product?
It’s very intuitive, and the breadth of data and analytics you can do with it is limitless.

You can create a medallion architecture, build data pipelines and jobs, set up dashboards, apply data governance, and more.
What do you dislike about the product?
I feel like some of the newer releases' features can be a bit buggy at times, but those things usually get better after a while.
What problems is the product solving and how is that benefiting you?
We have a data and analytics platform and we use Databricks as our key vendor. Our relationship with them has been great and they’ve been super helpful the whole way.


    Amit D.

Databricks: A True Unified Analytics & AI Platform That Boosts Speed and Reliability

  • March 26, 2026
  • Review provided by G2

What do you like best about the product?
What I like best about Databricks is how it finally delivered what every data engineer/data professional has been wishing for — a true unified analytics and AI platform.
I remember working across five different tools just to get a single pipeline from ingestion to reporting. Databricks collapsed all of that into one environment, and that changed everything for me.
Delta Lake was the first breakthrough. When it arrived around 2020, ACID transactions and time‑travel immediately eliminated the operational pain we used to consider “normal.” If a job corrupted a table, I could roll back to a previous version in seconds instead of spending hours restoring backups. That reliability alone saved multiple downstream failures.
Before Delta existed, our pipelines relied heavily on overwrite patterns because there was no reliable way to apply updates or handle late‑arriving data safely. Overwrites were slow, expensive, and risky — especially for large tables. A single failure during overwrite could leave the table in a half-written, inconsistent state. Processing took longer, compute costs shot up, and recovery often meant manually rebuilding partitions from scratch.
The ROI became obvious as soon as we used Databricks end‑to‑end. Because one platform handles ingestion → transformation → ML → BI → governance, we retired entire categories of legacy tools and reduced operational overhead dramatically.
Then Genie arrived — and it genuinely transformed my day‑to‑day work.
I once needed a PySpark module for data quality checks. Genie generated the full logic — null checks, schema validation, aggregations — in seconds. Instead of spending 30 minutes writing boilerplate, I spent 3 minutes refining the logic. It shifted my focus from syntax to decisions.
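The kind of data-quality scaffold described above can be sketched in plain Python. This is a simplified, hypothetical stand-in for what Genie would generate as a PySpark module; the schema and column names are invented for illustration.

```python
# Toy data-quality checker: null counts and schema (type) validation.
# A real Genie-generated module would operate on PySpark DataFrames;
# this plain-Python version only illustrates the shape of the logic.

EXPECTED_SCHEMA = {"engine_id": str, "reading": float}  # illustrative columns

def run_quality_checks(rows, schema=EXPECTED_SCHEMA):
    """Return null counts per column and (row, column) schema violations."""
    null_counts = {col: 0 for col in schema}
    schema_violations = []
    for i, row in enumerate(rows):
        for col, expected_type in schema.items():
            value = row.get(col)
            if value is None:
                null_counts[col] += 1
            elif not isinstance(value, expected_type):
                schema_violations.append((i, col))
    return {"null_counts": null_counts, "schema_violations": schema_violations}

checks = run_quality_checks([
    {"engine_id": "E1", "reading": 420.5},
    {"engine_id": None, "reading": "bad"},   # one null, one type violation
])
print(checks["null_counts"])        # {'engine_id': 1, 'reading': 0}
print(checks["schema_violations"])  # [(1, 'reading')]
```

The point of the review's anecdote holds either way: the boilerplate (loops, counters, type checks) is mechanical, and having it generated leaves only the thresholds and business rules to decide.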
Integrations are another strength. Connecting Databricks to S3, SQL Server, and especially Power BI has been seamless. Publishing Delta tables directly to BI models removed the need for brittle extracts and sped up refreshes. Unity Catalog made everything even cleaner with consistent permissions and lineage.
Performance is consistently strong when it matters — heavy joins, window functions, multi‑stage pipelines, or streaming workloads. Serverless compute starts instantly, and workloads scale predictably even under pressure.
Finally, onboarding surprised me. Features like serverless compute, natural‑language queries, AI‑generated code suggestions, and automatic comments make Databricks intuitive even for engineers new to Spark. It feels like the platform actively helps you learn.
In short: Databricks lets me work faster, recover instantly, integrate seamlessly, and scale confidently — all in one place. It’s the rare platform that improves both speed and reliability at the same time.
What do you dislike about the product?
What I dislike most about Databricks is the cost visibility and predictability.
Even as an experienced engineer, it can be difficult to get a straight, real‑time view of what a workflow will cost before running it. Photon vs. standard runtime, autoscaling behaviour, shuffle-heavy operations, DBUs—these can stack up quickly, and cost surprises happen unless you actively monitor and tune everything. A simple pipeline misconfiguration can quietly double your spend.
Another challenge is the rapid pace of new features and changes.
Databricks innovates incredibly fast, which is great, but it also means features may land before documentation, best practices, or governance patterns are fully mature. Sometimes functionality behaves differently across runtimes or cloud providers, and staying on top of everything requires continuous learning and refactoring. This can create team friction and technical debt.

In short: Databricks is exceptional, but the cost model isn’t always transparent, and the rapid feature rollout can introduce operational complexity that teams must actively manage.
What problems is the product solving and how is that benefiting you?
Business: Before adopting Databricks, our aerospace analytics environment — particularly around customer engine health monitoring — suffered from the same challenges many traditional engineering organisations face.
We had multiple disconnected systems handling telemetry ingestion, fault-code processing, fleet analytics, and maintenance prediction. Data from engine sensors (FADEC, vibration, thermals, oil systems) arrived in different formats and needed heavy manual work just to normalise. Pipelines relied on full overwrites because our legacy setup didn’t support updates or late-arriving data, which made processing slow and expensive.
We struggled with slow ingestion of engine telemetry, inconsistent datasets across engineering teams, and long turnaround times for anomaly detection models.

Architecture challenge: Before using Databricks, we were operating in a fragmented data landscape.
We had multiple systems, disconnected storage layers, and a heavy reliance on overwrite‑based ETL jobs because our old data platform couldn’t support updates, late‑arriving data, or ACID guarantees. This meant pipelines were slow, error‑prone, and expensive. Rolling back bad data could take hours, and data inconsistencies across teams were common.
We struggled with siloed systems, slow pipelines, unreliable data, and high operational cost.

We struggled with manual overwrites and inconsistent data — but now we can use Delta Lake with ACID and time‑travel,
which has resulted in:

Instant rollback from data corruption scenarios
Reliable incremental processing instead of full overwrites
Consistent data consumed across engineering, BI, and ML teams

This reduced our telemetry pipeline processing window from hours to under 30 minutes for a fleet‑wide daily batch.
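The overwrite-versus-incremental distinction the review keeps returning to can be illustrated with a toy versioned table in plain Python. Delta Lake implements this with `MERGE INTO` and table versions (`RESTORE` / time travel); the class and names below are invented purely to show the pattern.

```python
# Toy illustration: incremental keyed merges with versioned snapshots,
# instead of rewriting the whole table on every load. Real Delta Lake
# persists each commit as a table version; this is an in-memory sketch.

class VersionedTable:
    def __init__(self, rows):
        self.versions = [dict(rows)]          # each version: {key: record}

    def merge_upsert(self, updates):
        """Apply keyed updates as a new version; older versions survive."""
        new = dict(self.versions[-1])
        new.update(updates)                   # incremental, not a full rewrite
        self.versions.append(new)

    def snapshot(self, version=-1):
        """Read any historical version ("time travel")."""
        return self.versions[version]

    def restore(self, version):
        """Roll back by committing an old version as the newest one."""
        self.versions.append(dict(self.versions[version]))

t = VersionedTable({"eng-1": 98.6})
t.merge_upsert({"eng-2": 101.2})      # late-arriving data merged in place
t.merge_upsert({"eng-1": -999.0})     # a bad job corrupts a record...
t.restore(1)                          # ...rollback in one step, no re-ingest
print(t.snapshot())                   # {'eng-1': 98.6, 'eng-2': 101.2}
```

A failed overwrite, by contrast, has no prior version to fall back on, which is why the legacy pipelines described above needed manual partition rebuilds after every bad run.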

We struggled with multiple tools and duplicated architectures — but now we have one unified Lakehouse,
which has resulted in:

A single platform for ingestion → transformation → ML → BI → governance
Removal of 3–5 legacy tools (ETL schedulers, BI extracts, legacy ML infra)
Lower maintenance and licensing overhead

We struggled with slow development cycles — but now we can leverage Genie for AI‑assisted engineering,
which has resulted in:

70–80% faster creation of PySpark modules
Automatic generation of schema checks, null checks, and DQ logic
More time spent on decisions, less on boilerplate code

For example, a data quality module that used to take 30 minutes now takes 2–3 minutes to scaffold.

We struggled with inconsistent governance — but now Unity Catalog gives us end‑to‑end visibility,
which has resulted in:

Faster onboarding (reduced from days to minutes)
Centralised permissions, lineage, and audit trails
Stronger compliance alignment

We struggled to scale pipelines and ML workloads — but now we use distributed compute + Photon,
which has resulted in:

Large joins and window operations executing up to 10× faster
Stable handling of terabyte‑scale datasets
Predictable performance even under heavy workloads


    Joseph F.

Databricks Notebooks Make Collaboration Seamless Across Python, SQL, and Scala

  • March 26, 2026
  • Review provided by G2

What do you like best about the product?
Databricks collaborative notebooks are really useful and let me work in whatever language I need to meet my requirements effectively. The ability to mix Python, SQL, and even Scala within a single notebook makes collaboration and teamwork much smoother. I also appreciate how easily it integrates with other tools and cloud platforms, so it fits into my existing workflows with very little friction.
What do you dislike about the product?
I like their customer support, and the frequent updates are a big reason this has become my favorite tool for data management. I also appreciate how well it integrates with external tools like Power BI for reporting.
What problems is the product solving and how is that benefiting you?
It simplifies cross-team collaboration and helps us work through large datasets without having to worry too much about infrastructure or analytics overhead. Calculations and reporting are fast, which has improved our development cycles and reduced the back and forth between the engineering and analytics teams.


    Ankit K.

Fast, Scalable Spark Processing with a Powerful Unified Analytics Workspace

  • March 12, 2026
  • Review provided by G2

What do you like best about the product?
fast distributed processing with Spark, collaborative notebooks for teams, strong integration with cloud data platforms, scalable data pipelines, unified workspace for data engineering and analytics, handles large datasets efficiently
What do you dislike about the product?
cluster startup time can be slow, costs can increase quickly with heavy workloads, UI can feel complex for new users, debugging distributed jobs is not always straightforward, notebook version control can be tricky
What problems is the product solving and how is that benefiting you?
large-scale data processing, building and managing data pipelines, unified environment for engineering and analytics, faster data transformations, improved scalability for big data workloads