Reviews from AWS customer

10 AWS reviews

External reviews

792 reviews

External reviews are not included in the AWS star rating for the product.


    Avinash J.

Efficient Data Scaling with Collaborative Notebooks, but Costly

  • May 14, 2026
  • Review provided by G2

What do you like best about the product?
I use Databricks for ETL workflows and appreciate how it solves the problem of handling massive volumes of data using Apache Spark. Instead of dealing with complex cluster infrastructure manually, Databricks provides a managed environment that helps in scaling. One of the best features is the collaborative notebook environment, which lets teams across functions collaborate effectively. I switched from Snowflake to Databricks mainly because of the massive parallel processing of Spark. Although the initial setup was tough, learning Databricks is easy.
What do you dislike about the product?
The biggest issue is cost management. The initial setup was also tough.
What problems is the product solving and how is that benefiting you?
Databricks handles massive data volumes with Apache Spark without manually managing complex cluster infrastructure, providing a scalable managed environment.


    Retail

Consolidated Our Data Stack with Databricks that Boosted Performance and Productivity

  • May 14, 2026
  • Review provided by G2

What do you like best about the product?
Coming from an Airflow + Snowflake setup, moving to Databricks removed a layer of coordination overhead we had normalized: jobs scheduling jobs, reverse ETL pipelines just to get analytical results back into operational systems, and a separate feature store drifting out of sync with training data. The integrations were a big part of why the transition was smoother than expected: native connectors for cloud storage, Git-based repo sync for version-controlled notebooks, and the Databricks SDK plugging cleanly into our existing CI/CD pipelines meant we weren't rebuilding everything from scratch. Databricks Workflows replaced our Airflow DAGs cleanly, Unity Catalog gave us lineage and access control across our full medallion architecture without a separate tool, and Lakebase let us retire the online feature store entirely since model features now live where the data already is. Performance on large-scale aggregations across our brick-and-mortar store datasets improved noticeably, and the workspace UI makes it easy for the whole team to navigate notebooks, pipelines, and catalog without context-switching. The AI-assisted features in the notebook environment genuinely speed up development. The autocomplete and error suggestions that understand the data context are more useful than they sound day-to-day. Onboarding new engineers was also faster than expected given the depth of the platform, with thorough documentation and a responsive support team during migration. From an ROI standpoint, consolidating tooling meant fewer vendor contracts, less pipeline maintenance, and engineering time redirected toward actual product work.
What do you dislike about the product?
The cost model is the most persistent friction point — compute costs can escalate quickly if cluster lifecycle management isn't tight, and for a team that's still maturing its governance around who spins up what, the billing visibility could be more granular out of the box. The UI, while generally clean, gets harder to navigate at scale; when you have dozens of workflows, notebooks, and catalogs, the workspace organization tools don't quite keep up with the sprawl. On the integrations side, some third-party connectors feel like they were added as an afterthought — the experience isn't always as seamless as the native ones, and occasional version compatibility issues have caused unexpected debugging time. Performance on very large unoptimized queries can still surprise you with cold start latency on serverless compute, which matters when you're iterating quickly during development. The AI assistant features are improving but still inconsistent — context awareness drops off on complex multi-file projects and the suggestions occasionally miss the mark in ways that slow you down rather than help. Support response quality has been good for critical issues, but for nuanced technical questions the first response is sometimes generic, and getting to someone with deep product knowledge takes an extra round of escalation.
What problems is the product solving and how is that benefiting you?
The core problem we were solving was operational sprawl: we had analytical data living in one place and operational data in another, with a fleet of pipelines just to keep them in sync. Working with high-volume brick-and-mortar store data across a medallion architecture, the performance gains on large aggregations alone justified the move; queries that previously required careful warehouse sizing are now handled gracefully on autoscaling compute. Consolidating onto one platform also meant our AI and ML workflows stopped being second-class citizens: feature engineering, model training, and serving now happen in the same environment where the data lives, which removed an entire category of infrastructure we were maintaining. The workspace UI, while not perfect at scale, made it easier to onboard the broader team without everyone needing deep platform expertise to be productive from day one.


    Praveen M.

Databricks Simplifies Big Data Processing and Team Collaboration

  • May 07, 2026
  • Review provided by G2

What do you like best about the product?
What I like best about Databricks is how it simplifies large-scale data processing and collaboration in one platform. The integration with Spark and cloud services makes handling big data much more efficient. I also like the notebook environment, which makes it easier for teams to work together on analytics and machine learning tasks.
What do you dislike about the product?
One thing I dislike about Databricks is that the platform can feel complex for new users, especially when managing clusters and configurations. Pricing can also become expensive with larger workloads if resources are not optimized carefully. While integrations and AI features are powerful, the onboarding process and support documentation could be more beginner-friendly.
What problems is the product solving and how is that benefiting you?
Databricks helps solve the challenge of processing and analyzing large amounts of data efficiently in one platform. It combines data engineering, analytics, and AI workflows, which reduces the need for multiple separate tools. This improves collaboration, speeds up data processing, and helps generate insights much faster.


    Artemij V.

Perfect for Cross-team Collaboration and Intensive Data Applications

  • May 04, 2026
  • Review provided by G2

What do you like best about the product?
The UX is one of the strongest parts. The notebook experience is clean and intuitive, collaboration is straightforward, and moving between exploration, experimentation, and production workflows feels seamless. It has enough flexibility for advanced users while still being approachable enough that onboarding new team members is fast. People can usually become productive quickly without spending weeks learning platform-specific quirks.

The integrations are also excellent. It works smoothly with the broader cloud ecosystem and connects well with data sources, orchestration tools, model serving infrastructure, and external systems. That interoperability makes it much easier to move from prototype to deployed pipeline without constantly rebuilding connectors or managing glue code.

Performance has been consistently strong, especially when working with distributed workloads and large-scale feature engineering. Spark optimization, cluster management, and managed infrastructure significantly reduce operational overhead, which lets me focus more on model development and analysis rather than environment tuning. For iterative experimentation, spin-up times and overall responsiveness are noticeably better than many alternative managed platforms.
What do you dislike about the product?
One area where Databricks could improve is pricing. The platform delivers strong capabilities, but costs can escalate quickly for high-frequency or real-time workloads. For use cases involving continuously running low-latency tick pipelines, streaming market data, or iterative model retraining, the pricing can become fairly steep relative to the infrastructure being consumed. It sometimes feels like there’s a meaningful premium for convenience and managed orchestration, which can make cost optimization a constant consideration.

The AI integration is another area that still feels somewhat uneven. While there’s a clear push toward positioning the platform as an end-to-end AI/ML environment, some of the newer AI-focused features feel more like ecosystem additions than deeply integrated workflow improvements. In practice, there are still cases where custom tooling or external frameworks provide more flexibility and transparency, particularly for specialized model development, experimentation, and real-time inference use cases.

There can also be some complexity around tuning clusters and managing costs efficiently at scale. While the abstractions are helpful, getting the best performance-to-cost ratio sometimes requires deeper platform knowledge than the “fully managed” positioning might imply.

Overall, the platform is very strong technically, but pricing for always-on data-intensive workloads and the maturity of some AI-native capabilities are the two biggest areas where I’d like to see improvement.
What problems is the product solving and how is that benefiting you?
Databricks solves one of the biggest challenges in modern data work: bringing together data access, large-scale processing, and collaborative development in a single environment.

For my work, the biggest benefit is real-time collaboration. It allows multiple people to work against the same datasets, notebooks, and pipelines without the usual friction of fragmented tooling or environment inconsistencies. That significantly speeds up experimentation, iteration, and knowledge sharing across projects, especially when moving quickly on model development or analyzing fast-changing data.

It also solves the challenge of scalable data access and processing. Working with high-volume time-series and transactional datasets requires infrastructure that can process large amounts of data efficiently without constant operational overhead. Databricks abstracts much of that complexity, making it possible to focus on analysis, feature engineering, and model development rather than spending time managing infrastructure.

The practical benefit is faster iteration cycles. I can move from raw data exploration to model experimentation and deployment much more quickly, which is especially valuable when working on real-time analytics, forecasting pipelines, and production-facing ML systems where speed of iteration directly impacts outcomes.

Overall, it reduces engineering friction and makes large-scale collaborative data work significantly more efficient, which translates into faster development, better experimentation, and more reliable deployment of data products.


    Computer Software

Straightforward SQL, Smooth Workflow Scheduling, and a Handy Notebook Feature

  • May 02, 2026
  • Review provided by G2

What do you like best about the product?
It’s straightforward to write and run SQL and schedule workflows, and I especially like the notebook feature. Genie AI is helpful for diagnosing bugs, and it can also answer ad hoc questions whenever I need it.
What do you dislike about the product?
Genie’s AI feature could still use some improvement. It sometimes takes a long time to respond, and with more complex problems it doesn’t always handle them well.
What problems is the product solving and how is that benefiting you?
The workflow is very easy to schedule. It’s also simple to set up alerts, and the visualization makes it easy for me to modify and debug.


    Shreeram P.

Solves Developers’ Problems with Genie, Lakeflow Connect, and DLT

  • April 30, 2026
  • Review provided by G2

What do you like best about the product?
This platform solves developers’ problems by offering features like Genie, Lakeflow Connect, and DLT.
What do you dislike about the product?
Before using it, I want to understand the compute and charges, and how to use it properly. Basically, I need to learn a lot first.
What problems is the product solving and how is that benefiting you?
It solved our data pipeline and dashboard creation challenges. With SDP and AI/BI Genie, we moved from manually managing the data pipeline to simply declaring it in SQL and having everything handled for us. Instead of spending so much time building dashboards, we can now just ask questions in natural language and get the answers we need without wasting a lot of time.


    Arif V.

Intuitive UI and AI-Powered Experience That Keeps Getting Better

  • April 29, 2026
  • Review provided by G2

What do you like best about the product?
The UI is pretty intuitive, and they are using AI to make the experience even better.
What do you dislike about the product?
For the most part, it’s a great platform, but some of the debugging options could be improved.
What problems is the product solving and how is that benefiting you?
I use it to write queries for extracting data and running experiments, mostly with SQL and Python.


    Antonio V.

Scalable, All-in-One Environment with Some Learning Curve

  • April 28, 2026
  • Review provided by G2

What do you like best about the product?
I like Databricks for its scalability and all-in-one environment for data engineering, analytics, and machine learning. It allows me to process large datasets efficiently while keeping workflows organized in one platform. The scalability is very valuable because it lets me handle growing data volumes and complex workloads without performance issues. As projects expand, the platform can scale resources efficiently.
What do you dislike about the product?
Some features can have a learning curve, especially for new users working with advanced configurations or cluster management. The interface could also be more intuitive in certain areas. The setup was relatively smooth for core features, but some advanced settings like cluster optimization, permissions, and integrations required more time and technical knowledge.
What problems is the product solving and how is that benefiting you?
Databricks solves major data management and analytics challenges by efficiently handling large datasets, simplifying ETL processes, and centralizing workflows. Its scalability allows me to manage growing data volumes without performance issues, ensuring resources scale efficiently as projects expand.


    Simran S.

Unifies Data Engineering, ML & Analytics with Ease

  • April 27, 2026
  • Review provided by G2

What do you like best about the product?
I like how Databricks brings data engineering, analytics, and machine learning into a simple unified platform. I appreciate the faster end-to-end data flow, the single source of truth it provides, and the better collaboration it enables. I also found the initial setup to be quite simple.
What do you dislike about the product?
Cost management is a concern for me. Being a scalable and compute-heavy platform, costs can increase quickly.
What problems is the product solving and how is that benefiting you?
I use Databricks for data processing and engineering, handling large data volumes and eliminating data silos. It unifies data engineering, analytics, and machine learning for faster data flow, providing a single source of truth and improving collaboration.


    Dhiraj B.

Fast, Reliable Lakehouse That Unifies Data Processing and SQL

  • April 25, 2026
  • Review provided by G2

What do you like best about the product?
Data processing and SQL queries are fast and reliable, and the lakehouse platform really helps unify our work.
What do you dislike about the product?
There’s a steep learning curve that demands advanced coding skills and solid Spark expertise, which can make it feel like overkill for teams that only need straightforward SQL reporting.
What problems is the product solving and how is that benefiting you?
We’ve improved our data team’s productivity by removing the need for manual infrastructure management.