Reviews from AWS customer

6 AWS reviews

External reviews

636 reviews
from G2 and PeerSpot

External reviews are not included in the AWS star rating for the product.


3-star reviews

    Parag Bhosale

Integrating engineering and learning, but cost challenges arise with cluster management

  • January 08, 2025
  • Review provided by PeerSpot

What is our primary use case?

I usually handle data ingestion and create warehouses. I also assist other teams, such as analytics, to create reports or perform other tasks.

What is most valuable?

Having one solution for everything, from data engineering to machine learning, is beneficial since everything comes under one hood.

What needs improvement?

We often use a single cluster for ingestion in Databricks, which Databricks doesn't recommend. They suggest using a no-cluster solution like job clusters. This can be overwhelming for us because we started smaller.

We prefer using a small to mid-sized cluster for many jobs to keep costs low, but this sometimes doesn't support our operations properly. We need to stay in sync with the DBR versions, and migrations can pose challenges. For example, issues arose when we moved a cluster from a previous version to the latest one. We could use their job clusters; however, that increases costs, which is challenging for us as a startup. Maintaining this infrastructure can be a headache.
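
The trade-off described above, one shared cluster versus a per-run job cluster, can be sketched as the two shapes a task definition takes in a Databricks Jobs API payload. This is a minimal illustration and not the reviewer's configuration: the cluster ID, notebook path, runtime version, node type, and worker count are all placeholder assumptions, and the code only builds the payload without calling any service.

```python
import json

# Hypothetical identifiers for illustration only; none come from the review.
SHARED_CLUSTER_ID = "0000-000000-example"  # placeholder all-purpose cluster ID
RUNTIME_VERSION = "<dbr-version>"          # pin the runtime you have tested against
NODE_TYPE = "<node-type-id>"               # cloud-specific instance type


def ingestion_task(use_job_cluster: bool) -> dict:
    """Build one task entry for a Jobs API 'create job' payload."""
    task = {
        "task_key": "ingest",
        # Hypothetical notebook path.
        "notebook_task": {"notebook_path": "/Workspace/ingest"},
    }
    if use_job_cluster:
        # A job cluster is created for the run and terminated afterwards.
        task["new_cluster"] = {
            "spark_version": RUNTIME_VERSION,
            "node_type_id": NODE_TYPE,
            "num_workers": 2,  # assumed size
        }
    else:
        # Reuse one long-lived cluster shared by many jobs.
        task["existing_cluster_id"] = SHARED_CLUSTER_ID
    return task


if __name__ == "__main__":
    print(json.dumps(ingestion_task(use_job_cluster=True), indent=2))
```

Pinning the runtime version in the job-cluster form is one way to control when a version migration happens, at the price of starting a fresh cluster for each run.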

For how long have I used the solution?

I have worked at a couple of companies, not just the current one, and I have about 20 to 25 months of experience with Databricks.

What do I think about the stability of the solution?

They release patches that sometimes break our code. These patches are supposed to fix issues, but sometimes they cause disruptions.

What do I think about the scalability of the solution?

The patches have sometimes caused issues that led to our jobs being paused for about six hours. Fortunately, nothing important is currently running on Databricks; however, if there were, it would be a significant issue.

How are customer service and support?

They are good. My company has a contract with them that includes good support. Whenever we reach out, they respond promptly.

How would you rate customer service and support?

Neutral

What was our ROI?

With the benefits we receive, the price is reasonable. However, it's important to have good use cases. If it's just for data ingestion, it might not be the best solution price-wise. For a lot of different tasks, including machine learning, it is a nice solution.

What other advice do I have?

I would rate the solution seven out of ten. That rating also depends on the contract we have with Databricks.

It's still a solid and good rating. I work as a data engineer and Databricks engineer. 

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?


    Siddhesh S.

Powerful and Intuitive

  • September 12, 2024
  • Review provided by G2

What do you like best about the product?
The notebook UI is easy to use and debug, and it supports single-line code runs. In my experience it works well with API sources as well as on-premises SSMS sources. Multi-source integration is provided, and the code is easy to read and write. The introduction of foreign catalogs has also made it easier to integrate different sources in the cloud.
What do you dislike about the product?
Concurrent updates don't work, which makes it a pain to update a single table from multiple threads.
What problems is the product solving and how is that benefiting you?
It optimizes API calls, file retrievals, data reads, and storage of large tables from existing on-premises databases.
It reduces the job time needed to perform ETL on the data tables.
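
One generic workaround for the concurrent-update complaint in this review is to funnel every change through a single writer, so threads never write to the table at the same time. The sketch below shows that pattern in plain Python with an in-memory dictionary standing in for the table. It is an assumption-laden illustration, not a Databricks or Delta Lake API, and the function and key names are hypothetical.

```python
import queue
import threading


def run_single_writer(updates_per_worker: int, workers: int) -> dict:
    """Apply updates from many producer threads through one writer thread."""
    table = {}               # stand-in for the shared table: key -> value
    pending = queue.Queue()  # thread-safe hand-off between producers and writer
    stop = object()          # sentinel that tells the writer to finish

    def writer() -> None:
        # Only this thread ever mutates the table.
        while True:
            item = pending.get()
            if item is stop:
                break
            key, value = item
            table[key] = value

    def producer(worker_id: int) -> None:
        # Producers only enqueue changes; they never touch the table.
        for i in range(updates_per_worker):
            pending.put((f"w{worker_id}-row{i}", i))

    writer_thread = threading.Thread(target=writer)
    writer_thread.start()
    producers = [
        threading.Thread(target=producer, args=(w,)) for w in range(workers)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    pending.put(stop)  # all updates are queued; let the writer drain and exit
    writer_thread.join()
    return table
```

The cost of this design is that writes are serialized, so it trades write throughput for the absence of conflicts.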


    SlawomirZablocki

Provides seamless integration capabilities, but the cluster management features need improvement

  • July 12, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use the product as a data science platform that enables me to handle and analyze large datasets efficiently.

What is most valuable?

Databricks can switch easily between cloud providers, such as Azure and GCP. It allows seamless integration with various data platforms and cloud providers, facilitating better data handling and analysis.

What needs improvement?

The product could be improved regarding the delay when switching to higher-performing virtual machines compared to other platforms like Snowflake. The ease and speed of managing clusters can also be enhanced, especially when scaling up resources. They could add more advanced data storage solutions like Iceberg and Delta files.

For how long have I used the solution?

I have been using Databricks for approximately two years.

What do I think about the stability of the solution?

I rate the product stability a seven out of ten. 

What do I think about the scalability of the solution?

I rate the product scalability an eight. 

How are customer service and support?

The technical support services are good. 

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup was straightforward. However, configuring policies could have been simpler.

What's my experience with pricing, setup cost, and licensing?

The product pricing is moderate. 

Which other solutions did I evaluate?

I evaluated other options, including Snowflake, before choosing Databricks.

What other advice do I have?

Databricks is a robust solution for big data processing, offering flexibility and powerful features. While there are areas for improvement, especially in performance and cluster management, it remains a highly valuable tool in my data science toolkit.

 I rate it a seven. 


    Manufacturing

Manager Data Science

  • June 10, 2024
  • Review provided by G2

What do you like best about the product?
Unified platform with lots of capabilities, based on open source. No vendor lock-in.
What do you dislike about the product?
Appears to have a learning curve. Not a low-code/no-code environment on par with Power BI.
What problems is the product solving and how is that benefiting you?
Able to manage large-scale data and easily maintain the pipelines for it.


    Good but can be better

Onboarding can be smoother

  • November 30, 2023
  • Review from a verified AWS customer

The onboarding process is not smooth. Once account setup begins, there is no way to move to a new email address if the previous one has not yet been activated. There is also no way to know which email address was used to set up the subscription sign-up.


    Information Technology and Services

Had a great impressive experience

  • November 09, 2023
  • Review provided by G2

What do you like best about the product?
Totally impressed with Delta Live Tables. I created a POC with it, and it is really impressive: easy to use and highly efficient for large datasets.
What do you dislike about the product?
Azure Data Factory integration is not available to trigger Delta Live Tables.
What problems is the product solving and how is that benefiting you?
Effective in handling incremental loads.


    Health, Wellness and Fitness

Best product for both data lake and data warehouse; reduces cost and delivers data faster

  • October 25, 2023
  • Review provided by G2

What do you like best about the product?
Best product for both data lake and data warehouse.
Cost reduction.
What do you dislike about the product?
Logging is not good.
Integration with visualization tools is a bit complex.
What problems is the product solving and how is that benefiting you?
Data distribution on big data.


    Abhilash E.

Unified analytics platform

  • October 17, 2023
  • Review provided by G2

What do you like best about the product?
ACID transaction support in Delta Lake.
It is a platform for both data engineering and data science.
Flexibility with different types of data.
Scalability and performance.
Integration with cloud services.
Collaboration features.
Warehousing for real-time or near-real-time data.
What do you dislike about the product?
Cost is the primary concern, which can lead several firms to choose other options.
Maintenance of complex deployments.
What problems is the product solving and how is that benefiting you?
Improved performance and scalability in processing the data.
Easier adoption of advanced analytics tools.
Helps in Real Time Data processing.
Adopting ACID properties into lakehouse helps in data quality and readability.


    Amulya S.

Databricks: a perfect data platform for python users

  • October 05, 2023
  • Review provided by G2

What do you like best about the product?
The UI is built with ML and Python users in mind; it is very intuitive to use.
What do you dislike about the product?
The speed of processing is slow and could be improved
What problems is the product solving and how is that benefiting you?
Easy integration with Python notebooks and big data. The pipeline became much more efficient.


    Avadhut Sawant

Ahead of the competition in building data ecosystems, but needs to improve ease-of-use

  • August 16, 2023
  • Review from a verified AWS customer

What is our primary use case?

I worked with Databricks pretty recently. The particular design processes involved in Databricks were also a part of that specific design/architectural process.

We have used the solution for the overall data foundation ecosystem for processing and storage on a Delta format. We have also seen use cases where we were trying to establish advanced analytics models and data sharing where we leverage the Delta Sharing capabilities from Databricks.

What is most valuable?

A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem.

What needs improvement?

There are some aspects of Databricks, like generative AI, where they are positioning things like DALL-E. They're a little bit late to the game, but I think there are some things that they are working on. Generative AI is catching up in areas like data governance and enterprise flavor. Hence, these are places where Databricks has to be faster, and even though they are fast, I'm not sure how they'll catch up and get adopted because there are strong players in the market.

Databricks is coming up with a few good things in terms of integration. But I have to put one point forward that covers multiple aspects, which is the ease of use for the end user while operating this particular tool. For example, a tool like ADS gives you GUI-based development, which is good for the end user who does development or maintenance. Looking at the complexities of data integration, a GUI might not be easy, but Databricks should embrace something on the graphical development front because it is currently notebook-driven.

Also, in terms of accessing the data for the end user, Databricks has an SQL interface, similar to earlier tools like SQL Server Management Studio. Since many people are already comfortable with SSMS, Databricks could build integrations with known tools for data access, and that would also help, apart from what they're doing. I would like to see improvements with respect to user enablement, which is a good part of enterprise strategy.

I would like to see their integration with a broader ecosystem of products. If you have to do data governance in tools like Microsoft Purview, it's manual and difficult. I'm unsure whether that momentum must come from Databricks or Microsoft. But it would be good if Databricks had some open interfaces to share metadata, which could be viewed in tools enabling data governance like Collibra, Purview, or Informatica. The improvement has to do with user and metadata integration for tools.

For how long have I used the solution?

I've worked with Databricks for over five or six years, but it's been on and off.

What do I think about the scalability of the solution?

The solution is scalable. In this particular ecosystem, there is no one else who can catch up with Databricks for now.

How are customer service and support?

Databricks' customer support is very good. They have a lot of ways in which they interact with vendors and service partners across the globe. They have periodic touch-up sessions with vendors, where their engineers answer your questions.

How was the initial setup?

The implementation is not challenging because the solution integrates well with the platforms on which they are established, whether it's Azure, AWS, or GCP. The solution is not difficult to set up, but you'd probably need a technical user to operate it.

It's the same story with maintenance, where you'd need a technically proficient person with programming knowledge to maintain it.

What other advice do I have?

Databricks integrates many enterprise processes because data processing and AI/ML are a small part of a larger ecosystem. Databricks has been a part of other platforms, and they are trying to establish their own platform, which is a good direction.

Most of the capabilities of the underlying platform can be leveraged there. But the setup isn't difficult if the database lacks some capability, you can't find it in the database, or you're not comfortable with a certain feature in the database. It integrates well with the underlying platform. For example, with scheduling, let's say you are uncomfortable with workflow management. You can utilize integrations with EDA for any other tool and probably perform scheduling. Even if what you're trying to do is not easy, it is enabled with integration. Either they build a required feature in their tool later on, like a GUI, or you perform integrations to make the features possible.

We did evaluate licensing costs, but it had more to do with the Azure ecosystem pricing since whatever we are doing has more to do with Azure Databricks. Many optimizations are recommended, but we haven't exercised those for now. But considering that the processing is a bit more efficient, the overall price won't be much different from what it could be for any other similar component or technology. We haven't had specific discussions with Databricks' folks on pricing.

My advice to users who would like to start working with Databricks is that it is a good solution to work with for data integration and machine learning. Databricks is maturing for other use cases, so there are two points to be considered. One is that you need to evaluate how they will mature, which will be on a case-to-case basis. Second, how will it align with the overall platform story? There will be many overlapping aspects over there as Databricks expands its capabilities. In that case, it must be considered that if those capabilities overlap, how will the underlying platform vendors handle it? How would that interplay happen if many of Databricks' new capabilities align with Microsoft Fabric? That has to be very carefully considered. Otherwise, if you utilize those new capabilities, there might be a discontinuity where you cannot use Databricks because the platform does not support that.

If I specifically talk about Spark-based processing transformations, the data integration story, and advanced stability, I would rate Databricks around eight out of ten. However, with respect to new capabilities like cataloging, data governance, and security integration, I rate Databricks around five because it has to establish these features. And since Databricks integrates with platforms, we must see the interplay with the platforms' capabilities.

I overall rate Databricks a seven out of ten.