
Reviews from AWS customer

3 AWS reviews

External reviews

19 reviews

External reviews are not included in the AWS star rating for the product.


    SlawomirZablocki

Provides seamless integration capabilities, but the cluster management features need improvement

  • July 12, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use the product as a data science platform that enables me to handle and analyze large datasets efficiently.

What is most valuable?

Databricks can switch easily between cloud providers, such as Azure and GCP. It allows seamless integration with various data platforms and cloud providers, facilitating better data handling and analysis.

What needs improvement?

The product could be improved regarding the delay when switching to higher-performing virtual machines compared to other platforms like Snowflake. The ease and speed of managing clusters can also be enhanced, especially when scaling up resources. They could add more advanced data storage solutions like Iceberg and Delta files.

For how long have I used the solution?

I have been using Databricks for approximately two years.

What do I think about the stability of the solution?

I rate the product stability a seven out of ten. 

What do I think about the scalability of the solution?

I rate the product scalability an eight. 

How are customer service and support?

The technical support services are good. 

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup was straightforward. However, configuring policies could have been simpler.

What's my experience with pricing, setup cost, and licensing?

The product pricing is moderate. 

Which other solutions did I evaluate?

I evaluated other options, including Snowflake, before choosing Databricks.

What other advice do I have?

Databricks is a robust solution for big data processing, offering flexibility and powerful features. While there are areas for improvement, especially in performance and cluster management, it remains a highly valuable tool in my data science toolkit.

I rate it a seven.


    Dunstan Matekenya

Processes large-scale data sets and integrates with Apache Spark in a notebook environment

  • July 10, 2024
  • Review provided by PeerSpot

What is our primary use case?

I primarily use Databricks to process large-scale data sets with Apache Spark. My main use case is processing large data sets, such as 600 GB or 800 GB.

What is most valuable?

Databricks integrates natively with Apache Spark, which I use as a processing engine for large-scale datasets. This native integration is one of its strengths. Another strength is that the platform makes it very easy to manage resources. For example, setting up a cluster of five or fifteen nodes is straightforward with Databricks. The notebook environment is also excellent, making it easy to perform various tasks.

What needs improvement?

While Databricks allows you to upload your packages, we encountered some limitations with its capabilities, particularly with Apache Spark, which also affected Databricks. We had issues working with spatial data. You had to go through many steps to find libraries that could process spatial data in a distributed fashion.

For how long have I used the solution?

I have been using Databricks since 2018.

What do I think about the scalability of the solution?

I might have a project that runs for one or two months, and perhaps I won't use it for six months. Self-service is one of its strengths. I can shut down everything and easily spin up resources when I need to use them again.  We have a dedicated group of fifty people who consistently use Databricks for analytics.

How was the initial setup?

The initial setup was very easy and involved around 10-15 people. We have a data science infrastructure team helping with this.

What was our ROI?

Databricks stands out among most data platforms mainly because of its ease of use. The learning curve is not as steep, making it accessible for anyone to handle large-scale data processing on Databricks. This ease of use contributes positively to our return on investment. However, in our line of work, converting this efficiency into direct monetary gains can be challenging, given our nonprofit nature. 

What's my experience with pricing, setup cost, and licensing?

We purchased high-performance laptops to reduce our reliance on the cloud. The main issue was the cost. Internally, if I used Databricks, that cost would return to my team. There was a time when my monthly cost was around ten thousand dollars, which was quite high. Due to these costs, several teams, including ours, moved away from using Databricks and other cloud providers. It became prohibitive, so we invested in our own high-performance computers internally instead.

What other advice do I have?

Databricks provides ease of use for me, particularly due to its seamless integration with Apache Spark. This integration simplifies the process of conducting machine learning on large-scale datasets.

I recommend this solution 100%. Overall, I rate the solution an eight out of ten.


    Dung_Le

Helps users with data processing and analytics

  • July 02, 2024
  • Review provided by PeerSpot

What is our primary use case?

I use Databricks to manage the setting up of data lakes for SaaS.

What needs improvement?

The biggest problem associated with the product is that it is quite pricey. We cannot find a better solution than Databricks in the market currently.

For how long have I used the solution?

I have been using Databricks for a year.

What's my experience with pricing, setup cost, and licensing?

It is an expensive tool. The licensing model is a pay-as-you-go one.

What other advice do I have?

The tool helps with data processing and analytics with large-scale data or big data since it is associated with managing data at a large scale.

For my general use cases, I would say that I am not a technical person, so I cannot explain to you how the tool helps with the area of data engineering tasks.

There is another team in my company that is involved in the use of machine learning and AI features in Databricks. My team is mostly into operations. The tool is used in a multi-country project.

For example, my company makes purchasing decisions about solutions based on which product the whole company has chosen.

I rate the tool an eight out of ten.


    Jithin James

Easy to collaborate with other team members who are working on it

  • March 28, 2024
  • Review provided by PeerSpot

What is our primary use case?

We use the solution for reliability engineering, where we apply ML and deep learning models to identify failure patterns across different geographies and products.

What is most valuable?

Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy.

What needs improvement?

Databricks should have more collaborative features than it has. It should also offer more customization for jobs. Its dashboarding tool is average. They could add advanced features so that we don't depend on other BI tools to build a dashboard. We are currently using Tableau to create dashboards. If Databricks had more advanced features, we could use Databricks for everything.

For how long have I used the solution?

I have been using Databricks for one year.

What do I think about the stability of the solution?

The product is stable. It has been giving consistent outputs without any major issues.

What do I think about the scalability of the solution?

The solution is hosted on the cloud. It supports high scalability features.

10-20 users are using this solution.

How are customer service and support?

There was a training session from Databricks where they explained how to use it. We never had to contact them because they had already given us proper training on the platform.

Which solution did I use previously and why did I switch?

I have used Alteryx before. We switched to Databricks because it can compute and turn your code into production-ready code in very few seconds. Also, the stability is relatively high.

How was the initial setup?

The initial setup is easy.

What about the implementation team?

We have a dedicated team for the deployment.

What other advice do I have?

Delta Lake is a free system. We mostly work on data that we get from Snowflake. The model outputs from Databricks are written back to Delta Lake. It is easy for us to collaborate using Delta Lake, and the computation speed is also quite high for Delta Lake.

The learning curve for Databricks is not very steep. It's pretty easy, and you will find a lot of materials online. So, if you are comfortable coding in Python, it's very straightforward. There is nothing to worry about when using Databricks.

Overall, I rate the solution a ten out of ten.

Which deployment model are you using for this solution?

Public Cloud


    PraveenS

A scalable and cost-effective solution that has excellent translation features and can be used for data analytics

  • December 13, 2023
  • Review provided by PeerSpot

What is our primary use case?

We use the solution for data analytics of industrial data.

What is most valuable?

We extensively use the product’s notebooks, jobs, and triggers. We can create activities. Wherever translation is required, we use Databricks. The product fulfills our customer requirements. It is a cost-effective solution.

What needs improvement?

The product should provide more advanced features in future releases.

For how long have I used the solution?

I have been using the solution for six months.

What do I think about the stability of the solution?

Our data was not too huge. It worked well. It is easily adaptable.

What do I think about the scalability of the solution?

The tool is scalable. We can make it available for a larger audience.

How was the initial setup?

The initial setup is not that difficult. I rate the ease of setup a seven out of ten. The solution is cloud-based. We use native services like Data Factory for orchestration. Sometimes, the customers require us to use Amazon as the cloud provider instead of Azure.

What's my experience with pricing, setup cost, and licensing?

The pricing is average.

What other advice do I have?

There are many services which are coming up. They are still in the preview stage. Overall, I rate the product an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure


    DevSmita Asthana

Helps to have a good data presence but needs to incorporate learning aspects

  • December 11, 2023
  • Review provided by PeerSpot

What is our primary use case?

The product has helped in data fabrication. 

How has it helped my organization?

Databricks has helped us have a good presence in data. 

What needs improvement?

The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with.

For how long have I used the solution?

I have been using the product for more than six months. 

What do I think about the stability of the solution?

I rate Databricks' stability an eight out of ten.

What do I think about the scalability of the solution?

I rate the tool's scalability an eight out of ten. 

How was the initial setup?

The transition to Databricks was smooth. 

What's my experience with pricing, setup cost, and licensing?

Databricks' price is high. 

What other advice do I have?

I rate the solution a nine out of ten. 


    Nabil Fegaiere1

A powerful solution that is easily integrated into a variety of platforms

  • August 21, 2023
  • Review provided by PeerSpot

What is our primary use case?

I am a Databricks service partner, and my customers use Azure Databricks and Data Factory.

What is most valuable?

It's very simple to use Databricks Apache Spark. It's really good for parallel execution to scale up the workload. In this context, the usage is more about virtual machines.

Using meta-stores like Hive was optional, and the solution is good for data science use cases. With the Authenticator Log, Databricks is good for data transformation and BI usage. We have a platform.

What needs improvement?

I would like more integration with SQL for using data in different workspaces. We use the user interface for some functionalities, while for others, we have to use SQL to create data sets and grant permissions. For example, when creating a cluster, we have to create it with some API or user interface. Creating a cluster with some properties using SQL grants the possibility of using SQL syntax. Integration with SQL will make Databricks easier to use by people who have experience with databases like Lakehouse, and they would be able to use the data lake and BI. More integration will help have one point of view for everyone using SQL syntax.

Integration with Kubernetes could also be good for minimizing the price because you can use Kubernetes instead of virtual machines. But that won't be easy.

For how long have I used the solution?

I have worked with the solution for four or five years, with some experience since 2016.

What do I think about the stability of the solution?

The solution is stable. The only problem with stability would be that people are not using it efficiently.

What do I think about the scalability of the solution?

The solution is good for scalability.

How was the initial setup?

When we have administration experience, the solution is not difficult to deploy. Technically, however, it's difficult because governance is more complex. For example, I have two warehouses on Databricks, which are clusters in this workspace, and we have to switch from workspace to workspace to have all this information. There is a system table that has all this, but I don't know if everyone can use these tables.

What's my experience with pricing, setup cost, and licensing?

Databricks is not costly when compared with other solutions' prices.

Which other solutions did I evaluate?

Databricks's functionalities are as good as solutions like Snowflake, BigQuery, and Redshift.

What other advice do I have?

People sometimes do not use the solution efficiently. They misunderstand databases, the usage of tables, and the performance. Many data engineers are very junior and don't have skills in that. Stability is more a customer problem than a problem with the product itself. One possible problem with the product is that there's no method to pause the usage of something. For example, we have to use the meta server or the data catalog in Synapse. But in Databricks, we have a choice to use a catalog or not, or Hive, which is always integrated, but we have to choose whether to use it or not. Many customers directly use the passes on Databricks, which causes performance and governance problems.

I can offer a lot of advice on Databricks, and one piece is to use metastores like Unity Catalog or Hive Metastore. For upcoming use cases, it's better to use Unity Catalog.

I rate Databricks a nine out of ten.


    Avadhut Sawant

Ahead of the competition in building data ecosystems, but needs to improve ease-of-use

  • August 16, 2023
  • Review from a verified AWS customer

What is our primary use case?

I worked with Databricks pretty recently. The particular design processes involved in Databricks were also a part of that specific design/architectural process.

We have used the solution for the overall data foundation ecosystem for processing and storage on a Delta format. We have also seen use cases where we were trying to establish advanced analytics models and data sharing where we leverage the Delta Sharing capabilities from Databricks.

What is most valuable?

A very valuable feature is the data processing, and the solution is specifically good at using the Spark ecosystem.

What needs improvement?

There are some aspects of Databricks, like generative AI, where they are positioning things like DALL-E. They're a little bit late to the game, but I think there are some things that they are working on. Generative AI is catching up in areas like data governance and enterprise flavor. Hence, these are places where Databricks has to be faster, and even though they are fast, I'm not sure how they'll catch up and get adopted because there are strong players in the market.

Databricks is coming up with a few good things in terms of integration. But I have to put one point forward that covers multiple aspects, which is the ease of use for the end user while operating this particular tool. For example, a tool like ADS gives you a GUI-based development, which is good for the end user who does development or maintenance. Looking at the complexities of data integration, a GUI might not be easy, but Databricks should embrace something on the graphical user development front because it is currently notebook-driven. Also, in terms of accessing the data for the end user, Databricks has an SQL interface, similar to earlier tools like SQL Management Studio. Since people are mostly comfortable with SSMS already or not, Databricks can build integration to known tools for data access, and that also helps, apart from what they're doing. I would like to see improvements with respect to user enablement, which is a good part of enterprise strategy. I would like to see their integration with a broader ecosystem of products. If you have to do data governance in tools like Microsoft Purview, it's manual and difficult. Now, I'm unsure if that momentum must be from Databricks or Microsoft. But it would be good if Databricks had some open interfaces to share metadata, which could be viewed in tools enabling data governance like Collibra, Purview, or Informatica. The improvement has to do with user and metadata integration for tools.

For how long have I used the solution?

I've worked with Databricks for over five or six years, but it's been on and off.

What do I think about the scalability of the solution?

The solution is scalable. In this particular ecosystem, there is no one else who can catch up with Databricks for now.

How are customer service and support?

Databricks' customer support is very good. They have a lot of ways in which they interact with vendors and service partners across the globe. They have periodic touch-up sessions with vendors, where their engineers answer your questions.

How was the initial setup?

The implementation is not challenging because the solution integrates well with the platforms on which they are established, whether it's Azure, AWS, or GCP. The solution is not difficult to set up, but you'd probably need a technical user to operate it.

It's the same story with maintenance, where you'd need a technically proficient person with programming knowledge to maintain it.

What other advice do I have?

Databricks integrates many enterprise processes because data processing and AI/ML are a small part of a larger ecosystem. Databricks has been a part of other platforms, and they are trying to establish their own platform, which is a good direction.

Most of the capabilities of the underlying platform can be leveraged there. But the setup isn't difficult if the database lacks some capability, you can't find it in the database, or you're not comfortable with a certain feature in the database. It integrates well with the underlying platform. For example, with scheduling, let's say you are uncomfortable with workflow management. You can utilize integrations with EDA for any other tool and probably perform scheduling. Even if what you're trying to do is not easy, it is enabled with integration. Either they build a required feature in their tool later on, like a GUI, or you perform integrations to make the features possible.

We did evaluate licensing costs, but it had more to do with the Azure ecosystem pricing since whatever we are doing has more to do with Azure Databricks. Many optimizations are recommended, but we haven't exercised those for now. But considering that the processing is a bit more efficient, the overall price won't be much different from what it could be for any other similar component or technology. We haven't had specific discussions with Databricks' folks on pricing.

My advice to users who would like to start working with Databricks is that it is a good solution to work with for data integration and machine learning. Databricks is maturing for other use cases, so there are two points to be considered. One is that you need to evaluate how they will mature, which will be on a case-to-case basis. Second, how will it align with the overall platform story? There will be many overlapping aspects over there as Databricks expands its capabilities. In that case, it must be considered that if those capabilities overlap, how will the underlying platform vendors handle it? How would that interplay happen if many of Databricks' new capabilities align with Microsoft Fabric? That has to be very carefully considered. Otherwise, if you utilize those new capabilities, there might be a discontinuity where you cannot use Databricks because the platform does not support that.

If I specifically talk about Spark-based processing transformations, the data integration story, and advanced stability, I would rate Databricks around eight out of ten. However, with respect to new capabilities like cataloging, data governance, and security integration, I rate Databricks around five because it has to establish these features. And since Databricks integrates with platforms, we must see the interplay with the platforms' capabilities.

I overall rate Databricks a seven out of ten.


    Rupal Sharma

Processes large data for data science and data analytics purposes

  • August 15, 2023
  • Review provided by PeerSpot

What is our primary use case?

It's mainly used for data science, data analytics, visualization, and industrial analytics.

What is most valuable?

Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours.

So that's why it's quite convenient to use for data science, for training machine learning models. By using more computing power, you can make it even faster.

What needs improvement?

There is room for improvement in visualization.

For how long have I used the solution?

I used it for two years. I worked with the latest update. 

What do I think about the stability of the solution?

I would rate the stability a nine out of ten. I didn't face performance drops.

What do I think about the scalability of the solution?

I would rate the scalability an eight out of ten.

How are customer service and support?

Databricks' support is great. If we need any support, they are very quick with it. And they genuinely want you to use Databricks. So, whatever we ask them, they come up with multiple solutions to problem statements. That's really good.

Overall, the customer service and support are very good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I personally prefer using Databricks. However, we also considered using Snowflake, but the pricing was different. It's priced per query.

So, as per your storage, a data scientist or a data analytics team needs to query again and again, which does not suit a data-heavy organization.

What was our ROI?

It's a good return on investment for Databricks from a delivery perspective. We delivered multiple dashboards. So, it's quite a good return on investment. And being a small organization, everyone can use Databricks, and cost-wise, it's also good for small organizations.

Which other solutions did I evaluate?

If the company is a startup, Databricks might be suitable. If a big company needs a lot of storage, Teradata might be best for them. It depends on the situation.

What other advice do I have?

Overall, I would rate the solution an eight out of ten. I would definitely recommend this solution for small organizations.

Which deployment model are you using for this solution?

Private Cloud



    Karan Sharma

An easy-to-set-up tool that provides its users with an insight into the metadata of the data they process

  • August 11, 2023
  • Review provided by PeerSpot

What is our primary use case?

My company uses Databricks to process real-time and batch data with its streaming analytics part. We use Databricks' Unified Data Analytics Platform, for which we have Azure as a solution to bring the unified architecture on top of that to handle the streaming load for our platform.

What is most valuable?

The most valuable feature of the solution is that it is quite fast, especially in its computation and in the atomicity of reading data. We have a storage account, and we can read the data on the go. We also now have Unity Catalog in Databricks, which is quite good for giving you an insight into the metadata of the data you're going to process. There are a lot of things that are quite nice with Databricks.

What needs improvement?

Scalability is an area with certain shortcomings. The solution's scalability needs improvement.

For how long have I used the solution?

I have been using Databricks for a few years. I use the solution's latest version. Though currently my company is a user of the solution, we are planning to enter into a partnership with Databricks.

What do I think about the stability of the solution?

It is a stable solution. Stability-wise, I rate the solution an eight to nine out of ten.

What do I think about the scalability of the solution?

It is a scalable solution. Scalability-wise, I rate the solution an eight to nine out of ten.

My company has a team of 50 to 60 people who use the solution.

How are customer service and support?

Sometimes, my company does need support from the technical team of Databricks. The technical team of Databricks has been good and helpful. I rate the technical support an eight out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup phase of Databricks was good. You can spin up clusters and integrate those with DevOps as well. Databricks is quite nice owing to its user-friendly UI, DPP, and workspaces.

The solution is deployed on the cloud.

The time taken for the deployment depends on the workload.

What's my experience with pricing, setup cost, and licensing?

I cannot judge whether the product is expensive or cheap since I am unaware of the prices of the competing products. The licensing costs of Databricks depend on how many licenses we need, and depending on that number, Databricks provides a lot of discounts.

What other advice do I have?

It is a state-of-the-art product revolutionizing data analytics and machine learning workspaces. Databricks is a complete solution when it comes to working with data.

I rate the overall product an eight out of ten.