 
Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
640 reviews
External reviews are not included in the AWS star rating for the product.
Databricks - Big Data processing tool
What do you like best about the product?
Very easy to use. No need to install and set up Spark manually.
Provides a notebook environment to write code.
Supports various languages such as Python, Spark SQL, R, and Scala.
Easy to set up and use.
You can choose the cluster according to your needs.
Supports machine learning flows and streaming data.
Automatically suspends the cluster if it is inactive for longer than a given time (cost-cutting).
Auto-scalable clusters.
Optimizes the use of cluster resources.
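As a rough illustration of the auto-suspend and auto-scaling points above, a cluster specification might look like the sketch below. The field names follow the Databricks Clusters API as far as I know it; the helper `build_cluster_spec`, the cluster name, and the placeholder runtime and node values are hypothetical and would need to match your own workspace.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes):
    """Build a cluster spec dict with autoscaling and auto-termination.

    Hypothetical helper for illustration; it only assembles a payload
    and does not call any Databricks service.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": name,
        # Placeholders: replace with values valid for your workspace.
        "spark_version": "<runtime-version>",
        "node_type_id": "<node-type>",
        # The cluster scales between these worker counts based on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # The cluster is terminated after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec("etl-cluster", 2, 8, 30)
print(json.dumps(spec, indent=2))
```

A payload like this would typically be sent to the cluster creation endpoint or entered through the cluster UI; the idle timeout is what delivers the cost saving mentioned above.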
What do you dislike about the product?
No CI/CD features provided by default.
Costly for small enterprises.
Certification cost is high.
What problems is the product solving and how is that benefiting you?
We have to develop pipelines. We get data from different sources such as AWS S3 and Redshift, process that large amount of data on Databricks, and put it back into our data warehouse.
Recommendations to others considering the product:
Splunk is one of the best tools when it comes to big data processing. It is easy to use and set up.
                        
MLflow: One-stop solution for data science model tracking, versioning and deployment
What do you like best about the product?
1) A single format to support all major ML libraries such as scikit-learn, TensorFlow, MXNet, Spark MLlib, PySpark, etc.
2) Capability to deploy on Amazon SageMaker with just one API call
3) Flexibility to log model parameters and metrics such as accuracy and recall, along with hyperparameter tuning support.
4) A good GUI to compare and select the best models.
5) Model registry to track Staging, Production, and Archived models.
6) Python best API
7) REST APIs supported.
8) Available out of the box in Microsoft Azure.
What do you dislike about the product?
1) CI/CD pipelines are not supported in the open-source version.
2) It is a recent framework, so the community is not very large.
3) It depends on many Python libraries, which can be a problem when resolving dependencies in your existing setup.
What problems is the product solving and how is that benefiting you?
I have used it for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
The same can be done in Amazon SageMaker, GCP AI Platform, Microsoft Azure, etc., but those would incur monthly expenses. It can be a good fit for an early-stage startup data science team.
Recommendations to others considering the product:
It can't be a complete solution for the data science/ML engineering flow, but it is an essential part of the pipeline. It can be used with Apache Airflow for an end-to-end MLOps solution. It also works best with Amazon SageMaker and Microsoft Azure; GCP AI Platform support is still in development.
You will also need to take care of the CI/CD pipeline for ML models on your own.
                        
Lightning Speed Analytics
What do you like best about the product?
Databricks is a great analytics tool that provides lightning-speed analytics and has given new abilities to data scientists. Additionally, our ability to do advanced analytics at scale has gone up 100 times.
What do you dislike about the product?
The learning curve is steep and people would need coding knowledge to work with Databricks. It can also be costly at times.
What problems is the product solving and how is that benefiting you?
Problems - Analytics problems
Benefits - Scale and Speed
                        
Great tool for distributed programming
What do you like best about the product?
The different languages used for implementation.
Great user experience.
Easy to understand and use.
Creation of different resources inside it, such as clusters or databases.
Ease of integration with other software such as Azure services.
Great addition to your expertise if you manage to master it completely.
Integration of Spark with different languages (Python, R, Scala).
What do you dislike about the product?
The documentation inside the portal isn't the best; I find better support outside it through search engines.
What problems is the product solving and how is that benefiting you?
Currently data transformation: it provides easy access to databases or blobs, and the ability to use a language such as Python to build the solution you need is great.
Recommendations to others considering the product:
Great development tool when you are looking for fast results, as it uses distributed programming across different clusters.
                        
Databricks review
What do you like best about the product?
1. Good UI
2. Good integrations with other applications/services.
3. Fast and efficient.
4. Updates are good.
What do you dislike about the product?
1. Sometimes it takes a long time to load the Spark notebook.
2. Sometimes there are issues with interpreter settings while running the notebook.
What problems is the product solving and how is that benefiting you?
1. Big data - Analyzing large datasets.
                        
Makes building Spark applications a lot easier
What do you like best about the product?
It's like a Jupyter notebook but a lot more powerful and flexible.  You can easily switch from Python to SQL to Scala from one cell to the next.  With the Spark framework, you can preview your data processing tasks without having to build large intermediate tables.
What do you dislike about the product?
Need better support when it comes to troubleshooting Spark applications.  It shows a lot of information, but gives you little sense of how to apply it.
What problems is the product solving and how is that benefiting you?
We do a lot of large scale data processing applications.  Previously we used databases, but this is more flexible and powerful (and cheap).
Recommendations to others considering the product:
It's great if you already understand Spark.  Otherwise, Spark has quite a learning curve.
                        
It's the Databricks show!
What do you like best about the product?
It has significantly improved its performance with the Databricks Input and Output module. With better support for Spark, it combines well with Microsoft Azure and Amazon AWS. It has faster execution and faster read and write processes in version 5.
What do you dislike about the product?
A few schema-related queries are still on the slower side, considering the huge data clusters and the processing involved for those clusters.
What problems is the product solving and how is that benefiting you?
It runs on clusters of machines managed by Databricks, which gives us the assurance to manage data in a distributed manner. It includes Spark and adds a number of components and updates to perform big data analytics and data processing. Its parallel processing with RDDs is amazing.
                        
An easy way to quickly and efficiently analyze data
What do you like best about the product?
I love how accurate and quick Databricks is. Once I started working with Databricks, I couldn't fathom doing data analysis and comparisons without it.
What do you dislike about the product?
The software is not the cheapest on the market, and that diverts funds that could go elsewhere throughout the hospital. However, Databricks continues to be a great product.
What problems is the product solving and how is that benefiting you?
Multiple forms of data analysis along with demographics and data comparison implementations.
                        
Easy to get started with big data
What do you like best about the product?
Helpful online resources. Easy to get started
What do you dislike about the product?
Not enough documentation. Not enough examples
What problems is the product solving and how is that benefiting you?
Predictive modeling.  Easy to spin up models
Recommendations to others considering the product:
Good idea
                        
Ease of Implementation
What do you like best about the product?
Overall, since we brought in Databricks, our ability to use data science and advanced analytics at scale has gone up 100 times. Our experience has been awesome, and I know we're not even pushing the bounds of what it can do.
What do you dislike about the product?
Overall Databricks has worked well, though it has taken longer than we anticipated to get it up and running.
What problems is the product solving and how is that benefiting you?
Frees up data scientists to do data science instead of fighting with cluster management.
Recommendations to others considering the product:
Provides support and solutions that are not available in the open-source version. Good communication.