
Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
637 reviews
External reviews are not included in the AWS star rating for the product.
Great experience. It helped us create and deploy custom models on SageMaker.
What do you like best about the product?
The features. There are a hell of a lot of features in MLflow.
What do you dislike about the product?
Haven't explored everything. But yeah, maybe better documentation.
What problems is the product solving and how is that benefiting you?
It solves platform issues with AWS SageMaker. We deployed a custom model that was used to power the ranking algorithm on the listing page.
Good for code collaboration
What do you like best about the product?
Multiple people can write code in the same file at the same time. We use Databricks for machine learning.
What do you dislike about the product?
The cluster shuts down after some time, leading to loss of the data held in RAM.
What problems is the product solving and how is that benefiting you?
Code collaboration, Machine Learning
It was fantastic for all the integrations it had. Custom ML transform support could have been better
What do you like best about the product?
The fact that you can store all your models in one place
What do you dislike about the product?
The support for custom transforms isn't there
What problems is the product solving and how is that benefiting you?
We wanted a single place to store all our models with versioning. Many benefits, such as model serving, are offered out of the box through Databricks.
Recommendations to others considering the product:
If you want to store all your ML models in one place, then it's the way to go.
Databricks is the best option for your data workloads and pipelines
What do you like best about the product?
It is a highly adaptable solution for data engineering, data science, and AI
What do you dislike about the product?
I don't like that there is no easier way to import custom code files or libraries from notebooks.
What problems is the product solving and how is that benefiting you?
I've used it to process emergency telephone data and derive insights. The solution performs well.
Senior Cloud Evangelist and Architect
What do you like best about the product?
Spark's distribution of queries, the speed of batch queries, and the overall performance.
What do you dislike about the product?
The interface could be made better and more intuitive.
What problems is the product solving and how is that benefiting you?
Big-batch, bulk parallel programming
Staging data for insights
What do you like best about the product?
It makes the power of Spark accessible and offers innovative solutions like Delta Lake.
What do you dislike about the product?
There are few options that aren't wholly or partially in the cloud.
What problems is the product solving and how is that benefiting you?
We are staging large datasets for reporting and multiple BI solutions.
Best tool for big data
What do you like best about the product?
It is easy to use commands in multiple languages in the same notebook. Direct connection to Redshift.
What do you dislike about the product?
Sometimes it takes a lot of time to load data. It should show better suggestions.
What problems is the product solving and how is that benefiting you?
We are using Databricks to analyse big data and get business insights.
Very powerful yet easy to use distributed computing and data warehousing platform
What do you like best about the product?
Databricks has very powerful distributed computing built in, with easy-to-deploy, optimized clusters for Spark computations. The notebooks with MLflow integration make it easy to use for Analytics and Data Science teams, yet the underlying APIs and CI/CD integrations make it very customizable for Data Engineers who need to create complex automated data pipelines. The ability to store, query, and manipulate massive Spark SQL tables with ACID guarantees in Delta Lake makes big data easily accessible to everyone in the organization.
What do you dislike about the product?
It lacks built-in data backup features and the ability to restrict data access to specific users. So if anyone accidentally deletes data from a Delta table or DBFS, the lost data cannot be retrieved unless we set up our own customized backup solution.
What problems is the product solving and how is that benefiting you?
I have worked with big data with hundreds of millions of rows using Databricks. We do most of the ELT, data cleaning, and prep work on Databricks. The ease and speed of querying big data using Databricks Spark SQL is very useful. It is also very easy to create prototype code that runs on real-sized data using the available Python and R notebooks.
Databricks: Big Data processing tool
What do you like best about the product?
Very easy to use. No need to install and set up Spark manually.
Provides a notebook environment to write code.
Supports various languages such as Python, Spark SQL, R, and Scala.
Easy to set up and use.
You can choose the cluster according to your needs.
Supports machine learning flows and streaming data.
Automatically suspends the cluster if it is inactive for more than a given time (cost-cutting).
Auto-scalable clusters.
Optimizes the use of clusters (resources).
What do you dislike about the product?
No CI/CD features provided by default.
Costly for small enterprises.
Certification cost is high.
What problems is the product solving and how is that benefiting you?
We have to develop pipelines. We get data from different sources such as AWS S3 and Redshift, process that large amount of data on Databricks, and put it back into our data warehouse.
Recommendations to others considering the product:
Splunk is the best tool when it comes to big data processing. It is easy to use and set up.
MLflow: One-stop solution for data science model tracking, versioning, and deployment
What do you like best about the product?
1) A single format to support all major ML libraries such as scikit-learn, TensorFlow, MXNet, Spark MLlib, PySpark, etc.
2) Capability to deploy on Amazon SageMaker with just one API call.
3) Flexibility to log model parameters and metrics such as accuracy and recall, along with hyperparameter tuning support.
4) A good GUI to compare and select the best models.
5) Model registry to track Staging, Production, and Archived models.
6) Strong Python API.
7) REST APIs supported.
8) Available out of the box in Microsoft Azure.
What do you dislike about the product?
1) CI/CD pipelines are not supported in the open-source version.
2) It is a recent framework, so the community is not very large.
3) It depends on many Python libraries, which can be a problem when resolving dependencies in your existing setup.
What problems is the product solving and how is that benefiting you?
I have used it for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
The same thing can be done in Amazon SageMaker, GCP AI Platform, Microsoft Azure, etc., but it would require monthly expenses. It can be a good fit for an early-stage startup data science team.
Recommendations to others considering the product:
It can't be a complete solution for the data science/ML engineering flow, but it is an essential part of the pipeline. It may be used with Apache Airflow to get an end-to-end MLOps solution. Also, it works best with Amazon SageMaker and Microsoft Azure. However, GCP AI Platform support is still in the development phase.
You would also need to take care of CI/CD pipeline for ML models on your own.