
Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
637 reviews
External reviews are not included in the AWS star rating for the product.
BIA
What do you like best about the product?
Databricks is an excellent tool for data processing and analysis. The platform is user-friendly and intuitive, making it easy for team members of all technical skill levels to collaborate and work on data projects. The integration with popular data storage systems and the ability to run both SQL and Python code make it a versatile option for handling a variety of data types and tasks. The platform also offers robust security features and the ability to scale resources as needed. Overall, I highly recommend Databricks for anyone looking for a reliable and efficient data platform.
What do you dislike about the product?
Nothing. I like the UI and the toggle between Python and SQL.
What problems is the product solving and how is that benefiting you?
Visualizations and tables work best for my use case.
Great All-in-One Platform for data handling
What do you like best about the product?
- Repo deployment allows my team to collaboratively develop against Databricks resources while still using their local development toolkit, and quickly deploy to it when they're ready
- Delta live tables are a breeze to set up and get streaming data into the lakehouse
- Language mixing is very nice; most of my data engineering work is SQL focused, however I can leverage Python or Scala for more complex data manipulation, all within the same notebook
What do you dislike about the product?
- Data explorer can be incredibly slow and cumbersome if your datalake is unevenly distributed
- Cold-starting clusters can take a frustratingly long time, at least for the way our clusters are set up (the minimum size for our cluster options is i3.xlarge on AWS)
- While developing in notebooks is nice, the concept of running notebooks in production where anyone can edit them from the UI is concerning; I wish there were more ways to "lock" down production processes
What problems is the product solving and how is that benefiting you?
As a data engineer, Databricks has been huge in speeding up my ETL development time, connecting to external databases, and rapidly creating new data objects in a sustainable way.
Great way to automate
What do you like best about the product?
I have been actively engaged in Databricks training and I find it very relevant to the work our organization does. We usually have large amounts of data we need to process for our power generation and revenue needs, and I find that Databricks can be a one-stop shop for automating and streamlining the process.
What do you dislike about the product?
I believe there could be a steep learning curve for someone who doesn't know how to program or lacks a general understanding of it. The best way to work around this is to follow the training offered on Databricks.
What problems is the product solving and how is that benefiting you?
We need to build processes around our time-series data for generation and flow. This platform allows us to build quick processes and intuitive dashboards, which help with fast data processing and workflow setup.
Journey: Delta Lake to Lakehouse
What do you like best about the product?
Databricks' Lakehouse platform combines the capabilities of a data lake and a data warehouse to provide a unified, easy-to-use platform for big data processing and analytics. The platform automatically handles tasks such as data ingestion, data curation, data lineage, and data governance, making it easy to manage and organize large amounts of data. The platform includes features such as version control, collaboration tools, and access controls, making it easy for teams to work together and ensure compliance with data governance policies.
What do you dislike about the product?
Spinning up a new cluster takes around 10-15 minutes. Moreover, the limited resources and learning materials for new users make getting started challenging. It would be great if Databricks could provide more learning resources.
What problems is the product solving and how is that benefiting you?
The platform allows for seamless integration of data from various sources, including structured, semi-structured, and unstructured data, and provides a unified view of all data stored in the lake. The platform includes features such as version control, collaboration tools, and access controls, making it easy for teams to work together and ensure compliance with data governance policies.
Databricks usage for job creation, cluster management, and managing Spark jobs effectively.
What do you like best about the product?
Easy to schedule and run jobs and integrate with Airflow and Azure storage accounts.
Easy to execute code cell-wise and debug the errors because of its interpreter.
What do you dislike about the product?
It doesn't offer autocomplete suggestions while coding the way other IDEs do.
What problems is the product solving and how is that benefiting you?
We use it for our data engineering projects on large-scale datasets.
Great Collaborative Platform for Data Science Projects
What do you like best about the product?
I have been using the Databricks platform for business research projects and building ML models for almost a year. It has been a great experience to be able to run analysis and model testing for big data projects in a single platform without switching between SQL server and a development environment with Python, R, or Stata. Also, I like the fact that MLflow can track data ingestion for any data shift in real time for model retraining purposes.
What do you dislike about the product?
We have had issues using MLflow and the feature store on Databricks for ML projects, which slows down the development process. I wish there were better documentation on these tools or more diverse examples demonstrating different use cases. Also, the test-train split with MLflow does not support time-interval splits of time-series data for model validation purposes.
What problems is the product solving and how is that benefiting you?
The Databricks lakehouse platform allows the data science team to work better with the development team in a single platform, which helps improve ML project development in the long run.
Good experience so far!
What do you like best about the product?
Great unification of functions & features and data sharing across the organization.
What do you dislike about the product?
There's still a lot to learn and make sure that all the functions I use work well and properly. Nothing bad, just more to find out.
What problems is the product solving and how is that benefiting you?
It's helping me do my job and unifying data sources across all my different work streams.
User friendly and intuitive platform
What do you like best about the product?
As a Cloud Operation Specialist, I deploy the Databricks workspace and set up and manage the clusters. It’s easy to set up and manage the users within the workspace.
UI is very user friendly and intuitive.
What do you dislike about the product?
Error messages could be more detailed and better explained.
What problems is the product solving and how is that benefiting you?
Highly efficient in executing queries and analysing data.
Powerful platform
What do you like best about the product?
The platform is powerful and flexible enough to do almost anything you want to do, like ETL, ML models, data mining, simple ad hoc queries, etc. It is also easy to switch languages between Python, SQL, R, Scala, etc. anytime you want.
What do you dislike about the product?
The search function is not my favorite. I often like to use the browser's search function, but it doesn't work well with scripts in a big cell. Also, the clusters take a while to start.
What problems is the product solving and how is that benefiting you?
It meets all my data mining and data science project needs. Simple and easy to use.
The most flexible and potent data platform available, without a doubt
What do you like best about the product?
The most reliable and user-friendly option for creating ELT pipelines that employ Python, Spark, and SQL is Databricks. Configuring and deploying it doesn't take much labour, and it frees developers from having to worry about setting up the infrastructure.
What do you dislike about the product?
Using the same cluster to perform several streaming tasks.
Because job clusters are configured by default to shut down immediately after a job run succeeds or fails, they cannot be reused in production, even for a retry of the same job. We are checking potential ways around this limitation.
What problems is the product solving and how is that benefiting you?
comprehensive Batch & streaming pipeline
Delta Lake
History and versioning
Delta log transaction with ACID
Validation and quarantine are methods of data curation.
Information Ingestion Using an Autoloader