Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
640 reviews
External reviews are not included in the AWS star rating for the product.
Unified Platform & Collaborative Workspace for Data & AI/ML team
What do you like best about the product?
Databricks Serverless SQL with Photon query acceleration for data analysts & business analysts
In-built Visualization & dashboards, along with Geospatial & Advanced SQL functions
Unified Pipeline for Structured Streaming batch & real-time ingestion
Auto Loader for ingestion of standard file formats, with in-built Schema Evolution
Delta Live Tables for data Engineering Workloads & Pipelines
Databricks Multi-task Orchestration job workflows
Unity Catalog Metastore & its integration with other data catalogs
MLflow for building and tracking ML experiments & Feature Store for centralized feature supply for production/inference models
Time Travel & Z-order Optimization
What do you dislike about the product?
Need to build a more comprehensive orchestration workflow JOBS panel for a diverse set of pattern design workflows
Serverless Cluster for Data Engineering Streaming/Batch pipelines
Integrate most IDE features into the notebook
Clear documentation on Custom Databricks runtime docker image creation will be helpful
Lineage & flow monitoring dashboards could be built automatically for non-DLT jobs as well
DLT implementation could be extended to other warehouses that support the Delta format in the future
What problems is the product solving and how is that benefiting you?
Unified Pipeline for Structured Streaming batch & real-time ingestion
The schema merge feature helps to track the change in Schema
DLT feature helps to build Data Quality Lineage along with automated Pipeline links to the reference LIVE tables
Auto-loader helps to build the common ingestion framework for our enterprise
Solid Data Platform
What do you like best about the product?
Fast iterative abilities and a notebook-based UI. It is also helpful to have multiple contributors on a single notebook at one time. You can see where others are in the notebook, which helps with collaboration.
What do you dislike about the product?
The UI frequently exhibits unintended behavior. I will occasionally have random characters added to random cells in the notebook causing errors. It makes debugging difficult when you made no changes and a working cell is now causing errors.
What problems is the product solving and how is that benefiting you?
We are using Databricks to move large amounts of data. Our team is able to run different ETL pipelines with different schedules in an organized way. We are able to quickly iterate on our notebooks to add new features.
BIA
What do you like best about the product?
Databricks is an excellent tool for data processing and analysis. The platform is user-friendly and intuitive, making it easy for team members of all technical skill levels to collaborate and work on data projects. The integration with popular data storage systems and the ability to run both SQL and Python code make it a versatile option for handling a variety of data types and tasks. The platform also offers robust security features and the ability to scale resources as needed. Overall, I highly recommend Databricks for anyone looking for a reliable and efficient data platform.
What do you dislike about the product?
Nothing. I like the UI and the toggle between Python and SQL.
What problems is the product solving and how is that benefiting you?
Visualizations and tables work best for my use case.
Great All-in-One Platform for data handling
What do you like best about the product?
- Repo deployment allows my team to collaboratively develop against Databricks resources while still using their local development toolkit, and quickly deploy out to it when they're ready
- Delta live tables are a breeze to set up and get streaming data into the lakehouse
- Language mixing is very nice; most of my data engineering work is SQL focused, however I can leverage Python or Scala for more complex data manipulation, all within the same notebook
What do you dislike about the product?
- Data explorer can be incredibly slow and cumbersome if your data lake is unevenly distributed
- Cold starting clusters can take a frustratingly long amount of time, at least for the way our clusters are set up (the minimum size for our cluster options is i3.xlarge on AWS)
- While developing in notebooks is nice, the concept of running notebooks in production where anyone can edit from the UI is concerning; I wish there were more ways to "lock" down production processes
What problems is the product solving and how is that benefiting you?
As a data engineer, Databricks has been huge in speeding up my ETL development time, connecting to external databases, and rapidly creating new data objects in a sustainable way
Great way to automate
What do you like best about the product?
I have been actively engaged in Databricks training and I find it very relevant to the work our organization does. We usually have large amounts of data we need to process for our power generation and revenue needs, and I find that Databricks can be a one-stop shop for our automation and streamlining the process.
What do you dislike about the product?
I believe it could be a steep learning curve for someone who may not know how to program or have a general understanding of it. The best way to work around this is to follow the training offered on Databricks.
What problems is the product solving and how is that benefiting you?
We need to build processes around our time-series data for generation and flow. This platform allows us to build quick processes and intuitive dashboards, which help with quick data processing and workflow setup.
Journey: Delta Lake to Lakehouse
What do you like best about the product?
Databricks' Lakehouse platform combines the capabilities of a data lake and a data warehouse to provide a unified, easy-to-use platform for big data processing and analytics. The platform automatically handles tasks such as data ingestion, data curation, data lineage, and data governance, making it easy to manage and organize large amounts of data. The platform includes features such as version control, collaboration tools, and access controls, making it easy for teams to work together and ensure compliance with data governance policies.
What do you dislike about the product?
Spinning up a new cluster takes around 10-15 minutes. Moreover, the limited resources and learning materials for new users make getting started challenging. It would be great if Databricks could provide more learning resources.
What problems is the product solving and how is that benefiting you?
The platform allows for seamless integration of data from various sources, including structured, semi-structured, and unstructured data, and provides a unified view of all data stored in the lake. The platform includes features such as version control, collaboration tools, and access controls, making it easy for teams to work together and ensure compliance with data governance policies.
Databricks usage for job creation, cluster management, and managing Spark jobs effectively.
What do you like best about the product?
Easy to schedule and run jobs and integrate with Airflow and Azure storage accounts.
Easy to execute code cell-wise and debug the errors because of its interpreter.
What do you dislike about the product?
It doesn't give autocomplete suggestions while coding the way other IDEs do.
What problems is the product solving and how is that benefiting you?
We use it for our data engineering projects with large-scale datasets.
Great Collaborative Platform for Data Science Projects
What do you like best about the product?
I have been using Databricks platform for business research projects and building ML models for almost a year. It has been a great experience to be able to run analysis and model testing for big data projects in a single platform without switching between SQL server and development environment with Python, R, or Stata. Also, I like the fact that MLflow can track data ingestion for any data shift in realtime for model retraining purposes.
What do you dislike about the product?
We have had issues using MLflow and feature store on Databricks for ML projects, which slows down the development process. Wish there was better documentation on these tools or more diverse examples to demonstrate different use cases. Also, the test-train split with MLflow does not support time series time interval test-train split for model validation purposes.
What problems is the product solving and how is that benefiting you?
The Databricks lakehouse platform allows the data science team to work better with the development team on a single platform, which helps improve ML project development in the long run.
Good experience so far!
What do you like best about the product?
Great unification of functions & features and data sharing across the organization.
What do you dislike about the product?
There's still a lot to learn and make sure that all the functions I use work well and properly. Nothing bad, just more to find out.
What problems is the product solving and how is that benefiting you?
It's helping me do my job and unifying data sources across all my different work streams.
User friendly and intuitive platform
What do you like best about the product?
As a Cloud Operation Specialist, I deploy the Databricks workspace and set up and manage the clusters. It’s easy to set up and manage the users within the workspace.
UI is very user friendly and intuitive.
What do you dislike about the product?
Error messages could be more detailed and better explained.
What problems is the product solving and how is that benefiting you?
Highly efficient in executing queries and analysing data.