Databricks Data Intelligence Platform
Databricks, Inc.
External reviews
640 reviews
External reviews are not included in the AWS star rating for the product.
Powerful platform
What do you like best about the product?
The platform is powerful and flexible enough to do almost anything you want, such as ETL, ML models, data mining, and simple ad hoc queries. It is also easy to switch between languages such as Python, SQL, R, and Scala at any time.
What do you dislike about the product?
The search function is not my favorite. I often like to use the browser's search function, but it doesn't work well with scripts in a big cell. Also, the clusters take a while to start.
What problems is the product solving and how is that benefiting you?
It meets all my data mining and data science project needs. Simple and easy to use.
The most flexible and potent data platform available, without a doubt
What do you like best about the product?
The most reliable and user-friendly option for creating ELT pipelines that employ Python, Spark, and SQL is Databricks. Configuring and deploying it doesn't take much labour, and it frees developers from having to worry about setting up the infrastructure.
What do you dislike about the product?
Using the same cluster to perform several streaming tasks.
Since shutdown immediately following the job run/fail is configured by default, job clusters cannot be reused even for the same retry in PRODUCTION. Checking potential ways to raise this limit.
What problems is the product solving and how is that benefiting you?
Comprehensive batch and streaming pipelines
Delta Lake
History and versioning
Delta log transactions with ACID guarantees
Data curation through validation and quarantine
Data ingestion using Auto Loader
Intuitive and Powerful
What do you like best about the product?
As a frequent user of Databricks, I have found that it makes my life much easier by simplifying processes and allowing me to develop proof-of-concept designs rapidly. The orchestration of notebooks via workflows provides excellent visualization and enables me to conduct real-time demos for members on the business side. In addition, the integration with Azure and AWS means that Databricks does not operate in isolation and allows me and other engineering team members to transform large amounts of data that are ingested via our enterprise pipelines.
What do you dislike about the product?
There can sometimes be issues integrating Databricks workflows with open source frameworks, often requiring lots of debugging and trial and error. Additionally, I've been told that the platform can be pretty expensive.
What problems is the product solving and how is that benefiting you?
The Databricks Lakehouse Platform allows me to create and deploy workflows to orchestrate and test proof-of-concept ideas in our organization. This will enable us to validate ideas and develop presentations for the organization's business side.
Excellent solution to unlock data analytics full power
What do you like best about the product?
The infrastructure is pretty straightforward. I started out using the Community edition before switching to the premium version, but if you're a student or working on one-off projects, the Community edition should be more than sufficient.
What do you dislike about the product?
Finding some answers can be challenging at times because there aren't many PySpark users, forums, or resources available.
What problems is the product solving and how is that benefiting you?
People who are not very proficient in coding can nonetheless gain useful insights from the data using notebooks prepared by data scientists. I became accustomed to Databricks and to writing PySpark programs pretty easily. Databricks has a great ability to manipulate data and perform any desired action.
Excellent for all sorts of data analytics
What do you like best about the product?
The versatility and scalability are the best features for us. We currently use SQL, R, Python, Spark, and Scala with Databricks. It's impressive how seamless this experience is for different teams with different use cases and skill sets. The interoperability across these languages and in accessing data is a blessing and enables us to use a vast array of tools to solve problems.
What do you dislike about the product?
More insight into individual job costs would be a helpful feature that is currently lacking. Deploying code is also not as intuitive and the Git integration could be more powerful with an enhanced feature set.
What problems is the product solving and how is that benefiting you?
Easy integration across data sources and using a mixed bag of tools to perform advanced analytics. The mixed bag of tools available also means that we treat this as a one-stop solution and are able to serve analytics outputs to various consumers and stakeholders conveniently.
A lot easier to use than other tools I have used before.
What do you like best about the product?
Python notebooks that abstract away a lot of the complexity e.g. packages and infrastructure.
What do you dislike about the product?
The drop down menu/tool bar in the UI sometimes feels a bit clunky.
What problems is the product solving and how is that benefiting you?
Having multiple data pipelines that are able to be quickly developed and deployed, and then the output data sets made available to clients and internal teams
A great tool that simplifies your infrastructure
What do you like best about the product?
Opportunity to not manage Hadoop clusters.
What do you dislike about the product?
Cluster autoscaling doesn't always work as expected, and I would like to have more control over EC2 instance provisioning (the ability to use multiple instance types in a single job/cluster, affinities, the possibility to define some sort of topology, etc.). The whole experience is notebook focused.
What problems is the product solving and how is that benefiting you?
Enabling data-driven decisions by analyzing huge amounts of data
Easy, fast and Reliable
What do you like best about the product?
Ease of access to multiple data sources, and we can switch the code from Python to SQL, Scala, etc., which is impressive.
What do you dislike about the product?
Not able to create interactive visualizations.
What problems is the product solving and how is that benefiting you?
The major problem it solves is data warehousing, with its multilayer data models.
Excellent experience so far!!!!
What do you like best about the product?
We employ Python, Spark, and SQL to develop ELT pipelines, and Databricks is the most reliable and user-friendly option available. Because the environment is very simple to set up, developers can concentrate on writing code, creating pipelines, and creating models rather than spending time on setup.
What do you dislike about the product?
Knowledge of the cost model and a recommendation engine to reduce burned DBUs are missing. There is an Overwatch notebook that offers general statistics about the environment, but it isn't developed enough, and it also doesn't show you the cost of the infrastructure used in the underlying cloud. The platform as a whole is excellent.
What problems is the product solving and how is that benefiting you?
Two things to mention here.
- It unites all data teams from around the organization on one platform, reducing the need to maintain multiple copies of the same data.
- Because computation and storage are no longer linked, resizing any kind of warehouse environment is no longer necessary to boost compute capacity.
Works well in those grey areas of data management.
What do you like best about the product?
Easy to develop and maintain. Flexibility with transactional integrity.
What do you dislike about the product?
Could be better integrated with DW systems like Snowflake.
What problems is the product solving and how is that benefiting you?
Data summary tables: I work with vast amounts of raw data. Being from the backend team, I do not need to ingest all of the data, only specific parts. Building summary tables by partly processing the data within the lakehouse framework is the best solution I could find.