External reviews
External reviews are not included in the AWS star rating for the product.
Great platform for collaboration and data analysis
What do you like best about the product?
Integration with GitHub repos and CI/CD pipelines. Also, having different ways to collaborate with team members and stakeholders (repos, workspace, Databricks SQL).
What do you dislike about the product?
Depending on cluster settings and number of users running queries at the same time and the number of jobs running at the same time, it can sometimes take time to run queries
What problems is the product solving and how is that benefiting you?
Having a single place to store data and being able to merge it and analyze it together is useful to enable insights for the decision-making roles as well as operations teams
Best Lakehouse Platform for building enterprise data pipelines for business needs
What do you like best about the product?
No 1 - Delta Lakehouse platform supports ACID transactions (Data lake + Datawarehouse)
Easy DLT pipeline with lineage & quality
Unified governance with the unity catalog
Supports schema evolution
Exceptional Auto Loader capability
What do you dislike about the product?
Waiting for serverless data engineering pipelines with no capacity planning outside DLT, with SLA-based scaling (I know it's on the roadmap; I am waiting).
Would like more features in the GCP + Databricks integration, on par with AWS and Azure. (Some capabilities, like credential passthrough, are missing on GCP.)
What problems is the product solving and how is that benefiting you?
Data lake + data warehousing (unified lakehouse platform)
Delta lake capabilities
Schema evolution
Data quarantine & Data Quality
Data Integration & Transformations
Recommendations to others considering the product:
Go for this if you are building a cloud-native lakehouse platform for big data batch/streaming ingestion, quality, and transformations, and for building a medallion lakehouse architecture (unified data lake + data warehouse) with a data mesh experience for end consumers. Best in the market, and it supports AWS, Azure, and GCP.
Partner Connect, advanced analytics/MLOps/data science, and AutoML also look good, with steadily improving features. Go for this product, which combines everything in one suite.
Data Sharing (Delta Sharing) is quite useful for security/compliance
Easy-to-use platform for large-scale data ETL and analytics
What do you like best about the product?
Has tools like AutoML, which reduce human effort, improve predictions, and give a deeper understanding of the data
What do you dislike about the product?
The platform can be slow sometimes. Other than that, no major issues worth mentioning.
What problems is the product solving and how is that benefiting you?
Analysis and Analytics. Use case - Labour market research
The best platform for building the future
What do you like best about the product?
1. The core storage technology is Open Source (Delta Lake)
2. Multiple data formats fully accessible via Spark/Python or SQL
3. Ability to manage code via our own GitHub repositories
What do you dislike about the product?
1. Not always obvious which pieces are (or will be) open source vs proprietary
2. GitHub integration doesn't support multiple branches, making it difficult to develop alongside production
3. Hard mode-switch between SQL and Data Science user interfaces feels needlessly complex (though I understand there is some technical justification for it)
What problems is the product solving and how is that benefiting you?
1. THE single source of truth for all our enterprise data, including Salesforce and NetSuite
2. Straightforward integration of our business data with our IoT product data
3. Elegant console (using Quilt Data Smart Reports) for presenting that data to multiple stakeholders
Recommendations to others considering the product:
Decide up front how much you want to take advantage of their proprietary technology (such as live tables), versus industry standards such as Spark, SQL, and dbt. There's no right answer, but the more mindful you are about those tradeoffs the fewer regrets you will have down the road.
Going in the right direction but might take a while. Best platform to bet on
What do you like best about the product?
Easy to use, with a very small learning curve. This makes it easy to start focusing on the actual problem statement and start getting value out of it.
What do you dislike about the product?
UX. Though the features are available, they are sometimes hard to find. If you're not trained, your eye might not catch them. Some features can only be applied via the API. This requires keeping a constant watch on the documentation to know what other options are available. At the least, those options could be mentioned in a note in the UI so that users know there are other possibilities.
What problems is the product solving and how is that benefiting you?
We are building a data platform using the lakehouse to empower the whole organisation to make data-driven decisions. Databricks provides us with a platform to move fast without thinking much about infrastructure. We can scale easily. At the same time, it is almost like open source: there is very little vendor lock-in risk.
Recommendations to others considering the product:
Use it to move fast! It costs a bit on the higher side. As it is built on top of open source, there are plenty of options to move off it at a later point, once you're mature, if you are too worried about the cost. Or just continue using it if you don't want the management overhead and do want the add-on performance benefits.
Great Experience All Around
What do you like best about the product?
A great experience that combines ML runtimes, MLflow, and Spark. The ability to use Python and SQL seamlessly in one platform. Since Databricks notebooks can be saved as Python scripts in the background, it is amazing to have both the notebook and script experience and synchronize to Git.
What do you dislike about the product?
Debugging code and using interactive applications outside of Databricks-approved tools can be tricky. It is hard for beginners to the platform to get a grasp of the documentation.
What problems is the product solving and how is that benefiting you?
Highly scalable data pipelines with machine learning tools. Geospatial analyses. The scalability of the platform really increased our efficiency and reaction speed to customer requirements.
Lakehouse made simple
What do you like best about the product?
For me, I like the data science and SQL platform best. They are extremely helpful for my job, allowing me to streamline my work and automate it using Jobs.
What do you dislike about the product?
Sometimes the platform can be a bit slow to react but I'm not sure if it's the cluster size or something is wrong with Databricks itself. Overall I didn't find many issues with the platform.
What problems is the product solving and how is that benefiting you?
I'm solving automation issues along with ETL work. I'm able to use Databricks with our datalake in the easiest possible way, using different data formats with ease.
Lakehouse brings the best of the data warehousing and data lake worlds under one solution
What do you like best about the product?
The slowly changing dimension features that come out of the box with the lakehouse
What do you dislike about the product?
Lack of the UI/UX needed for a smooth user experience
What problems is the product solving and how is that benefiting you?
It solves a lot of the standard engineering challenges when it comes to ingestion of data, data lineage and access management
High performance with low complexity
What do you like best about the product?
The way it works with the Delta format, providing a lot of possibilities without the need for a dedicated database administrator. Its flexibility is also worth mentioning.
What do you dislike about the product?
Sometimes cluster management is not well distributed, which makes it necessary to restart the cluster. It could send some warnings before the cluster becomes unworkable.
What problems is the product solving and how is that benefiting you?
Problems: Increasing data results in almost a logarithmic increase in the cost. The main benefits were the cost and the flexibility that the Delta format provides.
Amazing product which answers data people's persona
What do you like best about the product?
Databricks has a steep learning curve, but I'm enjoying learning all the time with such great materials and people.
What do you dislike about the product?
Sometimes I feel documentation is a bit misleading. E.g. Pandas UDF + ML model combined - the functionality of both is amazing, but actually it's not clear how to use it.
What problems is the product solving and how is that benefiting you?
I never understood the motivation for having a lakehouse instead of a bunch of Parquet files, and I'm not sure what happens behind the scenes. Whenever implementing a new product, one must know how and why to do it :)