
Reviews from AWS Marketplace

2 AWS reviews

External reviews

302 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Prashidha K.

Very powerful yet easy to use distributed computing and data warehousing platform

  • January 25, 2021
  • Review verified by G2

What do you like best about the product?
Databricks has very powerful distributed computing built in, with easy-to-deploy optimized clusters for Spark computations. The notebooks with MLflow integration make it easy to use for Analytics and Data Science teams, yet the underlying APIs and CI/CD integrations make it very customizable for Data Engineers to create complex automated data pipelines. The ability to store, query, and manipulate massive Spark SQL tables with ACID guarantees in Delta Lake makes big data easily accessible to everyone in the organization.
What do you dislike about the product?
It lacks built-in data backup features and the ability to restrict data access to specific users. So if anyone accidentally deletes data from a Delta table or DBFS, the lost data cannot be retrieved unless we set up our own customized backup solution.
What problems is the product solving and how is that benefiting you?
I have worked with big data with hundreds of millions of rows using Databricks. We do most of the ELT, data cleaning, and prep work on Databricks. The ease and speed of querying big data using Databricks Spark SQL is very useful. It is also very easy to create prototype code using real-sized data in the available Python and R notebooks.


    Chad F.

Reduced database network redistributions & run-time of key models by 99+%!

  • August 17, 2020
  • Review provided by G2

What do you like best about the product?
Incidentally, the thing I like most about Databricks isn't a product feature at all; I love Databricks's proactive and customer-centric service, always willing to make an exception or create a unique feature, all the while minimizing costs for the customer - as @Heather Akuiyibo & Shelby Ferson et al. have done for me and my former teams!
What do you dislike about the product?
Broadening programming logic and syntax.
What problems is the product solving and how is that benefiting you?
To name seven (7):

(1) User segmentation using a proprietary variation of a hierarchical DBSCAN clustering algorithm of high-dimensional data with novel distance [quasi] metric, based on hubness analysis;

(2) Leveraging the above in email targeting and invoking multi-armed bandit testing methodologies for email timing, frequency, and content, using decreasing-epsilon strategy;

(3) Modeling predicted underwriting criteria with a binary approval odds classification algorithm;

(4) Using a dynamic panel data, fixed effects model to predict the effect of changes in credit reports on user credit score;

(5) Employing an Autoregressive Integrated Moving Average (ARIMA) with optimized Akaike Information Criterion exploits to predict future revenue and growth (lagged results led to average error bounds of only 5 percent; cross-validation results were even stronger, though I was conservative in guaranteeing 7 percent error, on average);

(6) Refining a multiverse (context-aware) recommendation engine as an n-dimensional tensor (rather than the typical two-dimensional user-item matrix) for partner product recommendations, using High-Order Singular Value Decomposition to solve;

(7) Invoking a Convolutional Neural Network framework with a novel architecture and results of a Fourier Transform as input to classify dental x-rays and highlight to the dentist which teeth require fillings (after approximately two months, the model reached ~95 percent accuracy - in terms of actual agreement by dentists using the app - with F1 score in cross-validation performing on par).
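As an illustration of the decreasing-epsilon strategy mentioned in item (2), here is a minimal sketch in plain Python. It is not the reviewer's implementation: the function name, the 1/sqrt(t) decay schedule, and the simulated click rates are all assumptions chosen for the example.

```python
import random


def run_decaying_epsilon_bandit(click_rates, rounds=5000, seed=0):
    """Simulate epsilon-greedy arm selection with a decaying epsilon.

    click_rates: hypothetical true click probability of each email variant.
    Returns (pulls, means): send counts and estimated click rate per variant.
    """
    rng = random.Random(seed)
    n_arms = len(click_rates)
    pulls = [0] * n_arms    # how many times each variant was sent
    means = [0.0] * n_arms  # running mean reward per variant

    for t in range(1, rounds + 1):
        # Assumed schedule: explore often at first, less as evidence accumulates.
        epsilon = min(1.0, 1.0 / t ** 0.5)
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random variant
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit best estimate

        # Simulated Bernoulli reward (1.0 = click, 0.0 = no click).
        reward = 1.0 if rng.random() < click_rates[arm] else 0.0
        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]  # incremental mean update

    return pulls, means


pulls, means = run_decaying_epsilon_bandit([0.02, 0.05, 0.10])
```

In a real email test the simulated reward would be replaced by observed opens or clicks, and the decay schedule would be tuned to the send volume.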
Recommendations to others considering the product:
Be open to the pitch. You may think things are "going fine" or proffer the idea of "if it ain't broke, don't fix it," but these represent short-term thinking traps such that scaling becomes inherently and implicitly constrained and limited. Databricks amounts to the forward-thinking businessperson.


    ianthe L.

How I experienced Databricks

  • August 17, 2020
  • Review provided by G2

What do you like best about the product?
It is great when you have a large amount of data, excellent for collaboration, perfect for use with visualisation tools, and works with many programming languages.
What do you dislike about the product?
It is difficult to get a grasp of how many applications and functions it has.
What problems is the product solving and how is that benefiting you?
It's great for ELT of data to use with Power BI.
Recommendations to others considering the product:
Use it. It's the best available, and it's great!


    Somu S.

Excellent infrastructure, can scale clusters in no time

  • August 16, 2020
  • Review provided by G2

What do you like best about the product?
Interactive clusters, user friendly, excellent cluster management
What do you dislike about the product?
Clusters take some time to warm up on start. It should also support upserts without Delta, as businesses need pure upserts too.
What problems is the product solving and how is that benefiting you?
Can seamlessly use PySpark and Python to build a robust pipeline.
Recommendations to others considering the product:
It's the best infrastructure to build pipelines if you are planning to use Spark in production.


    Vivek P.

Databricks: Big Data processing tool

  • July 16, 2020
  • Review provided by G2

What do you like best about the product?
Very easy to use. No need to install and set up Spark manually.
Provides a notebook environment to write code.
Supports various languages such as Python, Spark SQL, R, and Scala.
Easy to set up and use.
You can choose the cluster according to your needs.
Supports machine learning flows and streaming data.
Automatically suspends the cluster if it is inactive for more than a given time (cost-cutting).
Auto-scalable clusters.
Optimizes the use of cluster resources.
What do you dislike about the product?
No CI/CD features provided by default.
Costly for small enterprises.
Certification cost is high.
What problems is the product solving and how is that benefiting you?
We have to develop pipelines. We get data from different sources such as AWS S3 and Redshift, process that large amount of data on Databricks, and put it back into our data warehouse.
Recommendations to others considering the product:
Splunk is one of the best tools when it comes to big data processing. It is easy to use and set up.


    Ramavtar M.

MLflow: One-stop solution for data science model tracking, versioning and deployment

  • June 23, 2020
  • Review verified by G2

What do you like best about the product?
1) A single format to support all major ML libraries such as scikit-learn, TensorFlow, MXNet, Spark MLlib, PySpark, etc.
2) Capabilities to deploy on Amazon SageMaker with just one API call
3) Flexibility to log model parameters and metrics such as accuracy and recall, along with hyperparameter tuning support.
4) A good GUI to compare and select the best models.
5) Model registry to track Staging, Production, and Archived models.
6) Python best API
7) REST APIs supported.
8) Available out of the box in Microsoft Azure.
What do you dislike about the product?
1) CI/CD pipelines are not supported in the open-source version
2) It is a recent framework, so the community is not very large
3) Dependent on many Python libraries. This can be a problem when resolving dependencies in your existing setup.
What problems is the product solving and how is that benefiting you?
I have used it for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
The same thing can be done in Amazon SageMaker, GCP AI Platform, Microsoft Azure, etc., but it would require monthly expenses. It can be a good fit for an early-stage startup data science team.
Recommendations to others considering the product:
It can't be a complete solution for the data science/ML engineering flow, but it is essential in the pipeline. It may be used with Apache Airflow to have an end-to-end MLOps solution. Also, it works best with Amazon SageMaker and Microsoft Azure. However, GCP AI Platform support is still in the development phase.
You would also need to take care of the CI/CD pipeline for ML models on your own.


    Vikrant B.

Lightning-Speed Analytics

  • April 29, 2020
  • Review provided by G2

What do you like best about the product?
Databricks is a great analytics tool that provides lightning-speed analytics and has given new abilities to Data Scientists. Additionally, our advanced analytics at scale has gone up 100 times.
What do you dislike about the product?
The learning curve is steep and people would need coding knowledge to work with Databricks. It can also be costly at times.
What problems is the product solving and how is that benefiting you?
Problems - Analytics problems

Benefits - Scale and Speed


    Alvaro R.

Great tool for distributed programming

  • October 31, 2019
  • Review verified by G2

What do you like best about the product?
The different languages used for implementation.
Great user experience.
Easy to understand and use.
Creation of different tools inside, such as clusters or databases.
Ease of integration with other software such as Azure services.
Great addition to your expertise if you manage to master it completely.
Integration of Spark with the different languages (Python, R, Scala).
What do you dislike about the product?
The documentation inside the portal isn't the best; I find better support outside using search engines.
What problems is the product solving and how is that benefiting you?
Currently, data transformation: it provides easy access to databases or blobs, and the ability to use a language such as Python to build the solution you need is great.
Recommendations to others considering the product:
Great tool for development when you are looking for fast results, as it distributes the work across different clusters.


    Internet

Databricks review

  • October 24, 2019
  • Review provided by G2

What do you like best about the product?
1. Good UI
2. Good integrations with other applications/services.
3. Fast and efficient.
4. Updates are good.
What do you dislike about the product?
1. Sometimes it takes a long time to load the Spark notebook.
2. Sometimes there are issues with interpreter settings while running the notebook.
What problems is the product solving and how is that benefiting you?
1. Big data - Analyzing large datasets.


    Douglas D.

Makes building Spark applications a lot easier

  • September 20, 2019
  • Review provided by G2

What do you like best about the product?
It's like a Jupyter notebook but a lot more powerful and flexible. You can easily switch from Python to SQL to Scala from one cell to the next. With the Spark framework, you can preview your data processing tasks without having to build large intermediate tables.
What do you dislike about the product?
It needs better support when it comes to troubleshooting Spark applications. It shows a lot of information, but gives you little sense of how to apply it.
What problems is the product solving and how is that benefiting you?
We build many large-scale data processing applications. Previously we used databases, but this is more flexible and powerful (and cheap).
Recommendations to others considering the product:
It's great if you already understand Spark. Otherwise, Spark has quite a learning curve.