Reviews from AWS customers

6 AWS reviews

External reviews

637 reviews

External reviews are not included in the AWS star rating for the product.


4-star reviews

    Gitesh K.

Great experience. It helped us create and deploy custom models on SageMaker.

  • June 22, 2021
  • Review provided by G2

What do you like best about the product?
The features. There are a hell of a lot of features in MLflow.
What do you dislike about the product?
Haven't explored everything. But yeah, maybe better documentation.
What problems is the product solving and how is that benefiting you?
It solves platform issues with AWS SageMaker. We deployed a custom model that was used to power the ranking algorithm on the listing page.


    Information Technology and Services

Good for code collaboration

  • June 20, 2021
  • Review provided by G2

What do you like best about the product?
Multiple people can write code in the same file at the same time. We use Databricks in Machine Learning.
What do you dislike about the product?
The cluster shuts down after some time, leading to loss of the data held in RAM.
What problems is the product solving and how is that benefiting you?
Code collaboration, Machine Learning


    Automotive

It was fantastic for all the integrations it had. Custom ML transform support could have been better.

  • June 17, 2021
  • Review provided by G2

What do you like best about the product?
The fact that you can store all your models in one place.
What do you dislike about the product?
The support for custom transforms isn't there
What problems is the product solving and how is that benefiting you?
We wanted one place to store all our models with versioning. Many benefits, such as model serving, are offered out of the box through Databricks.
Recommendations to others considering the product:
If you want to store all your ML models in one place, then it's the way to go.


    Jorge C.

Databricks is the best option for your data workloads and pipelines

  • May 26, 2021
  • Review provided by G2

What do you like best about the product?
It is a highly adaptable solution for data engineering, data science, and AI
What do you dislike about the product?
I don't like the lack of an easier way to import custom code files or libraries from notebooks.
What problems is the product solving and how is that benefiting you?
I've used it to process emergency telephone data and derive insights. The performance of the solution is good.


    Debashis P.

Senior Cloud Evangelist and Architect

  • May 26, 2021
  • Review provided by G2

What do you like best about the product?
Spark's distribution of queries, the speed of batch queries, and the overall performance.
What do you dislike about the product?
The interface could be made better and more intuitive.
What problems is the product solving and how is that benefiting you?
Large-scale batch and bulk parallel programming.


    Stephen D.

Staging data for insights

  • May 26, 2021
  • Review provided by G2

What do you like best about the product?
It makes the power of Spark accessible and offers innovative solutions like Delta Lake.
What do you dislike about the product?
There are few options for workloads that aren't wholly or partially in the cloud.
What problems is the product solving and how is that benefiting you?
We are staging large datasets for reporting and multiple BI solutions.


    Deepa Ram S.

Best tool for big data

  • May 20, 2021
  • Review provided by G2

What do you like best about the product?
It is easy to use commands in multiple languages in the same notebook. Direct connection to Redshift.
What do you dislike about the product?
Sometimes it takes a lot of time to load data. It should show better suggestions.
What problems is the product solving and how is that benefiting you?
We are using Databricks to analyse big data and get business insights.


    Prashidha K.

Very powerful yet easy to use distributed computing and data warehousing platform

  • January 25, 2021
  • Review provided by G2

What do you like best about the product?
Databricks has very powerful distributed computing built in, with easy-to-deploy optimized clusters for Spark computations. The notebooks with MLflow integration make it easy to use for Analytics and Data Science teams, yet the underlying APIs and CI/CD integrations make it very customizable for Data Engineers who create complex automated data pipelines. The ability to store, query, and manipulate massive Spark SQL tables with ACID in Delta Lake makes big data easily accessible to everyone in the organization.
What do you dislike about the product?
It lacks built-in data backup features and the ability to restrict data access to specific users. So if anyone accidentally deletes data from a Delta table or DBFS, the lost data cannot be retrieved unless we set up our own customized backup solution.
What problems is the product solving and how is that benefiting you?
I have worked with big data with hundreds of millions of rows using Databricks. We do most of the ELT, data cleaning, and prep work on Databricks. The ease and speed of querying big data using Databricks Spark SQL is very useful. It is also very easy to create prototype code against real-size data using the available Python and R notebooks.


    Vivek P.

Databricks- Big Data processing tool

  • July 16, 2020
  • Review provided by G2

What do you like best about the product?
Very easy to use. No need to install and set up Spark manually.
Provides a notebook environment to write code.
Supports various languages like Python, Spark SQL, R, Scala, etc.
Easy to set up and use.
You can choose the cluster according to your needs.
Supports machine learning flows and streaming data.
Automatically suspends the cluster if it is inactive for more than a given time (cost-cutting).
Auto-scalable clusters.
Optimizes the use of clusters (resources).
What do you dislike about the product?
No CI/CD features provided by default.
Costly for small enterprises.
Certification cost is high.
What problems is the product solving and how is that benefiting you?
We have to develop pipelines. We get data from different sources such as AWS S3 and Redshift, process that large amount of data on Databricks, and load it back into our data warehouse.
Recommendations to others considering the product:
Splunk is one of the best tools when it comes to big data processing. It is easy to use and set up.


    Ramavtar M.

MLflow: One-stop solution for data science model tracking, versioning, and deployment

  • June 23, 2020
  • Review provided by G2

What do you like best about the product?
1) A single format to support all major ML libraries such as scikit-learn, TensorFlow, MXNet, Spark MLlib, PySpark, etc.
2) Capabilities to deploy on Amazon SageMaker with just one API call
3) Flexibility to log all model metrics such as accuracy, recall, etc., along with hyperparameter tuning support.
4) A good GUI to compare and select the best models.
5) Model registry to track Staging, Production, and Archived models.
6) A great Python API
7) REST APIs supported.
8) Available out of the box in Microsoft Azure.
What do you dislike about the product?
1) CI/CD pipeline is not supported in the open-source version
2) It is a recent framework, so the community is not very large
3) It depends on many Python libraries, which can be a problem when resolving dependencies in your existing setup.
What problems is the product solving and how is that benefiting you?
I have used it for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry.
The same thing can be done in Amazon SageMaker, GCP AI Platform, Microsoft Azure, etc., but it would require monthly expenses. It can be a good fit for an early-stage startup data science team.
Recommendations to others considering the product:
It can't be a complete solution for the data science/ML engineering flow, but it is an essential part of the pipeline. It may be used with Apache Airflow to build an end-to-end MLOps solution. Also, it works best with Amazon SageMaker and Microsoft Azure. However, GCP AI Platform support is still in the development phase.
You would also need to take care of the CI/CD pipeline for ML models on your own.