
Reviews from AWS customer

6 AWS reviews

External reviews

640 reviews

External reviews are not included in the AWS star rating for the product.


    Paras J.

Databricks - A breath of Fresh air in Big Data

  • October 11, 2023
  • Review provided by G2

What do you like best about the product?
The best part about the Databricks Lakehouse platform is the integration of traditional, tried and tested big data technologies with a UI that is welcoming, refined and revolutionary!
What do you dislike about the product?
The older UI for the platform had well-separated elements for data science, data engineering and the SQL workspace. In an effort to combine them in the sidebar, the new UI tries too hard and ends up as a laggy and chaotic mess.
What problems is the product solving and how is that benefiting you?
The Databricks Lakehouse platform brings together the best of open source, stitched beautifully into a notebook-based UI that feels welcoming and far less intimidating than traditional Spark distributions.
The platform has a solution for every data person, including but not limited to a notebook that works with Scala, Python, R and SQL, a traditional SQL editor, downloadable datasets and in-house visualisations just a click away!


    Financial Services

So user-friendly, and a platform that makes the organization's data value chain deliver value

  • October 10, 2023
  • Review provided by G2

What do you like best about the product?
Unified platform for both BI and AI workloads
What do you dislike about the product?
Too difficult to keep up with the pace at which the platform is evolving
What problems is the product solving and how is that benefiting you?
It's helping to realise the paradigm of data-centric AI


    Pranshu G.

Data lake, but combined with data warehouse benefits

  • October 07, 2023
  • Review provided by G2

What do you like best about the product?
It offers ACID transactions, which are a massive support for data consistency. Along with this, features such as time travel and schema evolution come in really handy while building a scalable solution. In addition to all of the above, it reduces data storage costs, all while not compromising on powerful distributed programming.
What do you dislike about the product?
With all the features combined, it truly is a powerful tool; however, it can be a real challenge for new users to master. BI users and analysts who aren't skilled in programming may find it difficult to understand the workflow. Moreover, the community for this tool is currently relatively small, which limits community support.
What problems is the product solving and how is that benefiting you?
The business needs to keep updating Power BI dashboard reports, which are increasing day by day. As a solution, we are using the lakehouse's ACID transactions and features such as schema evolution to clean and transform data and build BI dashboards from the cleaned data.
This solution has eliminated the dependency on our already saturated data warehouse resources. It has also helped with debugging, as all data is processed and resides in one place. Last but not least, it has reduced our data warehouse costs by 20%.


    Amulya S.

Databricks: a perfect data platform for Python users

  • October 05, 2023
  • Review provided by G2

What do you like best about the product?
The UI is built with ML and Python users in mind, and it is very intuitive to use.
What do you dislike about the product?
Processing speed is slow and could be improved.
What problems is the product solving and how is that benefiting you?
Easy integration with Python notebooks and big data. The pipeline became much more efficient.


    Senthil K.

Databricks Unleashed - Unlocking Data Insights and Streamlining Analytics with Databricks

  • October 03, 2023
  • Review provided by G2

What do you like best about the product?
Unified analytics platform for batch & stream data processing
Auto loader, schema evolution capabilities with CDC usage
Delta Live Tables serverless pipelines
Data Quality expectations
Databricks workflows
Databricks SQL warehouse - Photon SQL endpoints
Unity Catalog for data governance & security
Ease of use with Partner Connect & integrations
What do you dislike about the product?
Cost is higher when we use an all-purpose cluster compared to a DLT pipeline, which won't suit all use cases

Vendor lock-in if we use more Databricks-specific Delta features

Learning curve for PySpark-related work, not for SQL coding
What problems is the product solving and how is that benefiting you?
Unified lakehouse platform for batch & stream processing
Building the catalog for centralized governance
Workflow orchestration
Integrations with cloud & data storage layers
Data sharing with external customers through Delta Sharing & Marketplace


    Filippo C.

A complete platform for data science and engineering

  • September 19, 2023
  • Review provided by G2

What do you like best about the product?
Cluster creation is now made easy through a simple configuration page.
Workspace allows you to organise all your notebooks in one place.
Job mode lets you plan notebook execution and dev/prod pipelines.
What do you dislike about the product?
Data visualization of notebook output cells is basic, though it is good enough for simple applications. The Dashboard section could be improved by increasing clarity. These are, however, minor complaints.
What problems is the product solving and how is that benefiting you?
Databricks is helping me save time when developing code and running jobs at given dates and times.
The autocomplete tool is very efficient, especially when dealing with very long code, and installing Python packages or Java libraries is no longer a problem.


    Santosh M.

Databricks Lakehouse

  • September 08, 2023
  • Review provided by G2

What do you like best about the product?
It's an awesome data warehouse platform that helps extract data or metadata from the data lakehouse. Its data tables help us build AI/ML models.
What do you dislike about the product?
All services are good; nothing to dislike.
What problems is the product solving and how is that benefiting you?
It solves data science and machine learning problems.


    Financial Services

A Tool Box to the Modern Big Data Data Scientist

  • September 05, 2023
  • Review provided by G2

What do you like best about the product?
The ability to scale up storing and retrieving large quantities of data with its SDK to S3. In addition, great resource allocation support and additional tools such as ClearML.
What do you dislike about the product?
Compatibility with pandas is lacking: I mainly use it with PySpark, which didn't allow optimal use of the various pandas libraries.
What problems is the product solving and how is that benefiting you?
Retrieving and querying a very large data warehouse on S3 (several hundred terabytes). Performing basic filtering and querying on the data and running an ML experiment on huge amounts of data.


    Felix V.

Great tool for data exploration and development, not so much for production pipelines

  • August 23, 2023
  • Review provided by G2

What do you like best about the product?
Easy to set up processes and iterate.
Shareability
What do you dislike about the product?
Not tailored for production integration
Hard to incorporate without being Databricks-aware, which leads to vendor lock-in
What problems is the product solving and how is that benefiting you?
Gaining data visibility
Developing spark jobs towards production


    Nabil Fegaiere1

A powerful solution that is easily integrated into a variety of platforms

  • August 21, 2023
  • Review provided by PeerSpot

What is our primary use case?

I am a Databricks service partner, and my customers use Azure Databricks and Data Factory.

What is most valuable?

It's very simple to use Databricks Apache Spark. It's really good for parallel execution to scale up the workload. In this context, the usage is more about virtual machines.

Using meta-stores like Hive was optional, and the solution is good for data science use cases. With the Authenticator Log, Databricks is good for data transformation and BI usage. We have a platform.

What needs improvement?

I would like more integration with SQL for using data in different workspaces. We use the user interface for some functionalities, while for others, we have to use SQL to create data sets and grant permissions. For example, when creating a cluster, we have to create it with some API or user interface. Creating a cluster with some properties using SQL grants the possibility of using SQL syntax. Integration with SQL will make Databricks easier to use by people who have experience with databases like Lakehouse, and they would be able to use the data lake and BI. More integration will help have one point of view for everyone using SQL syntax.

Integration with Kubernetes could also be good for minimizing the price because you can use Kubernetes instead of virtual machines. But that won't be easy.

For how long have I used the solution?

I have worked with the solution for four or five years, with some experience since 2016.

What do I think about the stability of the solution?

The solution is stable. The only problem with stability would be that people are not using it efficiently.

What do I think about the scalability of the solution?

The solution is good for scalability.

How was the initial setup?

When we have administration experience, the solution is not difficult to deploy. Technically, however, it's difficult because governance is more complex. For example, I have two warehouses on Databricks, which are clusters in this workspace, and we have to switch from workspace to workspace to have all this information. There is a system table that has all this, but I don't know if everyone can use these tables.

What's my experience with pricing, setup cost, and licensing?

Databricks is not costly when compared with other solutions' prices.

Which other solutions did I evaluate?

Databricks's functionalities are as good as solutions like Snowflake, BigQuery, and Redshift.

What other advice do I have?

People sometimes do not use the solution efficiently. They misunderstand databases, the usage of tables, and the performance. Many data engineers are very junior and don't have skills in that. Stability is more a customer problem than a problem with the product itself. One possible problem with the product is that there's no method to pause the usage of something. For example, we have to use the meta server or the data catalog in Synapse. But in Databricks, we have a choice to use a catalog or not, or Hive, which is always integrated, but we have to choose whether to use it or not. Many customers directly use the passes on Databricks, which causes performance and governance problems.

I can offer a lot of advice on Databricks, and one is to use meta stores like Unity Catalog or Hive Metastore. For incoming use cases, it's better to use Unity Catalog.

I rate Databricks a nine out of ten.