Reviews from AWS customer

6 AWS reviews

External reviews

640 reviews

External reviews are not included in the AWS star rating for the product.


    Axel Richier

Simple to set up, fast to deploy, and with regular product updates

  • November 07, 2022
  • Review provided by PeerSpot

What is our primary use case?

We're using it to provide a unified development experience for all our data experts, including data engineers, data scientists, and IT engineers. With the Databricks platform, we allow teams to collaborate easily on building data science models for our clients. The development environment allows us to ingest data from various data sources, scale the data processing, and expose the results either through an API or through enriched datasets made available to web apps or dashboards, leveraging the serverless capacities of SQL warehouse endpoints.

How has it helped my organization?

Databricks allowed us to offer a homogeneous development environment across different accounts and domains, and also across different clouds. The upskilling of our employees is far more linear and faster, and the complexity of infrastructure management is removed. This led to increased collaboration between domains thanks to a better onboarding experience, more performant pipelines, and a smoother industrialization process. Overall client satisfaction has increased and the time to first insight has been reduced.

What is most valuable?

The shared experience of collaborative notebooks is probably the most useful aspect since, as an expert, it allows me to help my juniors debug their notebooks and their code live. I can do some live coding with them or help them find errors very efficiently.

It has become very simple to set up thanks to its official Terraform provider and the open-source modules made available on GitHub.

I love Databricks due to the fact that we can now deploy it in 15 minutes and it's ready to use. That's very nice since we often help our clients in deploying their first Data Platform with Databricks.

The solution is stable, with LTS Runtimes that have proven to remain stable over the years. 

What needs improvement?

I would love to be able to declare my workflows as code, in an Airflow-like way. This would help us create more robust Python ingestion modules that we can test, share, and update within the company.
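
As an illustration of this request, a workflow declared as code might look roughly like the sketch below. It uses only Python's standard-library `graphlib` module; the task names and the `run_order` helper are hypothetical and are not part of any Databricks or Airflow API.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
WORKFLOW = {
    "ingest_raw": set(),
    "clean": {"ingest_raw"},
    "build_features": {"clean"},
    "train_model": {"build_features"},
    "publish_dashboard": {"clean"},
}


def run_order(workflow):
    """Return one valid execution order; graphlib raises CycleError on cycles."""
    return list(TopologicalSorter(workflow).static_order())


order = run_order(WORKFLOW)
print(order)
```

Because the workflow is plain data plus a function, it can be unit-tested and versioned like any other Python module, which is the benefit described above.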

We would also love to have access to cluster metrics in a programmatic way, so that we can analyse hardware logs and identify potential bottlenecks to optimize.

Lastly, the latest VS Code extension has proven to be useful and appreciated by the community, as it allows us to develop locally and benefit from traditional software best-practice tools such as pre-commit hooks.

For how long have I used the solution?

I've been using the solution for more than four years now, in contexts ranging from PoCs to full end-to-end data platform deployments.

What do I think about the stability of the solution?

The product is very stable. I've been using it for three years now, and I have projects that have been running for three years without any big issues.

What do I think about the scalability of the solution?

It's very scalable. I have a project that started as a proof of concept on connected cars. We had 100 cars to track at first - just for the proof of concept. Now we have millions of cars that are being tracked. It scales very well. We have terabytes of data every day and it doesn't even flinch.

How are customer service and support?

I've had very good experiences with technical support where they answer me in a couple of hours. Sometimes it takes a bit longer. It's usually a matter of days, so it's very good overall. 

Even if it took a bit of time, I got my answer. They never left me without an answer or a solution.

How would you rate customer service and support?

Positive

How was the initial setup?

The implementation is very simple to set up. That's why we chose it over many other tools. Its Terraform provider is our go-to for the initial setup, as we reuse templates to get a functional workspace in minutes.

Usually, we have two to five data engineers handling the maintenance and running of our solutions.

What about the implementation team?

We deploy it in-house.

What's my experience with pricing, setup cost, and licensing?

The solution is a bit expensive. That said, it's worth it. I see it as an Apple product. For example, the iPhone is very expensive, yet you get what you pay for.

The cost depends on the size of your data. If you have lots of data, it's going to be more expensive since you will use more pay-per-compute units. My smallest project is around a hundred euros, and my most expensive is just under a thousand euros a week. That is based on terabytes of data processed each month.

Which other solutions did I evaluate?

We looked into Azure Synapse as an alternative, as well as Azure ML and Vertex on GCP. Vertex AI would be the main alternative.

Some people consider Snowflake a competitor; however, we can't deploy Snowflake ourselves the way we deploy Databricks. We use that as an advantage when we sell Databricks to our clients. We say, "If you go with us, we are going to deploy Databricks in your environment in 15 minutes," and they really like it.

Fabric was released recently and offers a product quite similar to Databricks. Yet the user experience, the CI/CD capabilities, and the frequent release cycle of Databricks remain a strong advantage.

What other advice do I have?

We're a partner.

We use the solution on various clouds. Mostly it is Azure. However, we also have Google and AWS.

One of the big advantages is that it works across domains. I'm responsible for a data engineering team. However, I work on the same platform with data scientists, and I'm very close to my IT team, which is in charge of data access and access control and can manage all access to all the data assets from one point. It's very useful for me as a data engineer. I'm sure that my IT director would say it's very useful for him too. They managed to build a solution that can very easily cross responsibilities. It brings all the challenges together in one place and solves most of them.

I'd rate the solution nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure


    Hospital & Health Care

It's really useful for big datasets

  • October 15, 2022
  • Review provided by G2

What do you like best about the product?
Shorter running time, handling of big datasets, user-friendly platform
What do you dislike about the product?
The cluster's active time is too short; it should stay active longer when not in use
What problems is the product solving and how is that benefiting you?
Helps with data warehousing on big datasets and with building data engineering pipelines and ML models end to end


    Prashant S.

Makes the process easy for big data transformation

  • July 30, 2022
  • Review provided by G2

What do you like best about the product?
It stores data in Delta Lake, which basically helps to generate a backup of the data: no matter if a process fails, it keeps a cache of the data every time. Most importantly, we can easily migrate it to any cloud platform to handle big data.
What do you dislike about the product?
As for dislikes, I would say that sometimes the cluster takes time to start running and gives memory errors, it's a bit costly to use, and sometimes notebook cells get stuck in the middle of a run, so the team could work on that a bit.
What problems is the product solving and how is that benefiting you?
I use it mainly with an Azure Data Lake connection. It can easily connect to storage, and by applying transformations we can easily push the data. Most importantly, I use PySpark and Spark SQL in one notebook, meaning we can write scripts in whatever language we want. I use it for ETL flows by connecting it with Azure Data Factory.


    Ahmed H.

Databricks lakehouse

  • July 20, 2022
  • Review provided by G2

What do you like best about the product?
Everything is on a single platform, like ETL, SQL dashboards, and running ML models.
Simplified way of creating scheduled jobs using workflows, and the best part is Delta Lake.
What do you dislike about the product?
Every piece of code has to be in the form of a notebook, which sometimes makes it difficult to manage. It could be more user-friendly if they offered different options.
What problems is the product solving and how is that benefiting you?
The time travel feature allows us to version the database instead of keeping redundant copies or replicas. This has optimized both human effort and cost.
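
To illustrate the idea behind time travel, here is a toy, pure-Python model of a versioned table. It is only a conceptual sketch, not Delta Lake's implementation: the `VersionedTable` class and its methods are hypothetical. In Delta Lake itself, an older snapshot is read with, for example, the `versionAsOf` reader option or SQL `VERSION AS OF`.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedTable:
    """Toy append-only commit log: each commit records rows added and removed."""

    _commits: list = field(default_factory=list)

    def commit(self, add=(), remove=()):
        """Record a change set and return its version number."""
        self._commits.append((tuple(add), tuple(remove)))
        return len(self._commits) - 1

    def read(self, version=None):
        """Replay commits 0..version to rebuild the table as of that version."""
        if version is None:
            version = len(self._commits) - 1
        rows = []
        for add, remove in self._commits[: version + 1]:
            rows = [r for r in rows if r not in remove]
            rows.extend(add)
        return rows


table = VersionedTable()
v0 = table.commit(add=["alice", "bob"])
v1 = table.commit(add=["carol"], remove=["bob"])

latest = table.read()              # current state of the table
as_of_v0 = table.read(version=v0)  # state as of the first commit
```

Because every version can be rebuilt from the log, no separate replica of the older state has to be kept, which is the saving the reviewer describes.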


    Laura E.

Lakehouse is the Marriage of Cloud-Based DW and Data Lake

  • July 15, 2022
  • Review provided by G2

What do you like best about the product?
It is a cloud-native modern data estate service that competes with Snowflake on core DW concepts and handles Delta Live Tables schema requirements such as CDC like a champ
What do you dislike about the product?
Lakehouses are great but not the answer to everything when it comes to all the needs of cloud scale analytics and AI
What problems is the product solving and how is that benefiting you?
The Lakehouse model enables you to work in a truly unified architecture that provides highly performant structured streaming capabilities and, most importantly for me, data science, machine learning, and visualization capabilities.


    Neelakanta P.

Databricks Lakehouse is a one-stop shop for any big data analytics

  • July 09, 2022
  • Review provided by G2

What do you like best about the product?
Databricks Lakehouse is a one-stop shop for analytics in big data use cases
What do you dislike about the product?
Databricks has many releases going on, which might require customers to constantly update their infrastructure
What problems is the product solving and how is that benefiting you?
It helps optimise the Spark engine by building a compute wrapper on the Spark cluster, which helps run high-volume queries much faster


    Mayur S.

One Stop Shop for Data Engineers

  • July 04, 2022
  • Review provided by G2

What do you like best about the product?
One platform to access notebooks, tables, and the AI/ML platform
What do you dislike about the product?
No debugger like in other IDEs; notebooks and functions are difficult to navigate
What problems is the product solving and how is that benefiting you?
We built a data lake to reduce effort and time for internal teams


    Ahmed M.

Best conference in data and analytics now

  • July 04, 2022
  • Review provided by G2

What do you like best about the product?
The extensive, detailed content that does not shy away from being deeply technical while at the same time being industry-focused in depth. The amount of information covered here is incredible, from data & analytics to GPUs to K8s to industry discussions
What do you dislike about the product?
There are no on-demand hands-on labs, and we cannot download all slides for all sessions. The timing of the sessions was also a challenge.
I didn't like the scheduling features too.
What problems is the product solving and how is that benefiting you?
Mainly it gives us the ability to ask questions of the data regardless of its nature (streaming or batch), without the need to move the data around to other platforms.
Recommendations to others considering the product:
Follow the guidance


    Michael L.

Fantastic Data Engineering and Data Science platform

  • July 01, 2022
  • Review provided by G2

What do you like best about the product?
Best Data Engineering features. Love it.
What do you dislike about the product?
Very expensive. Wish it would cost less.
What problems is the product solving and how is that benefiting you?
It is solving data ingestion problems and dataset preparation problems. This is benefitting me by making automated Data Engineering easy to implement by myself.


    Pradeep S.

Build fast and reliable data pipelines batch/streaming supporting unstructured and structured data

  • June 30, 2022
  • Review provided by G2

What do you like best about the product?
Data Governance and Simplified Schema.
Support for unstructured as well as structured data, enabling any use case for building machine learning, business intelligence, and streaming features. It also supports Streaming Live Tables, which is a new feature in the latest version.
What do you dislike about the product?
Performance benchmarks need to be verified against competitors like Snowflake. It looks like (as per the documentation) the latest version is blazing fast.
What problems is the product solving and how is that benefiting you?
Unstructured and Structured data in a unified repository to build Machine Learning and BI capabilities.
Recommendations to others considering the product:
It is simple to deliver big data applications, with hassle-free administration and maintenance.